date:20200203

Re: [PATCH 3/3] aspeed/smc: Fix number of dummy cycles for FAST_READ_4 command

2020-02-03 Thread Cédric Le Goater

On 2/3/20 7:09 PM, Guenter Roeck wrote:
> The Linux kernel recently started using FAST_READ_4 commands.
> This results in flash read failures. At the same time, the m25p80
> emulation is seen to read 8 more bytes than expected. Adjusting the
> expected number of dummy cycles to match FAST_READ fixes the problem.

Which machine are you using for these tests? The AST2500 EVB with
the w25q256 flash model?

Anyhow, it looks correct.

Reviewed-by: Cédric Le Goater 
Fixes: f95c4bffdc4c ("aspeed/smc: snoop SPI transfers to fake dummy cycles")

I think commit ef06ca3946e2 ("xilinx_spips: Add support for RX discard 
and RX drain") needs a similar fix. Adding Francisco.

Thanks,

C. 


> Signed-off-by: Guenter Roeck 
> ---
>  hw/ssi/aspeed_smc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/ssi/aspeed_smc.c b/hw/ssi/aspeed_smc.c
> index f0c7bbbad3..61e8fa57d3 100644
> --- a/hw/ssi/aspeed_smc.c
> +++ b/hw/ssi/aspeed_smc.c
> @@ -762,11 +762,11 @@ static int aspeed_smc_num_dummies(uint8_t command)
>  case FAST_READ:
>  case DOR:
>  case QOR:
> +case FAST_READ_4:
>  case DOR_4:
>  case QOR_4:
>  return 1;
>  case DIOR:
> -case FAST_READ_4:
>  case DIOR_4:
>  return 2;
>  case QIOR:
>
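For readers following along, here is a minimal sketch of the grouping the
patch ends up with. It is not the QEMU function itself: the opcode values
are the conventional SPI NOR ones and are assumed to match the m25p80
model, the OP_ names and num_dummies() are made up for illustration, and
opcodes outside the hunk above are simply reported as -1.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Conventional SPI NOR opcodes. Assumption: these match the values the
 * flash model uses for the same command names.
 */
enum {
    OP_FAST_READ   = 0x0b,
    OP_FAST_READ_4 = 0x0c,
    OP_DOR         = 0x3b,
    OP_DOR_4       = 0x3c,
    OP_QOR         = 0x6b,
    OP_QOR_4       = 0x6c,
    OP_DIOR        = 0xbb,
    OP_DIOR_4      = 0xbc,
};

/*
 * Number of dummies expected for a command, following the grouping in the
 * patched switch. Returns -1 for opcodes this sketch does not cover.
 */
int num_dummies(uint8_t command)
{
    switch (command) {
    case OP_FAST_READ:
    case OP_DOR:
    case OP_QOR:
    case OP_FAST_READ_4:   /* moved here by the patch */
    case OP_DOR_4:
    case OP_QOR_4:
        return 1;
    case OP_DIOR:
    case OP_DIOR_4:
        return 2;
    default:
        return -1;
    }
}

/* Usage: after the fix, both FAST_READ variants agree. */
void check_fast_read_variants(void)
{
    assert(num_dummies(OP_FAST_READ) == 1);
    assert(num_dummies(OP_FAST_READ_4) == num_dummies(OP_FAST_READ));
}
```

The only behavioural change is that the 4-byte-address FAST_READ now sits
in the same group as the 3-byte-address one.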

Re: [PATCH v3 07/18] machine: Add a new function init_apicid_fn in MachineClass

2020-02-03 Thread Igor Mammedov

On Mon, 3 Feb 2020 15:49:31 -0600
Babu Moger  wrote:

> On 2/3/20 9:17 AM, Igor Mammedov wrote:
> > On Wed, 29 Jan 2020 10:17:11 -0600
> > Babu Moger  wrote:
> >   
> >> On 1/29/20 3:14 AM, Igor Mammedov wrote:  
> >>> On Tue, 28 Jan 2020 13:45:31 -0600
> >>> Babu Moger  wrote:
> >>> 
>  On 1/28/20 10:29 AM, Igor Mammedov wrote:
> > On Tue, 03 Dec 2019 18:37:42 -0600
> > Babu Moger  wrote:
> >   
> >> Add a new function init_apicid_fn in MachineClass to initialize the 
> >> mode
> >> specific handlers to decode the apic ids.
> >>
> >> Signed-off-by: Babu Moger 
> >> ---
> >>  include/hw/boards.h |1 +
> >>  vl.c|3 +++
> >>  2 files changed, 4 insertions(+)
> >>
> >> diff --git a/include/hw/boards.h b/include/hw/boards.h
> >> index d4fab218e6..ce5aa365cb 100644
> >> --- a/include/hw/boards.h
> >> +++ b/include/hw/boards.h
> >> @@ -238,6 +238,7 @@ struct MachineClass {
> >>   unsigned 
> >> cpu_index);
> >>  const CPUArchIdList *(*possible_cpu_arch_ids)(MachineState 
> >> *machine);
> >>  int64_t (*get_default_cpu_node_id)(const MachineState *ms, int 
> >> idx);
> >> +void (*init_apicid_fn)(MachineState *ms);  
> > > it's x86 specific, so why wasn't it put into PCMachineClass?
> 
>  Yes. It is x86 specific for now. I tried to make it a generic function so
>  other OSes can use it if required (like we have done in
>  possible_cpu_arch_ids). It initializes the functions required to build the
>  apicid for each CPU. We need these functions much earlier in the
>  initialization. It should be initialized before parse_numa_opts or
>  machine_run_board_init (in vl.c), which are called from generic context. We
>  cannot use PCMachineClass at this time.
> >>>
> >>> could you point to specific patches in this series that require
> >>> apic ids being initialized before parse_numa_opts and elaborate why?
> >>>
> >>> we already have possible_cpu_arch_ids() which could be called very
> >>> early and calculates APIC IDs in x86 case, so why not reuse it?
> >>
> >>
> >> The current code(before this series) parses the numa information and then
> >> sequentially builds the apicid. Both are done together.
> >>
> >> But this series separates the numa parsing and apicid generation. Numa
> >> parsing is done first and after that the apicid is generated. Reason is we
> >> need to know the number of numa nodes in advance to decode the apicid.
> >>
> >> Look at this patch.
> >> https://lore.kernel.org/qemu-devel/157541988471.46157.6587693720990965800.stgit@naples-babu.amd.com/
> >>
> >> static inline apic_id_t apicid_from_topo_ids_epyc(X86CPUTopoInfo 
> >> *topo_info,
> >> +  const X86CPUTopoIDs
> >> *topo_ids)
> >> +{
> >> +return (topo_ids->pkg_id  << apicid_pkg_offset_epyc(topo_info)) |
> >> +   (topo_ids->llc_id << apicid_llc_offset_epyc(topo_info)) |
> >> +   (topo_ids->die_id  << apicid_die_offset(topo_info)) |
> >> +   (topo_ids->core_id << apicid_core_offset(topo_info)) |
> >> +   topo_ids->smt_id;
> >> +}
> >>
> >>
> >> The function apicid_from_topo_ids_epyc builds the apicid. The new decode adds
> >> llc_id (which is the numa id here) to the current decoding. The other fields
> >> mostly remain the same.
> > 
> > If llc_id is the same as numa id, why not reuse 
> > CpuInstanceProperties::node-id
> > instead of llc_id you are adding in previous patch 6/18?
> >   
> I tried to use that earlier. But dropped the idea as it required some
> changes. Don't remember exactly now. I am going to investigate again if we
> can use the node_id for our purpose here. Will let you know if I have any
> issues.
The reason I'm asking not to add new properties here is that it
expands the interface visible to/used by management tools, and it's a maintenance
burden not only on QEMU but on the management side as well. So if you can reuse
node-id, it will work out of the box with existing users.

It should also be less confusing for us since we don't have to keep in mind
(or figure out) that llc_id is the same as node id and wonder why the latter
wasn't used in the first place.
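To make the ordering argument in this thread concrete, here is a rough
sketch of how a node field shifts everything above it in the APIC ID. It
is only an illustration: TopoWidths, TopoIDs and apicid_pack() are
hypothetical names, the field widths are passed in directly instead of
being derived from the topology, and node_id stands in for what the
series currently calls llc_id.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-field bit widths; a real implementation derives them. */
typedef struct {
    unsigned smt_width;
    unsigned core_width;
    unsigned die_width;
    unsigned node_width;   /* depends on the number of NUMA nodes */
} TopoWidths;

/* Hypothetical topology IDs of one CPU. */
typedef struct {
    unsigned pkg_id;
    unsigned node_id;
    unsigned die_id;
    unsigned core_id;
    unsigned smt_id;
} TopoIDs;

/* Pack the IDs into one value, lowest field (SMT) in the lowest bits. */
uint32_t apicid_pack(const TopoWidths *w, const TopoIDs *ids)
{
    unsigned core_off = w->smt_width;
    unsigned die_off  = core_off + w->core_width;
    unsigned node_off = die_off + w->die_width;
    unsigned pkg_off  = node_off + w->node_width;

    assert(pkg_off < 32);   /* keep every shift well defined */

    return ((uint32_t)ids->pkg_id  << pkg_off)  |
           ((uint32_t)ids->node_id << node_off) |
           ((uint32_t)ids->die_id  << die_off)  |
           ((uint32_t)ids->core_id << core_off) |
           ids->smt_id;
}
```

Since pkg_off depends on node_width, the package bits cannot be placed
until the number of nodes is known, which is why the series wants NUMA
parsing to finish before any APIC ID is generated.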

Re: [PULL 3/3] target/mips: Separate FPU-related helpers into their own file

2020-02-03 Thread Philippe Mathieu-Daudé


On 2/4/20 7:42 AM, Aleksandar Markovic wrote:

From: Aleksandar Markovic 

For clarity and easier maintenance, create target/mips/fpu_helper.c, and
move all FPU-related content from target/mips/op_helper.c to that file.

Signed-off-by: Aleksandar Markovic 
Reviewed-by: Aleksandar Rikalo 
Message-Id: <1580745443-24650-3-git-send-email-aleksandar.marko...@rt-rk.com>
---
  target/mips/Makefile.objs |2 +-
  target/mips/fpu_helper.c  | 1911 +
  target/mips/op_helper.c   | 1877 
  3 files changed, 1912 insertions(+), 1878 deletions(-)
  create mode 100644 target/mips/fpu_helper.c

diff --git a/target/mips/Makefile.objs b/target/mips/Makefile.objs
index 3ca2bde..91eb691 100644
--- a/target/mips/Makefile.objs
+++ b/target/mips/Makefile.objs
@@ -1,5 +1,5 @@
  obj-y += translate.o cpu.o gdbstub.o helper.o
-obj-y += op_helper.o cp0_helper.o
+obj-y += op_helper.o cp0_helper.o fpu_helper.o
  obj-y += dsp_helper.o lmi_helper.o msa_helper.o
  obj-$(CONFIG_SOFTMMU) += mips-semi.o
  obj-$(CONFIG_SOFTMMU) += machine.o cp0_timer.o
diff --git a/target/mips/fpu_helper.c b/target/mips/fpu_helper.c
new file mode 100644
index 000..0d5769e
--- /dev/null
+++ b/target/mips/fpu_helper.c
@@ -0,0 +1,1911 @@
+/*
+ *  Helpers for emulation of CP0-related MIPS instructions.


Isn't it "FPU"?


+ *
+ *  Copyright (C) 2004-2005  Jocelyn Mayer
+ *  Copyright (C) 2020  Wave Computing, Inc.
+ *  Copyright (C) 2020  Aleksandar Markovic 
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/main-loop.h"
+#include "cpu.h"
+#include "internal.h"
+#include "qemu/host-utils.h"
+#include "exec/helper-proto.h"
+#include "exec/exec-all.h"
+#include "exec/cpu_ldst.h"
+#include "exec/memop.h"
+#include "sysemu/kvm.h"
+#include "fpu/softfloat.h"

Re: [PATCH v2] pl031: add finalize function to avoid memleaks

2020-02-03 Thread Philippe Mathieu-Daudé


On 2/4/20 3:05 AM, pannengy...@huawei.com wrote:

From: Pan Nengyuan 

There is a memory leak when we call 'device_list_properties' with
typename = pl031. It's easy to reproduce as follows:

   virsh qemu-monitor-command vm1 --pretty '{"execute": "device-list-properties", "arguments": {"typename": "pl031"}}'

The memory leak stack:
   Direct leak of 48 byte(s) in 1 object(s) allocated from:
 #0 0x7f6e0925a970 in __interceptor_calloc (/lib64/libasan.so.5+0xef970)
 #1 0x7f6e06f4d49d in g_malloc0 (/lib64/libglib-2.0.so.0+0x5249d)
 #2 0x564a0f7654ea in timer_new_full /mnt/sdb/qemu/include/qemu/timer.h:530
 #3 0x564a0f76555d in timer_new /mnt/sdb/qemu/include/qemu/timer.h:551
 #4 0x564a0f765589 in timer_new_ns /mnt/sdb/qemu/include/qemu/timer.h:569
 #5 0x564a0f76747d in pl031_init /mnt/sdb/qemu/hw/rtc/pl031.c:198
 #6 0x564a0fd4a19d in object_init_with_type /mnt/sdb/qemu/qom/object.c:360
 #7 0x564a0fd4b166 in object_initialize_with_type /mnt/sdb/qemu/qom/object.c:467
 #8 0x564a0fd4c8e6 in object_new_with_type /mnt/sdb/qemu/qom/object.c:636
 #9 0x564a0fd4c98e in object_new /mnt/sdb/qemu/qom/object.c:646
 #10 0x564a0fc69d43 in qmp_device_list_properties /mnt/sdb/qemu/qom/qom-qmp-cmds.c:204
 #11 0x564a0ef18e64 in qdev_device_help /mnt/sdb/qemu/qdev-monitor.c:278

Reported-by: Euler Robot 
Signed-off-by: Pan Nengyuan 
---
Changes from V1 to V2:
- Delay the timer_new until realize instead of putting it into instance_init,
since the pl031 can't be hotplugged (suggested by Peter Maydell).
---
  hw/rtc/pl031.c | 5 +
  1 file changed, 5 insertions(+)

diff --git a/hw/rtc/pl031.c b/hw/rtc/pl031.c
index ae47f09635..0b9253eb30 100644
--- a/hw/rtc/pl031.c
+++ b/hw/rtc/pl031.c
@@ -190,7 +190,11 @@ static void pl031_init(Object *obj)
  qemu_get_timedate(&tm, 0);
  s->tick_offset = mktimegm(&tm) -
  qemu_clock_get_ns(rtc_clock) / NANOSECONDS_PER_SECOND;
+}
  
+static void pl031_realize(DeviceState *dev, Error **errp)

+{
+PL031State *s = PL031(dev);
  s->timer = timer_new_ns(rtc_clock, pl031_interrupt, s);
  }
  
@@ -321,6 +325,7 @@ static void pl031_class_init(ObjectClass *klass, void *data)

  DeviceClass *dc = DEVICE_CLASS(klass);
  
  dc->vmsd = &vmstate_pl031;

+dc->realize = pl031_realize;
  device_class_set_props(dc, pl031_properties);
  }
  



Reviewed-by: Philippe Mathieu-Daudé
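As a toy model of why moving the allocation helps: the introspection
path creates an object and throws it away without ever realizing it, so
anything allocated at init time with no finalize hook is lost. Nothing
below is QEMU API. ToyDevice, toy_init(), toy_realize() and the
live_timers counter are invented for the illustration, and the "timer"
is only a counter so the sketch itself leaks nothing.

```c
#include <assert.h>
#include <stdbool.h>

/* Invented stand-in for a device instance. */
typedef struct {
    bool has_timer;
    bool realized;
} ToyDevice;

/* Counts timers that were created and never released. */
static int live_timers;

/* Instance init; optionally creates the timer here (the leaky variant). */
void toy_init(ToyDevice *d, bool alloc_in_init)
{
    d->has_timer = false;
    d->realized = false;
    if (alloc_in_init) {
        d->has_timer = true;
        live_timers++;
    }
}

/* Realize; creates the timer if init did not. */
void toy_realize(ToyDevice *d)
{
    assert(!d->realized);
    if (!d->has_timer) {
        d->has_timer = true;
        live_timers++;
    }
    d->realized = true;
}

/*
 * Introspection path: init, look at the properties, drop the object.
 * There is no finalize hook, so a timer created in init is never released.
 * Returns how many timers this call left behind.
 */
int introspect_leaked_timers(bool alloc_in_init)
{
    int before = live_timers;
    ToyDevice d;

    toy_init(&d, alloc_in_init);
    /* properties would be listed here; toy_realize() is never called */
    return live_timers - before;
}
```

With the allocation in init, each introspection call leaves one timer
behind; with the allocation delayed to realize, it leaves none.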

Re: [PATCH 1/3] m25p80: Convert to support tracing

2020-02-03 Thread Cédric Le Goater

On 2/3/20 7:09 PM, Guenter Roeck wrote:
> While at it, add some trace messages to help debug problems
> seen when running the latest Linux kernel.
> 
> Signed-off-by: Guenter Roeck 


Reviewed-by: Cédric Le Goater 

We have been chasing a bug for years on the witherspoon-bmc machine 
using UBIfs. It will be useful. 

What kind of issue are you looking at ? 

Thanks,

C. 

> ---
>  hw/block/m25p80.c | 48 ---
>  hw/block/trace-events | 16 +++
>  2 files changed, 38 insertions(+), 26 deletions(-)
> 
> diff --git a/hw/block/m25p80.c b/hw/block/m25p80.c
> index 11ff5b9ad7..63e050d7d3 100644
> --- a/hw/block/m25p80.c
> +++ b/hw/block/m25p80.c
> @@ -32,17 +32,7 @@
>  #include "qemu/module.h"
>  #include "qemu/error-report.h"
>  #include "qapi/error.h"
> -
> -#ifndef M25P80_ERR_DEBUG
> -#define M25P80_ERR_DEBUG 0
> -#endif
> -
> -#define DB_PRINT_L(level, ...) do { \
> -if (M25P80_ERR_DEBUG > (level)) { \
> -fprintf(stderr,  ": %s: ", __func__); \
> -fprintf(stderr, ## __VA_ARGS__); \
> -} \
> -} while (0)
> +#include "trace.h"
>  
>  /* Fields for FlashPartInfo->flags */
>  
> @@ -574,7 +564,8 @@ static void flash_erase(Flash *s, int offset, FlashCMD 
> cmd)
>  abort();
>  }
>  
> -DB_PRINT_L(0, "offset = %#x, len = %d\n", offset, len);
> +trace_m25p80_flash_erase(offset, len);
> +
>  if ((s->pi->flags & capa_to_assert) != capa_to_assert) {
>  qemu_log_mask(LOG_GUEST_ERROR, "M25P80: %d erase size not supported 
> by"
>" device\n", len);
> @@ -607,8 +598,7 @@ void flash_write8(Flash *s, uint32_t addr, uint8_t data)
>  }
>  
>  if ((prev ^ data) & data) {
> -DB_PRINT_L(1, "programming zero to one! addr=%" PRIx32 "  %" PRIx8
> -   " -> %" PRIx8 "\n", addr, prev, data);
> +trace_m25p80_programming_zero_to_one(addr, prev, data);
>  }
>  
>  if (s->pi->flags & EEPROM) {
> @@ -662,6 +652,9 @@ static void complete_collecting_data(Flash *s)
>  
>  s->state = STATE_IDLE;
>  
> +trace_m25p80_complete_collecting(s->cmd_in_progress, n, s->ear,
> + s->cur_addr);
> +
>  switch (s->cmd_in_progress) {
>  case DPP:
>  case QPP:
> @@ -825,7 +818,7 @@ static void reset_memory(Flash *s)
>  break;
>  }
>  
> -DB_PRINT_L(0, "Reset done.\n");
> +trace_m25p80_reset_done();
>  }
>  
>  static void decode_fast_read_cmd(Flash *s)
> @@ -941,9 +934,10 @@ static void decode_qio_read_cmd(Flash *s)
>  
>  static void decode_new_cmd(Flash *s, uint32_t value)
>  {
> -s->cmd_in_progress = value;
>  int i;
> -DB_PRINT_L(0, "decoded new command:%x\n", value);
> +
> +s->cmd_in_progress = value;
> +trace_m25p80_command_decoded(value);
>  
>  if (value != RESET_MEMORY) {
>  s->reset_enable = false;
> @@ -1042,7 +1036,7 @@ static void decode_new_cmd(Flash *s, uint32_t value)
>  break;
>  
>  case JEDEC_READ:
> -DB_PRINT_L(0, "populated jedec code\n");
> +trace_m25p80_populated_jedec();
>  for (i = 0; i < s->pi->id_len; i++) {
>  s->data[i] = s->pi->id[i];
>  }
> @@ -1063,7 +1057,7 @@ static void decode_new_cmd(Flash *s, uint32_t value)
>  case BULK_ERASE_60:
>  case BULK_ERASE:
>  if (s->write_enable) {
> -DB_PRINT_L(0, "chip erase\n");
> +trace_m25p80_chip_erase();
>  flash_erase(s, 0, BULK_ERASE);
>  } else {
>  qemu_log_mask(LOG_GUEST_ERROR, "M25P80: chip erase with write "
> @@ -1184,7 +1178,7 @@ static int m25p80_cs(SSISlave *ss, bool select)
>  s->data_read_loop = false;
>  }
>  
> -DB_PRINT_L(0, "%sselect\n", select ? "de" : "");
> +trace_m25p80_select(select ? "de" : "");
>  
>  return 0;
>  }
> @@ -1194,19 +1188,20 @@ static uint32_t m25p80_transfer8(SSISlave *ss, 
> uint32_t tx)
>  Flash *s = M25P80(ss);
>  uint32_t r = 0;
>  
> +trace_m25p80_transfer(s->state, s->len, s->needed_bytes, s->pos,
> +  s->cur_addr, (uint8_t)tx);
> +
>  switch (s->state) {
>  
>  case STATE_PAGE_PROGRAM:
> -DB_PRINT_L(1, "page program cur_addr=%#" PRIx32 " data=%" PRIx8 "\n",
> -   s->cur_addr, (uint8_t)tx);
> +trace_m25p80_page_program(s->cur_addr, (uint8_t)tx);
>  flash_write8(s, s->cur_addr, (uint8_t)tx);
>  s->cur_addr = (s->cur_addr + 1) & (s->size - 1);
>  break;
>  
>  case STATE_READ:
>  r = s->storage[s->cur_addr];
> -DB_PRINT_L(1, "READ 0x%" PRIx32 "=%" PRIx8 "\n", s->cur_addr,
> -   (uint8_t)r);
> +trace_m25p80_read_byte(s->cur_addr, (uint8_t)r);
>  s->cur_addr = (s->cur_addr + 1) & (s->size - 1);
>  break;
>  
> @@ -1244,6 +1239,7 @@ static uint32_t m25p80_transfer8(SSISlave *ss, uint32_t 
> tx)
>  }
>  
>  r = s->data[s->pos];
> +trace_m

Re: [PULL 0/3] MIPS queue for February 4th, 2020

2020-02-03 Thread no-reply

Patchew URL: 
https://patchew.org/QEMU/1580798552-703-1-git-send-email-aleksandar.marko...@rt-rk.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Subject: [PULL 0/3] MIPS queue for February 4th, 2020
Message-id: 1580798552-703-1-git-send-email-aleksandar.marko...@rt-rk.com
Type: series

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

From https://github.com/patchew-project/qemu
 * [new tag] 
patchew/1580798552-703-1-git-send-email-aleksandar.marko...@rt-rk.com -> 
patchew/1580798552-703-1-git-send-email-aleksandar.marko...@rt-rk.com
Switched to a new branch 'test'
2cb0be8 target/mips: Separate FPU-related helpers into their own file
efaacfa target/mips: Separate CP0-related helpers into their own file
4abf8c1 target/mips: Fix ll/sc after 7dd547e5ab6b31e7a0cfc182d3ad131dd55a948f

=== OUTPUT BEGIN ===
1/3 Checking commit 4abf8c12a48e (target/mips: Fix ll/sc after 
7dd547e5ab6b31e7a0cfc182d3ad131dd55a948f)
2/3 Checking commit efaacfa4747f (target/mips: Separate CP0-related helpers 
into their own file)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#29: 
new file mode 100644

ERROR: space prohibited after that '&' (ctx:WxW)
#202: FILE: target/mips/cp0_helper.c:169:
+tcu = (v >> CP0TCSt_TCU0) & 0xf;
   ^

ERROR: space prohibited after that '&' (ctx:WxW)
#203: FILE: target/mips/cp0_helper.c:170:
+tmx = (v >> CP0TCSt_TMX) & 0x1;
  ^

ERROR: space prohibited after that '&' (ctx:WxW)
#205: FILE: target/mips/cp0_helper.c:172:
+tksu = (v >> CP0TCSt_TKSU) & 0x3;
^

ERROR: space prohibited after that '&' (ctx:WxW)
#1678: FILE: target/mips/cp0_helper.c:1645:
+if (!((env->CP0_VPControl >> CP0VPCtl_DIS) & 1)) {
^

ERROR: space prohibited after that '&' (ctx:WxW)
#1696: FILE: target/mips/cp0_helper.c:1663:
+if ((env->CP0_VPControl >> CP0VPCtl_DIS) & 1) {
  ^

total: 5 errors, 1 warnings, 3358 lines checked

Patch 2/3 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

3/3 Checking commit 2cb0be8a910d (target/mips: Separate FPU-related helpers 
into their own file)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#27: 
new file mode 100644

ERROR: spaces required around that '*' (ctx:WxV)
#1164: FILE: target/mips/fpu_helper.c:1133:
+  float_status *status)  \
^

total: 1 errors, 1 warnings, 3806 lines checked

Patch 3/3 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/1580798552-703-1-git-send-email-aleksandar.marko...@rt-rk.com/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

Re: [RFC PATCH] audio: proper support for float samples in mixeng

2020-02-03 Thread Markus Armbruster

Eric Blake  writes:

> On 2/3/20 12:21 AM, Markus Armbruster wrote:
>> "Kővágó, Zoltán"  writes:
>>
>>> This adds proper support for float samples in mixeng by adding a new
>>> audio format for it.
>>>
>>> Limitations: only native endianness is supported.
>>>
>>> Signed-off-by: Kővágó, Zoltán 
>>> ---
>>>
>>> This patch is meant to be applied on top of "[PATCH] coreaudio: fix 
>>> coreaudio
>>> playback" by Volker Rümelin, available at:
>>> https://lists.nongnu.org/archive/html/qemu-devel/2020-02/msg00114.html
>>>
>>> For more information, please refer to that thread.
>>>
>>> ---
>
>>> +++ b/qapi/audio.json
>>> @@ -276,7 +276,7 @@
>>>   # Since: 4.0
>>>   ##
>>>   { 'enum': 'AudioFormat',
>>> -  'data': [ 'u8', 's8', 'u16', 's16', 'u32', 's32' ] }
>>> +  'data': [ 'u8', 's8', 'u16', 's16', 'u32', 's32', 'f32' ] }
>>> ##
>>>   # @AudiodevDriver:
>>
>> For QAPI:
>> Acked-by: Markus Armbruster 
>
> Is it worth a comment update mentioning that 'f32' is '(since 5.0)'?

Good point; we routinely do that.

Should look like this:

##
# @AudioFormat:
#
# An enumeration of possible audio formats.
#
# @u8: lorem
# @s8: ipsum
# @u16: dolor
# @s16: sit
# @u32: amet
# @s32: consectetur
# @f32: adipisici (since 5.0)
#
# Since: 4.0
##

The generator does not enforce documentation of enum values.

[PULL 0/3] MIPS queue for February 4th, 2020

2020-02-03 Thread Aleksandar Markovic

From: Aleksandar Markovic 

The following changes since commit f31160c7d1b89cfb4dd4001a23575b42141cb0ec:

  Merge remote-tracking branch 'remotes/pmaydell/tags/pull-docs-20200203' into 
staging (2020-02-03 11:14:24 +)

are available in the git repository at:

  https://github.com/AMarkovic/qemu tags/mips-queue-feb-04-2020

for you to fetch changes up to 78e91b612eb746c7916cce3ea91f709b916b007c:

  target/mips: Separate FPU-related helpers into their own file (2020-02-03 
23:55:53 +0100)



MIPS queue for February 4th, 2020

  Content:

- fix for a recent regression in LL/SC
- mechanical reorganization of files containing helpers

  Note:

- six checkpatch errors and two warnings are benign and should be
  ignored



Aleksandar Markovic (2):
  target/mips: Separate CP0-related helpers into their own file
  target/mips: Separate FPU-related helpers into their own file

Alex Richardson (1):
  target/mips: Fix ll/sc after 7dd547e5ab6b31e7a0cfc182d3ad131dd55a948f

 target/mips/Makefile.objs |5 +-
 target/mips/cp0_helper.c  | 1678 +
 target/mips/fpu_helper.c  | 1911 
 target/mips/op_helper.c   | 4422 +
 4 files changed, 4044 insertions(+), 3972 deletions(-)
 create mode 100644 target/mips/cp0_helper.c
 create mode 100644 target/mips/fpu_helper.c

-- 
2.7.4

[PULL 2/3] target/mips: Separate CP0-related helpers into their own file

2020-02-03 Thread Aleksandar Markovic

From: Aleksandar Markovic 

For clarity and easier maintenance, create target/mips/cp0_helper.c, and
move all CP0-related content from target/mips/op_helper.c to that file.

Signed-off-by: Aleksandar Markovic 
Reviewed-by: Aleksandar Rikalo 
Message-Id: <1580745443-24650-2-git-send-email-aleksandar.marko...@rt-rk.com>
---
 target/mips/Makefile.objs |5 +-
 target/mips/cp0_helper.c  | 1678 
 target/mips/op_helper.c   | 1705 +
 3 files changed, 1713 insertions(+), 1675 deletions(-)
 create mode 100644 target/mips/cp0_helper.c

diff --git a/target/mips/Makefile.objs b/target/mips/Makefile.objs
index 3448ad5..3ca2bde 100644
--- a/target/mips/Makefile.objs
+++ b/target/mips/Makefile.objs
@@ -1,5 +1,6 @@
-obj-y += translate.o dsp_helper.o op_helper.o lmi_helper.o helper.o cpu.o
-obj-y += gdbstub.o msa_helper.o
+obj-y += translate.o cpu.o gdbstub.o helper.o
+obj-y += op_helper.o cp0_helper.o
+obj-y += dsp_helper.o lmi_helper.o msa_helper.o
 obj-$(CONFIG_SOFTMMU) += mips-semi.o
 obj-$(CONFIG_SOFTMMU) += machine.o cp0_timer.o
 obj-$(CONFIG_KVM) += kvm.o
diff --git a/target/mips/cp0_helper.c b/target/mips/cp0_helper.c
new file mode 100644
index 000..bbf12e4
--- /dev/null
+++ b/target/mips/cp0_helper.c
@@ -0,0 +1,1678 @@
+/*
+ *  Helpers for emulation of CP0-related MIPS instructions.
+ *
+ *  Copyright (C) 2004-2005  Jocelyn Mayer
+ *  Copyright (C) 2020  Wave Computing, Inc.
+ *  Copyright (C) 2020  Aleksandar Markovic 
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/main-loop.h"
+#include "cpu.h"
+#include "internal.h"
+#include "qemu/host-utils.h"
+#include "exec/helper-proto.h"
+#include "exec/exec-all.h"
+#include "exec/cpu_ldst.h"
+#include "exec/memop.h"
+#include "sysemu/kvm.h"
+
+
+#ifndef CONFIG_USER_ONLY
+/* SMP helpers.  */
+static bool mips_vpe_is_wfi(MIPSCPU *c)
+{
+CPUState *cpu = CPU(c);
+CPUMIPSState *env = &c->env;
+
+/*
+ * If the VPE is halted but otherwise active, it means it's waiting for
+ * an interrupt.\
+ */
+return cpu->halted && mips_vpe_active(env);
+}
+
+static bool mips_vp_is_wfi(MIPSCPU *c)
+{
+CPUState *cpu = CPU(c);
+CPUMIPSState *env = &c->env;
+
+return cpu->halted && mips_vp_active(env);
+}
+
+static inline void mips_vpe_wake(MIPSCPU *c)
+{
+/*
+ * Don't set ->halted = 0 directly, let it be done via cpu_has_work
+ * because there might be other conditions that state that c should
+ * be sleeping.
+ */
+qemu_mutex_lock_iothread();
+cpu_interrupt(CPU(c), CPU_INTERRUPT_WAKE);
+qemu_mutex_unlock_iothread();
+}
+
+static inline void mips_vpe_sleep(MIPSCPU *cpu)
+{
+CPUState *cs = CPU(cpu);
+
+/*
+ * The VPE was shut off, really go to bed.
+ * Reset any old _WAKE requests.
+ */
+cs->halted = 1;
+cpu_reset_interrupt(cs, CPU_INTERRUPT_WAKE);
+}
+
+static inline void mips_tc_wake(MIPSCPU *cpu, int tc)
+{
+CPUMIPSState *c = &cpu->env;
+
+/* FIXME: TC reschedule.  */
+if (mips_vpe_active(c) && !mips_vpe_is_wfi(cpu)) {
+mips_vpe_wake(cpu);
+}
+}
+
+static inline void mips_tc_sleep(MIPSCPU *cpu, int tc)
+{
+CPUMIPSState *c = &cpu->env;
+
+/* FIXME: TC reschedule.  */
+if (!mips_vpe_active(c)) {
+mips_vpe_sleep(cpu);
+}
+}
+
+/**
+ * mips_cpu_map_tc:
+ * @env: CPU from which mapping is performed.
+ * @tc: Should point to an int with the value of the global TC index.
+ *
+ * This function will transform @tc into a local index within the
+ * returned #CPUMIPSState.
+ */
+
+/*
+ * FIXME: This code assumes that all VPEs have the same number of TCs,
+ *which depends on runtime setup. Can probably be fixed by
+ *walking the list of CPUMIPSStates.
+ */
+static CPUMIPSState *mips_cpu_map_tc(CPUMIPSState *env, int *tc)
+{
+MIPSCPU *cpu;
+CPUState *cs;
+CPUState *other_cs;
+int vpe_idx;
+int tc_idx = *tc;
+
+if (!(env->CP0_VPEConf0 & (1 << CP0VPEC0_MVP))) {
+/* Not allowed to address other CPUs.  */
+*tc = env->current_tc;
+return env;
+}
+
+cs = env_cpu(env);
+vpe_idx = tc_idx / cs->nr_threads;
+*tc = tc_idx % cs->nr_threads;
+other_cs = qemu_get_cpu(vpe_idx);
+if (other_cs == NULL) {
+return env;

[PULL 1/3] target/mips: Fix ll/sc after 7dd547e5ab6b31e7a0cfc182d3ad131dd55a948f

2020-02-03 Thread Aleksandar Markovic

From: Alex Richardson 

After 7dd547e5ab6b31e7a0cfc182d3ad131dd55a948f the env->llval value is
loaded as an unsigned value (instead of sign-extended as before).
Therefore, the CMPXCHG in gen_st_cond() in translate.c fails if the sign
bit is set in the loaded value.
Fix this by sign-extending the llval value for the 32-bit case.

I discovered this issue because FreeBSD MIPS64 was looping forever in an
atomic helper function when trying to start /sbin/init.

Signed-off-by: Alex Richardson 
Fixes: 7dd547e5ab6b ("target/mips: Use cpu_*_mmuidx_ra instead of 
MMU_MODE*_SUFFIX")
Buglink: https://bugs.launchpad.net/qemu/+bug/1861605
Cc: Aurelien Jarno 
Cc: Aleksandar Markovic 
Cc: Aleksandar Rikalo 
Cc: Richard Henderson 
Signed-off-by: James Clarke 
Signed-off-by: Aleksandar Markovic 
Reviewed-by: Richard Henderson 
Tested-by: Philippe Mathieu-Daudé 
Message-Id: <20200202153409.28534-1-jrt...@jrtc27.com>
---
 target/mips/op_helper.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/target/mips/op_helper.c b/target/mips/op_helper.c
index 15d05a5..467914d 100644
--- a/target/mips/op_helper.c
+++ b/target/mips/op_helper.c
@@ -305,7 +305,7 @@ static inline hwaddr do_translate_address(CPUMIPSState *env,
 }
 }
 
-#define HELPER_LD_ATOMIC(name, insn, almask)  \
+#define HELPER_LD_ATOMIC(name, insn, almask, do_cast) \
 target_ulong helper_##name(CPUMIPSState *env, target_ulong arg, int mem_idx)  \
 { \
 if (arg & almask) {   \
@@ -316,12 +316,12 @@ target_ulong helper_##name(CPUMIPSState *env, 
target_ulong arg, int mem_idx)  \
 } \
 env->CP0_LLAddr = do_translate_address(env, arg, 0, GETPC()); \
 env->lladdr = arg;\
-env->llval = cpu_##insn##_mmuidx_ra(env, arg, mem_idx, GETPC());  \
+env->llval = do_cast cpu_##insn##_mmuidx_ra(env, arg, mem_idx, GETPC());  \
 return env->llval;\
 }
-HELPER_LD_ATOMIC(ll, ldl, 0x3)
+HELPER_LD_ATOMIC(ll, ldl, 0x3, (target_long)(int32_t))
 #ifdef TARGET_MIPS64
-HELPER_LD_ATOMIC(lld, ldq, 0x7)
+HELPER_LD_ATOMIC(lld, ldq, 0x7, (target_ulong))
 #endif
 #undef HELPER_LD_ATOMIC
 #endif
-- 
2.7.4
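The failure mode described in the commit message can be reproduced
outside QEMU with plain integer arithmetic. This is a simplified sketch
under stated assumptions: the helper names are made up, the 64-bit
register is modelled as a uint64_t, and the compare stands in for the
CMPXCHG check in gen_st_cond().

```c
#include <assert.h>
#include <stdint.h>

/* Zero-extend a loaded 32-bit word: the behaviour that caused the bug. */
uint64_t load_zero_extended(uint32_t word)
{
    return (uint64_t)word;
}

/*
 * Sign-extend a loaded 32-bit word, mirroring the (target_long)(int32_t)
 * cast in the fix. The uint32_t to int32_t conversion wraps on the usual
 * two's complement targets.
 */
uint64_t load_sign_extended(uint32_t word)
{
    return (uint64_t)(int64_t)(int32_t)word;
}

/*
 * Would the store-conditional compare succeed? A 64-bit MIPS register
 * holds 32-bit values sign-extended, so that is what the saved llval is
 * compared against.
 */
int llval_matches(uint64_t saved_llval, uint32_t word_in_memory)
{
    return saved_llval == load_sign_extended(word_in_memory);
}

/* Usage: only values with the sign bit set expose the mismatch. */
void check_sign_bit_case(void)
{
    assert(!llval_matches(load_zero_extended(0x80000000u), 0x80000000u));
    assert(llval_matches(load_sign_extended(0x80000000u), 0x80000000u));
}
```

Values with bit 31 clear compare equal either way, which is why the
regression only showed up as an endless retry loop for some values.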

Re: [PATCH v3 3/7] arm/virt/acpi: remove _ADR from devices identified by _HID

2020-02-03 Thread Michael S. Tsirkin

On Tue, Feb 04, 2020 at 09:43:21AM +0800, Heyi Guo wrote:
> According to ACPI spec, _ADR should be used for device on a bus that
> has a standard enumeration algorithm, but not for device which is on
> system bus and must be enumerated by OSPM. And it is not recommended
> to contain both _HID and _ADR in a single device.
> 
> See ACPI 6.3, section 6.1, top of page 343:
> 
> A device object must contain either an _HID object or an _ADR object,
> but should not contain both.
> 
> (https://uefi.org/sites/default/files/resources/ACPI_6_3_May16.pdf)
> 
> Signed-off-by: Heyi Guo 
> Acked-by: Igor Mammedov 
> Acked-by: Michael S. Tsirkin 


Reviewed-by: Michael S. Tsirkin 

> ---
> Cc: Shannon Zhao 
> Cc: Peter Maydell 
> Cc: "Michael S. Tsirkin" 
> Cc: Igor Mammedov 
> Cc: qemu-...@nongnu.org
> Cc: qemu-devel@nongnu.org
> ---
>  hw/arm/virt-acpi-build.c | 8 
>  1 file changed, 8 deletions(-)
> 
> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> index 9f4c7d1889..be752c0ad8 100644
> --- a/hw/arm/virt-acpi-build.c
> +++ b/hw/arm/virt-acpi-build.c
> @@ -78,11 +78,6 @@ static void acpi_dsdt_add_uart(Aml *scope, const 
> MemMapEntry *uart_memmap,
>   AML_EXCLUSIVE, &uart_irq, 1));
>  aml_append(dev, aml_name_decl("_CRS", crs));
>  
> -/* The _ADR entry is used to link this device to the UART described
> - * in the SPCR table, i.e. SPCR.base_address.address == _ADR.
> - */
> -aml_append(dev, aml_name_decl("_ADR", aml_int(uart_memmap->base)));
> -
>  aml_append(scope, dev);
>  }
>  
> @@ -170,7 +165,6 @@ static void acpi_dsdt_add_pci(Aml *scope, const 
> MemMapEntry *memmap,
>  aml_append(dev, aml_name_decl("_CID", aml_string("PNP0A03")));
>  aml_append(dev, aml_name_decl("_SEG", aml_int(0)));
>  aml_append(dev, aml_name_decl("_BBN", aml_int(0)));
> -aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
>  aml_append(dev, aml_name_decl("_UID", aml_string("PCI0")));
>  aml_append(dev, aml_name_decl("_STR", aml_unicode("PCIe 0 Device")));
>  aml_append(dev, aml_name_decl("_CCA", aml_int(1)));
> @@ -334,7 +328,6 @@ static void acpi_dsdt_add_gpio(Aml *scope, const 
> MemMapEntry *gpio_memmap,
>  {
>  Aml *dev = aml_device("GPO0");
>  aml_append(dev, aml_name_decl("_HID", aml_string("ARMH0061")));
> -aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
>  aml_append(dev, aml_name_decl("_UID", aml_int(0)));
>  
>  Aml *crs = aml_resource_template();
> @@ -364,7 +357,6 @@ static void acpi_dsdt_add_power_button(Aml *scope)
>  {
>  Aml *dev = aml_device(ACPI_POWER_BUTTON_DEVICE);
>  aml_append(dev, aml_name_decl("_HID", aml_string("PNP0C0C")));
> -aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
>  aml_append(dev, aml_name_decl("_UID", aml_int(0)));
>  aml_append(scope, dev);
>  }
> -- 
> 2.19.1

Re: [PATCH v3 2/7] arm/virt/acpi: remove meaningless sub device "RP0" from PCI0

2020-02-03 Thread Michael S. Tsirkin

On Tue, Feb 04, 2020 at 09:43:20AM +0800, Heyi Guo wrote:
> The sub device "RP0" under PCI0 in ACPI/DSDT does not contain any
> method or property other than "_ADR", so it is safe to remove it.
> 
> Signed-off-by: Heyi Guo 
> Acked-by: "Michael S. Tsirkin" 


Reviewed-by: Michael S. Tsirkin 

> ---
> Cc: Peter Maydell 
> Cc: "Michael S. Tsirkin" 
> Cc: Igor Mammedov 
> Cc: Shannon Zhao 
> Cc: qemu-...@nongnu.org
> Cc: qemu-devel@nongnu.org
> ---
>  hw/arm/virt-acpi-build.c | 4 
>  1 file changed, 4 deletions(-)
> 
> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> index bd5f771e9b..9f4c7d1889 100644
> --- a/hw/arm/virt-acpi-build.c
> +++ b/hw/arm/virt-acpi-build.c
> @@ -317,10 +317,6 @@ static void acpi_dsdt_add_pci(Aml *scope, const 
> MemMapEntry *memmap,
>  aml_append(method, aml_return(buf));
>  aml_append(dev, method);
>  
> -Aml *dev_rp0 = aml_device("%s", "RP0");
> -aml_append(dev_rp0, aml_name_decl("_ADR", aml_int(0)));
> -aml_append(dev, dev_rp0);
> -
>  Aml *dev_res0 = aml_device("%s", "RES0");
>  aml_append(dev_res0, aml_name_decl("_HID", aml_string("PNP0C02")));
>  crs = aml_resource_template();
> -- 
> 2.19.1

Re: VW ELF loader

2020-02-03 Thread Thomas Huth

On 04/02/2020 00.26, Paolo Bonzini wrote:
> 
> 
> On Tue, 4 Feb 2020, 00:20 Alexey Kardashevskiy wrote:
> 
> Speaking seriously, what would I put into the guest?
> 
> Only things that would be considered drivers. Ignore the partitions
> issue for now so that you can just pass the device tree services to QEMU
> with hypercalls.
> 
> Netboot's dhcp/tftp/ip/ipv6 client? It is going to be another SLOF,
> smaller but adhoc with only a couple of people knowing it.
> 
> 
> You can generalize and reuse the s390 code. All you have to write is the
> PCI scan and virtio-pci setup.

Well, for netbooting, the s390-ccw bios uses the libnet code from SLOF,
so re-using this for a slim netboot client on ppc64 would certainly be
feasible (especially since there are also already virtio drivers in SLOF
that are written in C), but I think it is not very future proof. The
libnet from SLOF only supports UDP, and no TCP. So for advanced boot
scenarios like booting from HTTP or even HTTPS, you need something else
(i.e. maybe grub is the better option, indeed).

 Thomas

[PATCH] migration: Optimization about wait-unplug migration state

2020-02-03 Thread Keqian Zhu

qemu_savevm_nr_failover_devices() was originally designed to
get the number of failover devices, but it actually returns
the number of "unplug-pending" failover devices now. Moreover,
what drives the migration state to wait-unplug should be the number
of "unplug-pending" failover devices, not all failover devices.

Note also that qemu_savevm_state_guest_unplug_pending() and
qemu_savevm_nr_failover_devices() are almost equivalent in code.
The latter is therefore semantically incorrect and redundant,
so just delete it.

In qemu_savevm_state_guest_unplug_pending(), once we hit an
unplug-pending failover device, we can return true right away
to save CPU time.
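
The early-return idea can be sketched outside QEMU as follows. This is
only an illustration: DemoEntry, demo_pending() and
demo_any_unplug_pending() are hypothetical stand-ins, not the
SaveStateEntry list or the QTAILQ iteration used in migration/savevm.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a savevm handler entry with an optional hook. */
typedef struct DemoEntry {
    bool (*dev_unplug_pending)(void *opaque); /* may be NULL */
    void *opaque;
} DemoEntry;

/* Counts hook invocations so a caller can observe the early exit. */
int demo_hook_calls;

/* Example hook: reports whatever boolean the opaque pointer refers to. */
bool demo_pending(void *opaque)
{
    demo_hook_calls++;
    return *(bool *)opaque;
}

/* Return true as soon as one entry reports a pending unplug. */
bool demo_any_unplug_pending(const DemoEntry *entries, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (entries[i].dev_unplug_pending &&
            entries[i].dev_unplug_pending(entries[i].opaque)) {
            return true; /* no need to visit the remaining entries */
        }
    }
    return false;
}
```

With a list whose second hooked entry is pending, the loop stops there
and later hooks are never called.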

Signed-off-by: Keqian Zhu 
---
Cc: jfreim...@redhat.com
Cc: Juan Quintela 
Cc: "Dr. David Alan Gilbert" 
---
 migration/migration.c |  2 +-
 migration/savevm.c| 24 +++-
 migration/savevm.h|  1 -
 3 files changed, 4 insertions(+), 23 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 3a21a4686c..deedc968cf 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -,7 +,7 @@ static void *migration_thread(void *opaque)
 
 qemu_savevm_state_setup(s->to_dst_file);
 
-if (qemu_savevm_nr_failover_devices()) {
+if (qemu_savevm_state_guest_unplug_pending()) {
 migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
   MIGRATION_STATUS_WAIT_UNPLUG);
 
diff --git a/migration/savevm.c b/migration/savevm.c
index f19cb9ec7a..1d4220ece8 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1140,36 +1140,18 @@ void qemu_savevm_state_header(QEMUFile *f)
 }
 }
 
-int qemu_savevm_nr_failover_devices(void)
+bool qemu_savevm_state_guest_unplug_pending(void)
 {
 SaveStateEntry *se;
-int n = 0;
 
 QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
 if (se->vmsd && se->vmsd->dev_unplug_pending &&
 se->vmsd->dev_unplug_pending(se->opaque)) {
-n++;
-}
-}
-
-return n;
-}
-
-bool qemu_savevm_state_guest_unplug_pending(void)
-{
-SaveStateEntry *se;
-int n = 0;
-
-QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
-if (!se->vmsd || !se->vmsd->dev_unplug_pending) {
-continue;
-}
-if (se->vmsd->dev_unplug_pending(se->opaque)) {
-n++;
+return true;
 }
 }
 
-return n > 0;
+return false;
 }
 
 void qemu_savevm_state_setup(QEMUFile *f)
diff --git a/migration/savevm.h b/migration/savevm.h
index c42b9c80ee..ba64a7e271 100644
--- a/migration/savevm.h
+++ b/migration/savevm.h
@@ -31,7 +31,6 @@
 
 bool qemu_savevm_state_blocked(Error **errp);
 void qemu_savevm_state_setup(QEMUFile *f);
-int qemu_savevm_nr_failover_devices(void);
 bool qemu_savevm_state_guest_unplug_pending(void);
 int qemu_savevm_state_resume_prepare(MigrationState *s);
 void qemu_savevm_state_header(QEMUFile *f);
-- 
2.19.1

Re: [PATCH v5 4/4] spapr: Add Hcalls to support PAPR NVDIMM device

2020-02-03 Thread David Gibson

On Thu, Jan 30, 2020 at 05:48:28AM -0600, Shivaprasad G Bhat wrote:
> This patch implements a few of the necessary hcalls for the nvdimm support.
> 
> PAPR semantics are such that each NVDIMM device comprises multiple
> SCM (Storage Class Memory) blocks. The guest requests the hypervisor to
> bind each of the SCM blocks of the NVDIMM device using hcalls. There can
> be SCM block unbind requests in case of driver errors or unplug(not
> supported now) use cases. The NVDIMM label read/writes are done through
> hcalls.
> 
> Since each virtual NVDIMM device is divided into multiple SCM blocks,
> the bind, unbind, and queries using hcalls on those blocks can come
> independently. This doesn't fit well into the qemu device semantics,
> where the map/unmap are done at the (whole)device/object level granularity.
> The patch doesn't actually bind/unbind on hcalls but lets it happen at the
> device_add/del phase itself instead.
> 
> The guest kernel makes bind/unbind requests for the virtual NVDIMM device
> at the region level granularity. Without interleaving, each virtual NVDIMM
> device is presented as a separate guest physical address range. So, there
> is no way a partial bind/unbind request can come for the vNVDIMM in a
> hcall for a subset of SCM blocks of a virtual NVDIMM. Hence it is safe to
> do bind/unbind everything during the device_add/del.
> 
> Signed-off-by: Shivaprasad G Bhat 

LGTM, apart from some minor nits noted below.

> ---
>  hw/ppc/Makefile.objs   |2 
>  hw/ppc/spapr_nvdimm.c  |  327 
> 
>  include/hw/ppc/spapr.h |8 +
>  3 files changed, 335 insertions(+), 2 deletions(-)
>  create mode 100644 hw/ppc/spapr_nvdimm.c
> 
> diff --git a/hw/ppc/Makefile.objs b/hw/ppc/Makefile.objs
> index a4bac57be6..c3d3cc56eb 100644
> --- a/hw/ppc/Makefile.objs
> +++ b/hw/ppc/Makefile.objs
> @@ -7,7 +7,7 @@ obj-$(CONFIG_PSERIES) += spapr.o spapr_caps.o spapr_vio.o 
> spapr_events.o
>  obj-$(CONFIG_PSERIES) += spapr_hcall.o spapr_iommu.o spapr_rtas.o
>  obj-$(CONFIG_PSERIES) += spapr_pci.o spapr_rtc.o spapr_drc.o
>  obj-$(CONFIG_PSERIES) += spapr_cpu_core.o spapr_ovec.o spapr_irq.o
> -obj-$(CONFIG_PSERIES) += spapr_tpm_proxy.o
> +obj-$(CONFIG_PSERIES) += spapr_tpm_proxy.o spapr_nvdimm.o
>  obj-$(CONFIG_SPAPR_RNG) +=  spapr_rng.o
>  obj-$(call land,$(CONFIG_PSERIES),$(CONFIG_LINUX)) += spapr_pci_vfio.o 
> spapr_pci_nvlink2.o
>  # IBM PowerNV
> diff --git a/hw/ppc/spapr_nvdimm.c b/hw/ppc/spapr_nvdimm.c
> new file mode 100644
> index 00..8d1c2dc009
> --- /dev/null
> +++ b/hw/ppc/spapr_nvdimm.c

It'd be nice to introduce this file in the previous patch and try to
keep as much of the NVDIMM code together, rather than bloating spapr.c
even further.

> @@ -0,0 +1,327 @@
> +/*
> + * QEMU PAPR Storage Class Memory Interfaces
> + *
> + * Copyright (c) 2019-2020, IBM Corporation.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a 
> copy
> + * of this software and associated documentation files (the "Software"), to 
> deal
> + * in the Software without restriction, including without limitation the 
> rights
> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
> + * copies of the Software, and to permit persons to whom the Software is
> + * furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
> FROM,
> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
> + * THE SOFTWARE.
> + */
> +#include "qemu/osdep.h"
> +#include "qapi/error.h"
> +#include "hw/ppc/spapr.h"
> +#include "hw/ppc/spapr_drc.h"
> +#include "hw/mem/nvdimm.h"
> +#include "qemu/range.h"
> +#include "qemu/nvdimm-utils.h"
> +
> +static target_ulong h_scm_read_metadata(PowerPCCPU *cpu,
> +SpaprMachineState *spapr,
> +target_ulong opcode,
> +target_ulong *args)
> +{
> +uint32_t drc_index = args[0];
> +uint64_t offset = args[1];
> +uint64_t numBytesToRead = args[2];

That's a really long name for a local.  How about just 'size' or 'len'?

> +SpaprDrc *drc = spapr_drc_by_index(drc_index);
> +NVDIMMDevice *nvdimm;
> +NVDIMMClass *ddc;
> +uint64_t data = 0;
> +uint8_t buf[8] = { 0 };
> +
> +if (!drc || !drc->dev ||
> +spapr_drc_type(drc) != SPAPR_DR_CONNECTOR_TYPE_PMEM) {
> +return H_PARAMETER;
> +}
> +
> +

Re: [PATCH v5 3/4] spapr: Add NVDIMM device support

2020-02-03 Thread David Gibson

On Thu, Jan 30, 2020 at 05:48:15AM -0600, Shivaprasad G Bhat wrote:
> Add support for NVDIMM devices for sPAPR. Piggyback on existing nvdimm
> device interface in QEMU to support virtual NVDIMM devices for Power.
> Create the required DT entries for the device (some entries have
> dummy values right now).
> 
> The patch creates the required DT node and sends a hotplug
> interrupt to the guest. Guest is expected to undertake the normal
> DR resource add path in response and start issuing PAPR SCM hcalls.
> 
> The device support is verified based on the machine version unlike x86.
> 
> This is how it can be used ..
> Ex :
> For coldplug, the device to be added in qemu command line as shown below
> -object memory-backend-file,id=memnvdimm0,prealloc=yes,mem-path=/tmp/nvdimm0,share=yes,size=1073872896
> -device nvdimm,label-size=128k,uuid=75a3cdd7-6a2f-4791-8d15-fe0a920e8e9e,memdev=memnvdimm0,id=nvdimm0,slot=0
> 
> For hotplug, the device to be added from monitor as below
> object_add memory-backend-file,id=memnvdimm0,prealloc=yes,mem-path=/tmp/nvdimm0,share=yes,size=1073872896
> device_add nvdimm,label-size=128k,uuid=75a3cdd7-6a2f-4791-8d15-fe0a920e8e9e,memdev=memnvdimm0,id=nvdimm0,slot=0
> 
> Signed-off-by: Shivaprasad G Bhat 
> Signed-off-by: Bharata B Rao 
>[Early implementation]

Looking pretty good now.  Just a few minor things to address now,
noted below.

> ---
>  default-configs/ppc64-softmmu.mak |1 
>  hw/mem/Kconfig|2 
>  hw/ppc/spapr.c|  212 
> +++--
>  hw/ppc/spapr_drc.c|   18 +++
>  hw/ppc/spapr_events.c |4 +
>  include/hw/ppc/spapr.h|   11 ++
>  include/hw/ppc/spapr_drc.h|9 ++
>  7 files changed, 243 insertions(+), 14 deletions(-)
> 
> diff --git a/default-configs/ppc64-softmmu.mak 
> b/default-configs/ppc64-softmmu.mak
> index cca52665d9..ae0841fa3a 100644
> --- a/default-configs/ppc64-softmmu.mak
> +++ b/default-configs/ppc64-softmmu.mak
> @@ -8,3 +8,4 @@ CONFIG_POWERNV=y
>  
>  # For pSeries
>  CONFIG_PSERIES=y
> +CONFIG_NVDIMM=y
> diff --git a/hw/mem/Kconfig b/hw/mem/Kconfig
> index 620fd4cb59..2ad052a536 100644
> --- a/hw/mem/Kconfig
> +++ b/hw/mem/Kconfig
> @@ -8,4 +8,4 @@ config MEM_DEVICE
>  config NVDIMM
>  bool
>  default y
> -depends on PC
> +depends on (PC || PSERIES)
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 02cf53fc5b..4ea73c31fe 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -79,6 +79,8 @@
>  #include "hw/ppc/spapr_cpu_core.h"
>  #include "hw/mem/memory-device.h"
>  #include "hw/ppc/spapr_tpm_proxy.h"
> +#include "hw/mem/nvdimm.h"
> +#include "qemu/nvdimm-utils.h"
>  
>  #include "monitor/monitor.h"
>  
> @@ -684,12 +686,22 @@ static int spapr_populate_drmem_v2(SpaprMachineState 
> *spapr, void *fdt,
>  nr_entries++;
>  }
>  
> -/* Entry for DIMM */
>  drc = spapr_drc_by_id(TYPE_SPAPR_DRC_LMB, addr / lmb_size);
>  g_assert(drc);
> -elem = spapr_get_drconf_cell(size / lmb_size, addr,
> - spapr_drc_index(drc), node,
> - SPAPR_LMB_FLAGS_ASSIGNED);
> +
> +if (info->value->type == MEMORY_DEVICE_INFO_KIND_DIMM) {
> +/* Entry for DIMM */
> +elem = spapr_get_drconf_cell(size / lmb_size, addr,
> + spapr_drc_index(drc), node,
> + SPAPR_LMB_FLAGS_ASSIGNED);
> +} else if (info->value->type == MEMORY_DEVICE_INFO_KIND_NVDIMM) {
> +/*
> + * Entry for the NVDIMM occupied area. The area is
> + * hotpluggable after the NVDIMM is unplugged.
> + */
> +elem = spapr_get_drconf_cell(size / lmb_size, addr,
> + spapr_drc_index(drc), -1, 0);
> +}

Rather than having a separate case here, it should work to simply
'continue' the loop on NVDIMM entries.  Then the code above this to
insert unassigned DRCs for memory ranges that don't have (regular)
memory in them yet should already do what you need.
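
One possible reading of that suggestion, as a standalone sketch. All
names here (DemoMemDev, DemoCell, demo_build_cells) are hypothetical
and only model the shape of the loop; this is not the code in
spapr_populate_drmem_v2().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { DEMO_KIND_DIMM, DEMO_KIND_NVDIMM } DemoKind;

/* Hypothetical memory device record, sorted by address by the caller. */
typedef struct {
    DemoKind kind;
    uint64_t addr;
    uint64_t size;
} DemoMemDev;

/* Hypothetical output cell: an address range and whether it is assigned. */
typedef struct {
    uint64_t addr;
    uint64_t size;
    int assigned;
} DemoCell;

/*
 * Walk the devices in [start, end). NVDIMM entries are skipped with
 * 'continue', so their range is later covered by the existing
 * "gap before the next device" or "tail" handling as unassigned space.
 * Returns the number of cells written to out (at most max).
 */
size_t demo_build_cells(const DemoMemDev *devs, size_t ndevs,
                        uint64_t start, uint64_t end,
                        DemoCell *out, size_t max)
{
    uint64_t cur = start;
    size_t n = 0;

    for (size_t i = 0; i < ndevs; i++) {
        if (devs[i].kind == DEMO_KIND_NVDIMM) {
            continue; /* no dedicated case needed */
        }
        if (cur < devs[i].addr) {
            /* Unassigned gap before this DIMM. */
            assert(n < max);
            out[n++] = (DemoCell){ cur, devs[i].addr - cur, 0 };
        }
        /* Entry for the DIMM itself. */
        assert(n < max);
        out[n++] = (DemoCell){ devs[i].addr, devs[i].size, 1 };
        cur = devs[i].addr + devs[i].size;
    }
    if (cur < end) {
        /* Remaining hotpluggable area. */
        assert(n < max);
        out[n++] = (DemoCell){ cur, end - cur, 0 };
    }
    return n;
}
```

In this model an NVDIMM sitting between two DIMMs simply becomes part
of the unassigned gap emitted before the second DIMM.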

>  QSIMPLEQ_INSERT_TAIL(&drconf_queue, elem, entry);
>  nr_entries++;
>  cur_addr = addr + size;
> @@ -1130,6 +1142,85 @@ static void spapr_dt_hypervisor(SpaprMachineState 
> *spapr, void *fdt)
>  }
>  }
>  
> +static int spapr_dt_nvdimm(void *fdt, int parent_offset,
> +   NVDIMMDevice *nvdimm)
> +{
> +int child_offset;
> +char buf[40];

Use g_strdup_printf() rather than fixed sized buffers, please.
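
For illustration, the pattern looks roughly like this. To keep the
sketch free of a GLib dependency, demo_strdup_printf() is a
hypothetical portable stand-in for g_strdup_printf(); in QEMU the GLib
call would be used directly and the result released with g_free(). The
node name format is also made up.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Stand-in for g_strdup_printf(): format into a freshly allocated
 * buffer of exactly the right size. The caller releases it with free().
 */
char *demo_strdup_printf(const char *fmt, ...)
{
    va_list ap, ap2;
    int len;
    char *buf;

    va_start(ap, fmt);
    va_copy(ap2, ap);
    len = vsnprintf(NULL, 0, fmt, ap); /* first pass: measure only */
    va_end(ap);
    if (len < 0) {
        va_end(ap2);
        return NULL;
    }
    buf = malloc((size_t)len + 1);
    if (buf) {
        vsnprintf(buf, (size_t)len + 1, fmt, ap2); /* second pass: write */
    }
    va_end(ap2);
    return buf;
}

/* Build a (hypothetical) device tree node name without a fixed buffer. */
char *demo_node_name(unsigned drc_idx)
{
    return demo_strdup_printf("demo-node@%x", drc_idx);
}
```

The benefit is that no buffer length such as 40 has to be guessed, so
a longer format or index cannot silently truncate the name.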

> +SpaprDrc *drc;
> +uint32_t drc_idx;
> +uint32_t node = object_property_get_uint(OBJECT(nvdimm), 
> PC_DIMM_NODE_PROP,
> + &error_abort);
> +uint64_t slot = object_property_get_uint(OBJECT(nvdimm), 
> PC_DIMM_SLOT_PROP,
> +

Re: [PATCH] e1000e: Avoid hw_error if legacy mode used

2020-02-03 Thread Jason Wang




On 2020/1/28 12:03 AM, Yuri Benditovich wrote:

https://bugzilla.redhat.com/show_bug.cgi?id=1787142
The emulation issues hw_error if PSRCTL register
is written, for example, with zero value.
Such configuration does not present any problem when
DTYP bits of RCTL register define legacy format of
transfer descriptors. Current commit discards check
for BSIZE0 and BSIZE1 when legacy mode used.
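
The resulting check can be modelled as a small predicate. This is a
sketch only: the DEMO_* mask values below are assumptions chosen for
illustration and should be checked against the e1000e register
definitions, and demo_psrctl_ok() is a hypothetical helper, not a
function in hw/net/e1000e_core.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed mask values, for illustration only. */
#define DEMO_RCTL_DTYP_MASK     0x00000C00u
#define DEMO_PSRCTL_BSIZE0_MASK 0x0000007Fu
#define DEMO_PSRCTL_BSIZE1_MASK 0x00003F00u

/*
 * Return true if a PSRCTL write is acceptable for the current RCTL.
 * With DTYP == 0 (legacy descriptors) the BSIZE fields are not checked.
 */
bool demo_psrctl_ok(uint32_t rctl, uint32_t psrctl)
{
    if ((rctl & DEMO_RCTL_DTYP_MASK) == 0) {
        return true;
    }
    return (psrctl & DEMO_PSRCTL_BSIZE0_MASK) != 0 &&
           (psrctl & DEMO_PSRCTL_BSIZE1_MASK) != 0;
}
```

Under these assumptions, writing zero to PSRCTL is accepted in legacy
mode and rejected only when a non-legacy descriptor type is selected.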

Signed-off-by: Yuri Benditovich 
---
  hw/net/e1000e_core.c | 13 -
  1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index 5b05c8ea8a..94ea34dca5 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -2813,12 +2813,15 @@ e1000e_set_eitr(E1000ECore *core, int index, uint32_t 
val)
  static void
  e1000e_set_psrctl(E1000ECore *core, int index, uint32_t val)
  {
-if ((val & E1000_PSRCTL_BSIZE0_MASK) == 0) {
-hw_error("e1000e: PSRCTL.BSIZE0 cannot be zero");
-}
+if (core->mac[RCTL] & E1000_RCTL_DTYP_MASK) {
+
+if ((val & E1000_PSRCTL_BSIZE0_MASK) == 0) {
+hw_error("e1000e: PSRCTL.BSIZE0 cannot be zero");
+}
  
-if ((val & E1000_PSRCTL_BSIZE1_MASK) == 0) {

-hw_error("e1000e: PSRCTL.BSIZE1 cannot be zero");
+if ((val & E1000_PSRCTL_BSIZE1_MASK) == 0) {
+hw_error("e1000e: PSRCTL.BSIZE1 cannot be zero");
+}
  }
  
  core->mac[PSRCTL] = val;



Applied.

Thanks

Re: [PATCH v4 00/14] Fixes for DP8393X SONIC device emulation

2020-02-03 Thread Jason Wang




On 2020/1/29 5:27 PM, Finn Thain wrote:

Hi All,

There are bugs in the emulated dp8393x device that can stop packet
reception in a Linux/m68k guest (q800 machine).

With a Linux/m68k v5.5 guest (q800), it's possible to remotely trigger
an Oops by sending ping floods.

With a Linux/mips guest (magnum machine), the driver fails to probe
the dp8393x device.

With a NetBSD/arc 5.1 guest (magnum), the bugs in the device can be
fatal to the guest kernel.

Whilst debugging the device, I found that the receiver algorithm
differs from the one described in the National Semiconductor
datasheet.

This patch series resolves these bugs.

AFAIK, all bugs in the Linux sonic driver were fixed in Linux v5.5.
---
Changed since v1:
  - Minor revisions as described beneath commit logs.
  - Dropped patches 4/10 and 7/10.
  - Added 5 new patches.

Changed since v2:
  - Minor revisions as described beneath commit logs.
  - Dropped patch 13/13.
  - Added 2 new patches.

Changed since v3:
  - Replaced patch 13/14 with patch suggested by Philippe Mathieu-Daudé.


Finn Thain (14):
   dp8393x: Mask EOL bit from descriptor addresses
   dp8393x: Always use 32-bit accesses
   dp8393x: Clean up endianness hacks
   dp8393x: Have dp8393x_receive() return the packet size
   dp8393x: Update LLFA and CRDA registers from rx descriptor
   dp8393x: Clear RRRA command register bit only when appropriate
   dp8393x: Implement packet size limit and RBAE interrupt
   dp8393x: Don't clobber packet checksum
   dp8393x: Use long-word-aligned RRA pointers in 32-bit mode
   dp8393x: Pad frames to word or long word boundary
   dp8393x: Clear descriptor in_use field to release packet
   dp8393x: Always update RRA pointers and sequence numbers
   dp8393x: Don't reset Silicon Revision register
   dp8393x: Don't stop reception upon RBE interrupt assertion

  hw/net/dp8393x.c | 202 +++
  1 file changed, 134 insertions(+), 68 deletions(-)



Applied.

Thanks

Re: [RFC PATCH] hw/arm/virt: Support NMI injection

2020-02-03 Thread Gavin Shan


On 1/31/20 8:39 PM, Marc Zyngier wrote:

On 2020-01-31 06:59, Gavin Shan wrote:

On 1/29/20 8:04 PM, Marc Zyngier wrote:

On 2020-01-29 02:44, Alexey Kardashevskiy wrote:

On 28/01/2020 17:48, Gavin Shan wrote:

but a NMI is injected
through LAPIC on x86. So I'm not sure which architecture (system reset on
ppc or injecting NMI on x86) aarch64 should follow.


I'd say whatever triggers in-kernel debugger or kdump but I am not
familiar with ARM at all :)


All that is completely OS specific, and has no relation to the architecture.
As I mentioned in another part of the thread, the closest thing to this
would be to implement SDEI together with an IMPDEF mechanism to enter it
(or even generate a RAS error).

On the other hand, SDEI is pretty horrible, and means either KVM or QEMU
acting like a firmware for the guest. To say that I'm not keen is a massive
understatement.

 M.


Marc, could you please explain a bit about "IMPDEF mechanism"? I'm not sure if
it means a non-standard SDEI event should be used, corresponding to the HMP/QMP
"nmi" command.


SDEI doesn't quite describe *why* and *how* you enter it. You just get there by
some means (SError, Group-0 interrupt, or IMPlementation DEFined mechanism).
It is then for the SDEI implementation to sort it out and enter the OS using the
registered entry point.



Thanks for the explanation, which makes things much clearer.


Also, If I'm correct, you agree that a crash dump should be triggered on arm64
guest once HMP/QMP "nmi" command is issued?


No, I don't agree. QEMU and KVM are OS agnostic, and there is nothing in the
ARMv8 architecture that talks about "crash dumps".  If your "nmi" command
generates a SDEI event and that event gets dispatched to the guest, it is
entirely the guest's responsibility to do whatever it wants. We should stay
clear of assuming behaviours.



Ok. Thank you for your clarification.


I also dug into SDEI a bit. It seems the SDEI support in QEMU isn't upstream yet:

https://patchew.org/QEMU/20191105091056.9541-1-guoh...@huawei.com/


And I'm glad. SDEI, as I said, is absolutely horrible. I'm also very fortunate
to have been CC'd on this series on an email address I cannot read.
This would have huge impacts on both QEMU and KVM, and this needs more than
a knee jerk reaction to support a QEMU command.

And to be honest, if what you require is the guest kernel to panic, why don't
you just implement a QEMU-specific driver in Linux that does just that?
Some kind of watchdog driver, maybe?



Marc, sorry for the delay in getting back to you; I wanted to figure out a
mechanism that gives behaviour similar to x86/ppc, where an NMI (or reset
exception) is received as an indication of an error and possibly panics and
restarts the guest kernel.

The mechanism I figured out is to inject an SError into the guest, as the
snippet below shows. It produces a panic and a guest reboot, which looks
similar to what x86/ppc have. I can post the patch as an RFC if it's the right
direction to go :)

Note: I'm still investigating the code to see how an SError can be injected
when TCG is used. I think we need the same function when TCG is enabled, or
it's something for the future.

static void do_inject_nmi_kvm(CPUState *cpu, Error **errp)
{
struct kvm_vcpu_events events;
int ret;

 :
memset(&events, 0, sizeof(events));
events.exception.serror_pending = 1;
ret = kvm_vcpu_ioctl(CPU(cpu), KVM_SET_VCPU_EVENTS, &events);
 :
}

# echo 1 > /proc/sys/kernel/panic
# (qemu) nmi
[  812.510613] SError Interrupt on CPU0, code 0xbf00 -- SError
[  812.510617] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 
5.5.0-rc2-00187-gf72202430e30 #2
[  812.510617] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
[  812.510618] pstate: 40400085 (nZcv daIf +PAN -UAO)
[  812.510619] pc : cpu_do_idle+0x48/0x58
[  812.510619] lr : arch_cpu_idle+0x30/0x238
[  812.510620] sp : 8000112c3e80
[  812.510620] pmr_save: 00f0
[  812.510621] x29: 8000112c3e80 x28: 40e10018
[  812.510622] x27: 00033e50d340 x26: 
[  812.510623] x25:  x24: 8000112ca21c
[  812.510624] x23: 800010f98cb8 x22: 8000112c98c8
[  812.510625] x21: 8000112ca1f8 x20: 0001
[  812.510626] x19: 800010f86018 x18: 
[  812.510627] x17:  x16: 
[  812.510628] x15:  x14: 
[  812.510629] x13:  x12: 0002fe640100
[  812.510630] x11:  x10: 09f0
[  812.510631] x9 : 800010088928 x8 : 8000112d28d0
[  812.510632] x7 : 8002ed63a000 x6 : 002fe2092876
[  812.510633] x5 : 00ff x4 : 8002ed63a000
[  812.510634] x3 : 1bce x2 : 00f0
[  812.510635] x1 :  x0 : 0060
[  812.510636] Kernel panic - not syncing: Asynchronous SError Interrupt
[  812.510637] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 
5.5.0-rc2-00187-gf72202430e30 #2
[

Re: [PATCH 0/4] migration: Replace gemu_log with qemu_log

2020-02-03 Thread Josh Kunz

On Mon, Jan 20, 2020 at 3:36 AM Alex Bennée  wrote:
> Ahh the default build target for the BSDs is "check" but as bsd-user
> doesn't have any checks it doesn't end up building. You can force it
> with
>
>   make vm-build-netbsd EXTRA_CONFIGURE_OPTS="--disable-system" 
> BUILD_TARGET="all"
>
> It would be worth plumbing in the tests/tcg tests at some point. I
> suspect most of the user-mode tests are more POSIX than Linux.

Neat, thanks Alex! I've run this and verified the BSD user-mode binaries built.

Re: [PATCH v2 1/4] linux-user: Use `qemu_log' for non-strace logging

2020-02-03 Thread Josh Kunz

I've switched it to a LOG_UNIMP, similar to the one several lines
below. I will follow up with a change to switch this to an assert as
recommended.


On Tue, Jan 28, 2020 at 9:07 AM Laurent Vivier  wrote:
>
> On 28/01/2020 at 17:53, Alex Bennée wrote:
> >
> > Laurent Vivier  writes:
> >
> >> On 17/01/2020 at 20:28, Josh Kunz wrote:
> >>> Since most calls to `gemu_log` are actually logging unimplemented 
> >>> features,
> >>> this change replaces most non-strace calls to `gemu_log` with calls to
> >>> `qemu_log_mask(LOG_UNIMP, ...)`.  This allows the user to easily log to
> >>> a file, and to mask out these log messages if they desire.
> >>>
> >>> Note: This change is slightly backwards incompatible, since now these
> >>> "unimplemented" log messages will not be logged by default.
> >>
> >> This is a good incompatibility as these messages were unexpected by  the
> >> tools catching stderr. They don't happen on "real" systems.
> >>
> >> ...
> >>> diff --git a/linux-user/syscall.c b/linux-user/syscall.c
> >>> index 249e4b95fc..629f3a21b5 100644
> >>> --- a/linux-user/syscall.c
> >>> +++ b/linux-user/syscall.c
> >>> @@ -1545,20 +1545,18 @@ static inline abi_long target_to_host_cmsg(struct 
> >>> msghdr *msgh,
> >>>  - sizeof(struct target_cmsghdr);
> >>>
> >>>  space += CMSG_SPACE(len);
> >>> -if (space > msgh->msg_controllen) {
> >>> -space -= CMSG_SPACE(len);
> >>> -/* This is a QEMU bug, since we allocated the payload
> >>> - * area ourselves (unlike overflow in host-to-target
> >>> - * conversion, which is just the guest giving us a buffer
> >>> - * that's too small). It can't happen for the payload types
> >>> - * we currently support; if it becomes an issue in future
> >>> - * we would need to improve our allocation strategy to
> >>> - * something more intelligent than "twice the size of the
> >>> - * target buffer we're reading from".
> >>> - */
> >>> -gemu_log("Host cmsg overflow\n");
> >>> -break;
> >>> -}
> >>> +
> >>> +/*
> >>> + * This is a QEMU bug, since we allocated the payload
> >>> + * area ourselves (unlike overflow in host-to-target
> >>> + * conversion, which is just the guest giving us a buffer
> >>> + * that's too small). It can't happen for the payload types
> >>> + * we currently support; if it becomes an issue in future
> >>> + * we would need to improve our allocation strategy to
> >>> + * something more intelligent than "twice the size of the
> >>> + * target buffer we're reading from".
> >>> + */
> >>> +assert(space > msgh->msg_controllen && "Host cmsg overflow");
>
> Should it be in fact :
>
>   assert(space <= msgh->msg_controllen && "Host cmsg overflow");
>
> >>>  if (tswap32(target_cmsg->cmsg_level) == TARGET_SOL_SOCKET) {
> >>>  cmsg->cmsg_level = SOL_SOCKET;
> >>
> >> Could you move this to a separate patch: you are not using qemu_log()
> >> here and I'm not convinced that crashing is better than ignoring the
> >> remaining part of the buffer.
> >
> > I suggested it should be an assert in the first place. It certainly
> > makes sense to keep it in a separate patch though. I guess you could
> > argue for:
> >
> >   qemu_log_mask(LOG_UNIMP, "%s: unhandled message size");
> >
> > but is it really better to partially work and continue? It seems like
> > you would get more subtle hidden bugs.
>
> ok, you're right. crash seems to be a better solution.
>
> So, we only need to move this change to a separate patch.
>
> Thanks,
> Laurent
>

[PATCH v3 4/4] bsd-user: Replace gemu_log with qemu_log

2020-02-03 Thread Josh Kunz

gemu_log is an old logging mechanism used to implement strace logging in
the bsd-user tree. It logs directly to stderr and cannot easily be
redirected. This change instead causes strace to log via the qemu_log
subsystem which has fine-grained logging control, and a centralized
mechanism for log redirection. bsd-user does not currently implement any
logging redirection options, or log masking options, but this change
brings it more in line with the linux-user tree.
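
The way the strace flag is folded into the log mask can be sketched in
isolation. Everything named demo_* or DEMO_* below is hypothetical:
the parser recognises only two made-up item names and the bit values
are arbitrary, unlike the real qemu_str_to_log_mask() and LOG_STRACE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Arbitrary bit assignments, for illustration only. */
#define DEMO_LOG_UNIMP  (1 << 0)
#define DEMO_LOG_STRACE (1 << 1)

/* Hypothetical parser: accepts only "unimp" or "strace"; 0 means invalid. */
int demo_str_to_log_mask(const char *s)
{
    if (s == NULL) {
        return 0;
    }
    if (strcmp(s, "unimp") == 0) {
        return DEMO_LOG_UNIMP;
    }
    if (strcmp(s, "strace") == 0) {
        return DEMO_LOG_STRACE;
    }
    return 0;
}

/*
 * Combine an optional mask string with the strace switch.
 * Returns the mask to apply, or -1 if the mask string was invalid.
 */
int demo_compute_log_mask(const char *log_mask, bool enable_strace)
{
    int mask = 0;

    if (log_mask || enable_strace) {
        mask = demo_str_to_log_mask(log_mask);
        if (log_mask && !mask) {
            return -1; /* an explicit but unparsable mask is an error */
        }
        if (enable_strace) {
            mask |= DEMO_LOG_STRACE;
        }
    }
    return mask;
}
```

The point of the `log_mask && !mask` test is that an empty mask is
only an error when the user actually supplied a mask string; strace
alone must still be able to enable logging.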

Signed-off-by: Josh Kunz 
---
 bsd-user/main.c| 29 +
 bsd-user/qemu.h|  2 --
 bsd-user/strace.c  | 32 +++-
 bsd-user/syscall.c | 31 +++
 4 files changed, 47 insertions(+), 47 deletions(-)

diff --git a/bsd-user/main.c b/bsd-user/main.c
index 770c2b267a..d024ac067f 100644
--- a/bsd-user/main.c
+++ b/bsd-user/main.c
@@ -55,15 +55,6 @@ enum BSDType bsd_type;
by remapping the process stack directly at the right place */
 unsigned long x86_stack_size = 512 * 1024;
 
-void gemu_log(const char *fmt, ...)
-{
-va_list ap;
-
-va_start(ap, fmt);
-vfprintf(stderr, fmt, ap);
-va_end(ap);
-}
-
 #if defined(TARGET_I386)
 int cpu_get_pic_interrupt(CPUX86State *env)
 {
@@ -731,6 +722,7 @@ int main(int argc, char **argv)
 const char *cpu_type;
 const char *log_file = NULL;
 const char *log_mask = NULL;
+bool enable_strace = false;
 struct target_pt_regs regs1, *regs = ®s1;
 struct image_info info1, *info = &info1;
 TaskState ts1, *ts = &ts1;
@@ -845,7 +837,7 @@ int main(int argc, char **argv)
 } else if (!strcmp(r, "singlestep")) {
 singlestep = 1;
 } else if (!strcmp(r, "strace")) {
-do_strace = 1;
+enable_strace = true;
 } else if (!strcmp(r, "trace")) {
 g_free(trace_file);
 trace_file = trace_opt_parse(optarg);
@@ -854,17 +846,26 @@ int main(int argc, char **argv)
 }
 }
 
+if (getenv("QEMU_STRACE")) {
+enable_strace = true;
+}
+
 /* init debug */
 qemu_log_needs_buffers();
 qemu_set_log_filename(log_file, &error_fatal);
-if (log_mask) {
+if (log_mask || enable_strace) {
 int mask;
 
 mask = qemu_str_to_log_mask(log_mask);
-if (!mask) {
+if (log_mask && !mask) {
 qemu_print_log_usage(stdout);
 exit(1);
 }
+
+if (enable_strace) {
+mask |= LOG_STRACE;
+}
+
 qemu_set_log(mask);
 }
 
@@ -916,10 +917,6 @@ int main(int argc, char **argv)
 #endif
 thread_cpu = cpu;
 
-if (getenv("QEMU_STRACE")) {
-do_strace = 1;
-}
-
 target_environ = envlist_to_environ(envlist, NULL);
 envlist_free(envlist);
 
diff --git a/bsd-user/qemu.h b/bsd-user/qemu.h
index 09e8aed9c7..5762e3a6e5 100644
--- a/bsd-user/qemu.h
+++ b/bsd-user/qemu.h
@@ -152,7 +152,6 @@ abi_long do_netbsd_syscall(void *cpu_env, int num, abi_long 
arg1,
 abi_long do_openbsd_syscall(void *cpu_env, int num, abi_long arg1,
 abi_long arg2, abi_long arg3, abi_long arg4,
 abi_long arg5, abi_long arg6);
-void gemu_log(const char *fmt, ...) GCC_FMT_ATTR(1, 2);
 extern THREAD CPUState *thread_cpu;
 void cpu_loop(CPUArchState *env);
 char *target_strerror(int err);
@@ -188,7 +187,6 @@ print_openbsd_syscall(int num,
   abi_long arg1, abi_long arg2, abi_long arg3,
   abi_long arg4, abi_long arg5, abi_long arg6);
 void print_openbsd_syscall_ret(int num, abi_long ret);
-extern int do_strace;
 
 /* signal.c */
 void process_pending_signals(CPUArchState *cpu_env);
diff --git a/bsd-user/strace.c b/bsd-user/strace.c
index fa66fe1ee2..6ee1510555 100644
--- a/bsd-user/strace.c
+++ b/bsd-user/strace.c
@@ -23,8 +23,6 @@
 
 #include "qemu.h"
 
-int do_strace;
-
 /*
  * Utility functions
  */
@@ -36,17 +34,17 @@ static void print_sysctl(const struct syscallname *name, 
abi_long arg1,
 uint32_t i;
 int32_t *namep;
 
-gemu_log("%s({ ", name->name);
+qemu_log("%s({ ", name->name);
 namep = lock_user(VERIFY_READ, arg1, sizeof(int32_t) * arg2, 1);
 if (namep) {
 int32_t *p = namep;
 
 for (i = 0; i < (uint32_t)arg2; i++) {
-gemu_log("%d ", tswap32(*p++));
+qemu_log("%d ", tswap32(*p++));
 }
 unlock_user(namep, arg1, 0);
 }
-gemu_log("}, %u, 0x" TARGET_ABI_FMT_lx ", 0x" TARGET_ABI_FMT_lx ", 0x"
+qemu_log("}, %u, 0x" TARGET_ABI_FMT_lx ", 0x" TARGET_ABI_FMT_lx ", 0x"
 TARGET_ABI_FMT_lx ", 0x" TARGET_ABI_FMT_lx ")",
 (uint32_t)arg2, arg3, arg4, arg5, arg6);
 }
@@ -62,7 +60,7 @@ static void print_execve(const struct syscallname *name, 
abi_long arg1,
 if (s == NULL) {
 return;
 }
-gemu_log("%s(\"%s\",{", name->name, s);
+qemu_log("%s(\"%s\",{", name->name, s);
 unlock_user(s, arg1, 0);
 
 for (arg_ptr_addr = arg2; ; arg_p

[PATCH v3 3/4] linux-user: remove gemu_log from the linux-user tree

2020-02-03 Thread Josh Kunz

Now that all uses have been migrated to `qemu_log' it is no longer
needed.

Reviewed-by: Laurent Vivier 
Signed-off-by: Josh Kunz 
---
 linux-user/main.c | 9 -
 linux-user/qemu.h | 1 -
 2 files changed, 10 deletions(-)

diff --git a/linux-user/main.c b/linux-user/main.c
index 8f1d07cdd6..22578b1633 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -112,15 +112,6 @@ const char *qemu_uname_release;
by remapping the process stack directly at the right place */
 unsigned long guest_stack_size = 8 * 1024 * 1024UL;
 
-void gemu_log(const char *fmt, ...)
-{
-va_list ap;
-
-va_start(ap, fmt);
-vfprintf(stderr, fmt, ap);
-va_end(ap);
-}
-
 #if defined(TARGET_I386)
 int cpu_get_pic_interrupt(CPUX86State *env)
 {
diff --git a/linux-user/qemu.h b/linux-user/qemu.h
index 2421dc7afd..792c74290f 100644
--- a/linux-user/qemu.h
+++ b/linux-user/qemu.h
@@ -211,7 +211,6 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
 abi_long arg2, abi_long arg3, abi_long arg4,
 abi_long arg5, abi_long arg6, abi_long arg7,
 abi_long arg8);
-void gemu_log(const char *fmt, ...) GCC_FMT_ATTR(1, 2);
 extern __thread CPUState *thread_cpu;
 void cpu_loop(CPUArchState *env);
 const char *target_strerror(int err);
-- 
2.25.0.341.g760bfbb309-goog

Re: [PATCH v2 2/4] linux-user: Use `qemu_log' for strace

2020-02-03 Thread Josh Kunz

On Tue, Jan 28, 2020 at 7:07 AM Laurent Vivier  wrote:
>
> On 17/01/2020 at 20:28, Josh Kunz wrote:
> > This change switches linux-user strace logging to use the newer `qemu_log`
> > logging subsystem rather than the older `gemu_log` (notice the "g")
> > logger. `qemu_log` has several advantages, namely that it allows logging
> > to a file, and provides a more unified interface for configuration
> > of logging (via the QEMU_LOG environment variable or options).
> >
> > This change introduces a new log mask: `LOG_STRACE` which is used for
> > logging of user-mode strace messages.
> >
> > Signed-off-by: Josh Kunz 
> > ---
> >  include/qemu/log.h   |   2 +
> >  linux-user/main.c|  30 ++-
> >  linux-user/qemu.h|   1 -
> >  linux-user/signal.c  |   2 +-
> >  linux-user/strace.c  | 479 ++-
> >  linux-user/syscall.c |  13 +-
> >  util/log.c   |   2 +
> >  7 files changed, 278 insertions(+), 251 deletions(-)
> >
> ...
> > diff --git a/linux-user/syscall.c b/linux-user/syscall.c
> > index 629f3a21b5..54e60f3807 100644
> > --- a/linux-user/syscall.c
> > +++ b/linux-user/syscall.c
> > @@ -12098,14 +12098,15 @@ abi_long do_syscall(void *cpu_env, int num, 
> > abi_long arg1,
> >  record_syscall_start(cpu, num, arg1,
> >   arg2, arg3, arg4, arg5, arg6, arg7, arg8);
> >
> > -if (unlikely(do_strace)) {
> > +if (unlikely(qemu_loglevel_mask(LOG_STRACE))) {
> >  print_syscall(num, arg1, arg2, arg3, arg4, arg5, arg6);
> > -ret = do_syscall1(cpu_env, num, arg1, arg2, arg3, arg4,
> > -  arg5, arg6, arg7, arg8);
> > +}
> > +
> > +ret = do_syscall1(cpu_env, num, arg1, arg2, arg3, arg4,
> > +  arg5, arg6, arg7, arg8);
> > +
> > +if (unlikely(qemu_loglevel_mask(LOG_STRACE))) {
> >  print_syscall_ret(num, ret);
> > -} else {
> > -ret = do_syscall1(cpu_env, num, arg1, arg2, arg3, arg4,
> > -  arg5, arg6, arg7, arg8);
> >  }
> >
> >  record_syscall_return(cpu, num, ret);
>
> In terms of performance, perhaps it would be better to test the mask only
> once, as was done before?

Modern compilers will generate functionally identical sequences for
testing once or testing twice (which is to say, they recognize they are
the same compare). See https://godbolt.org/z/VyrMHf for an example. IMO
testing twice is nicer to read, so I'm leaving it that way for now unless
you object.
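
For readers following the thread, the two control-flow shapes under discussion can be sketched in isolation. This is only an illustration: `note`, `fake_syscall`, `dispatch_test_once` and `dispatch_test_twice` are hypothetical stand-ins, not QEMU functions. It shows that both shapes produce the same result and the same trace; it says nothing about the generated code, which depends on the compiler and on whether the tested condition can change across the call.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical trace buffer standing in for the strace output. */
static char trace[64];

static void note(const char *s)
{
    strncat(trace, s, sizeof(trace) - strlen(trace) - 1);
}

/* Hypothetical stand-in for the real syscall dispatch. */
static int fake_syscall(int num)
{
    note("S");
    return num + 1;
}

/* Old shape: one test, with the syscall call duplicated in both branches. */
static int dispatch_test_once(bool strace_on, int num)
{
    int ret;

    if (strace_on) {
        note("<");
        ret = fake_syscall(num);
        note(">");
    } else {
        ret = fake_syscall(num);
    }
    return ret;
}

/* New shape: the condition is tested twice, the syscall call appears once. */
static int dispatch_test_twice(bool strace_on, int num)
{
    int ret;

    if (strace_on) {
        note("<");
    }
    ret = fake_syscall(num);
    if (strace_on) {
        note(">");
    }
    return ret;
}
```

The second shape keeps a single call site for the syscall, which is the readability argument made above.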

[PATCH v3 2/4] linux-user: Use `qemu_log' for strace

2020-02-03 Thread Josh Kunz

This change switches linux-user strace logging to use the newer `qemu_log`
logging subsystem rather than the older `gemu_log` (notice the "g")
logger. `qemu_log` has several advantages, namely that it allows logging
to a file, and provides a more unified interface for configuration
of logging (via the QEMU_LOG environment variable or options).

This change introduces a new log mask: `LOG_STRACE` which is used for
logging of user-mode strace messages.
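
As a rough sketch of how the legacy strace switch and the general log mask can coexist, the combination done in main() by this patch can be written as a small pure function. The value of `LOG_STRACE` is taken from the include/qemu/log.h hunk below; `effective_log_mask` and `mask_enabled` are hypothetical helper names used only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Bit value as added to include/qemu/log.h by this patch. */
#define LOG_STRACE (1 << 19)

/*
 * Hypothetical helper: fold the legacy strace switch into the mask parsed
 * from the log option or QEMU_LOG, so that neither setting overwrites the
 * other.
 */
static int effective_log_mask(int last_log_mask, bool enable_strace)
{
    return last_log_mask | (enable_strace ? LOG_STRACE : 0);
}

/* Hypothetical helper: is any of the wanted bits set in the current mask? */
static bool mask_enabled(int current_mask, int wanted)
{
    return (current_mask & wanted) != 0;
}
```

Computing the final mask once, after all arguments and environment variables have been parsed, is what makes the result independent of the order in which the options were given.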

Reviewed-by: Laurent Vivier 
Signed-off-by: Josh Kunz 
---
 include/qemu/log.h   |   2 +
 linux-user/main.c|  30 ++-
 linux-user/qemu.h|   1 -
 linux-user/signal.c  |   2 +-
 linux-user/strace.c  | 479 ++-
 linux-user/syscall.c |  13 +-
 util/log.c   |   2 +
 7 files changed, 278 insertions(+), 251 deletions(-)

diff --git a/include/qemu/log.h b/include/qemu/log.h
index e0f4e40628..f4724f7330 100644
--- a/include/qemu/log.h
+++ b/include/qemu/log.h
@@ -62,6 +62,8 @@ static inline bool qemu_log_separate(void)
 #define CPU_LOG_TB_OP_IND  (1 << 16)
 #define CPU_LOG_TB_FPU (1 << 17)
 #define CPU_LOG_PLUGIN (1 << 18)
+/* LOG_STRACE is used for user-mode strace logging. */
+#define LOG_STRACE (1 << 19)
 
 /* Lock output for a series of related logs.  Since this is not needed
  * for a single qemu_log / qemu_log_mask / qemu_log_mask_and_addr, we
diff --git a/linux-user/main.c b/linux-user/main.c
index fba833aac9..8f1d07cdd6 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -60,6 +60,19 @@ unsigned long mmap_min_addr;
 unsigned long guest_base;
 int have_guest_base;
 
+/*
+ * Used to implement backwards-compatibility for the `-strace`, and
+ * QEMU_STRACE options. Without this, the QEMU_LOG can be overwritten by
+ * -strace, or vice versa.
+ */
+static bool enable_strace;
+
+/*
+ * The last log mask given by the user in an environment variable or argument.
+ * Used to support command line arguments overriding environment variables.
+ */
+static int last_log_mask;
+
 /*
  * When running 32-on-64 we should make sure we can fit all of the possible
  * guest address space into a contiguous chunk of virtual host memory.
@@ -223,15 +236,11 @@ static void handle_arg_help(const char *arg)
 
 static void handle_arg_log(const char *arg)
 {
-int mask;
-
-mask = qemu_str_to_log_mask(arg);
-if (!mask) {
+last_log_mask = qemu_str_to_log_mask(arg);
+if (!last_log_mask) {
 qemu_print_log_usage(stdout);
 exit(EXIT_FAILURE);
 }
-qemu_log_needs_buffers();
-qemu_set_log(mask);
 }
 
 static void handle_arg_dfilter(const char *arg)
@@ -375,7 +384,7 @@ static void handle_arg_singlestep(const char *arg)
 
 static void handle_arg_strace(const char *arg)
 {
-do_strace = 1;
+enable_strace = true;
 }
 
 static void handle_arg_version(const char *arg)
@@ -629,6 +638,7 @@ int main(int argc, char **argv, char **envp)
 int i;
 int ret;
 int execfd;
+int log_mask;
 unsigned long max_reserved_va;
 
 error_init(argv[0]);
@@ -661,6 +671,12 @@ int main(int argc, char **argv, char **envp)
 
 optind = parse_args(argc, argv);
 
+log_mask = last_log_mask | (enable_strace ? LOG_STRACE : 0);
+if (log_mask) {
+qemu_log_needs_buffers();
+qemu_set_log(log_mask);
+}
+
 if (!trace_init_backends()) {
 exit(1);
 }
diff --git a/linux-user/qemu.h b/linux-user/qemu.h
index 560a68090e..2421dc7afd 100644
--- a/linux-user/qemu.h
+++ b/linux-user/qemu.h
@@ -386,7 +386,6 @@ void print_syscall_ret(int num, abi_long arg1);
  * --- SIGSEGV {si_signo=SIGSEGV, si_code=SI_KERNEL, si_addr=0} ---
  */
 void print_taken_signal(int target_signum, const target_siginfo_t *tinfo);
-extern int do_strace;
 
 /* signal.c */
 void process_pending_signals(CPUArchState *cpu_env);
diff --git a/linux-user/signal.c b/linux-user/signal.c
index 5ca6d62b15..818d867b7b 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -864,7 +864,7 @@ static void handle_pending_signal(CPUArchState *cpu_env, 
int sig,
 handler = sa->_sa_handler;
 }
 
-if (do_strace) {
+if (unlikely(qemu_loglevel_mask(LOG_STRACE))) {
 print_taken_signal(sig, &k->info);
 }
 
diff --git a/linux-user/strace.c b/linux-user/strace.c
index 3d4d684450..4f7130b2ff 100644
--- a/linux-user/strace.c
+++ b/linux-user/strace.c
@@ -12,8 +12,6 @@
 #include 
 #include "qemu.h"
 
-int do_strace=0;
-
 struct syscallname {
 int nr;
 const char *name;
@@ -80,7 +78,7 @@ print_ipc_cmd(int cmd)
 {
 #define output_cmd(val) \
 if( cmd == val ) { \
-gemu_log(#val); \
+qemu_log(#val); \
 return; \
 }
 
@@ -120,7 +118,7 @@ if( cmd == val ) { \
 output_cmd( IPC_RMID );
 
 /* Some value we don't recognize */
-gemu_log("%d",cmd);
+qemu_log("%d", cmd);
 }
 
 static void
@@ -151,7 +149,7 @@ print_signal(abi_ulong arg, int last)
 print_raw_param("%ld", arg, last);
 return;
 }
-gemu_log("%s%s", signal_name, get_c

[PATCH v3 1/4] linux-user: Use `qemu_log' for non-strace logging

2020-02-03 Thread Josh Kunz

Since most calls to `gemu_log` are actually logging unimplemented features,
this change replaces most non-strace calls to `gemu_log` with calls to
`qemu_log_mask(LOG_UNIMP, ...)`.  This allows the user to easily log to
a file, and to mask out these log messages if they desire.

Note: This change is slightly backwards incompatible, since now these
"unimplemented" log messages will not be logged by default.

Signed-off-by: Josh Kunz 
---
 linux-user/arm/cpu_loop.c |  5 ++--
 linux-user/fd-trans.c | 55 +--
 linux-user/syscall.c  | 35 -
 linux-user/vm86.c |  3 ++-
 4 files changed, 62 insertions(+), 36 deletions(-)

diff --git a/linux-user/arm/cpu_loop.c b/linux-user/arm/cpu_loop.c
index 1fae90c6df..cf618daa1c 100644
--- a/linux-user/arm/cpu_loop.c
+++ b/linux-user/arm/cpu_loop.c
@@ -349,8 +349,9 @@ void cpu_loop(CPUARMState *env)
 env->regs[0] = cpu_get_tls(env);
 break;
 default:
-gemu_log("qemu: Unsupported ARM syscall: 0x%x\n",
- n);
+qemu_log_mask(LOG_UNIMP,
+  "qemu: Unsupported ARM syscall: 
0x%x\n",
+  n);
 env->regs[0] = -TARGET_ENOSYS;
 break;
 }
diff --git a/linux-user/fd-trans.c b/linux-user/fd-trans.c
index 9b92386abf..c0687c52e6 100644
--- a/linux-user/fd-trans.c
+++ b/linux-user/fd-trans.c
@@ -514,7 +514,8 @@ static abi_long host_to_target_data_bridge_nlattr(struct 
nlattr *nlattr,
 u32[1] = tswap32(u32[1]); /* optmask */
 break;
 default:
-gemu_log("Unknown QEMU_IFLA_BR type %d\n", nlattr->nla_type);
+qemu_log_mask(LOG_UNIMP, "Unknown QEMU_IFLA_BR type %d\n",
+  nlattr->nla_type);
 break;
 }
 return 0;
@@ -577,7 +578,8 @@ static abi_long 
host_to_target_slave_data_bridge_nlattr(struct nlattr *nlattr,
 case QEMU_IFLA_BRPORT_BRIDGE_ID:
 break;
 default:
-gemu_log("Unknown QEMU_IFLA_BRPORT type %d\n", nlattr->nla_type);
+qemu_log_mask(LOG_UNIMP, "Unknown QEMU_IFLA_BRPORT type %d\n",
+  nlattr->nla_type);
 break;
 }
 return 0;
@@ -605,7 +607,8 @@ static abi_long host_to_target_data_tun_nlattr(struct 
nlattr *nlattr,
 *u32 = tswap32(*u32);
 break;
 default:
-gemu_log("Unknown QEMU_IFLA_TUN type %d\n", nlattr->nla_type);
+qemu_log_mask(LOG_UNIMP, "Unknown QEMU_IFLA_TUN type %d\n",
+  nlattr->nla_type);
 break;
 }
 return 0;
@@ -652,7 +655,8 @@ static abi_long host_to_target_data_linkinfo_nlattr(struct 
nlattr *nlattr,
   NULL,
 
host_to_target_data_tun_nlattr);
 } else {
-gemu_log("Unknown QEMU_IFLA_INFO_KIND %s\n", li_context->name);
+qemu_log_mask(LOG_UNIMP, "Unknown QEMU_IFLA_INFO_KIND %s\n",
+  li_context->name);
 }
 break;
 case QEMU_IFLA_INFO_SLAVE_DATA:
@@ -663,12 +667,13 @@ static abi_long 
host_to_target_data_linkinfo_nlattr(struct nlattr *nlattr,
   NULL,

host_to_target_slave_data_bridge_nlattr);
 } else {
-gemu_log("Unknown QEMU_IFLA_INFO_SLAVE_KIND %s\n",
+qemu_log_mask(LOG_UNIMP, "Unknown QEMU_IFLA_INFO_SLAVE_KIND %s\n",
  li_context->slave_name);
 }
 break;
 default:
-gemu_log("Unknown host QEMU_IFLA_INFO type: %d\n", nlattr->nla_type);
+qemu_log_mask(LOG_UNIMP, "Unknown host QEMU_IFLA_INFO type: %d\n",
+  nlattr->nla_type);
 break;
 }
 
@@ -690,7 +695,8 @@ static abi_long host_to_target_data_inet_nlattr(struct 
nlattr *nlattr,
 }
 break;
 default:
-gemu_log("Unknown host AF_INET type: %d\n", nlattr->nla_type);
+qemu_log_mask(LOG_UNIMP, "Unknown host AF_INET type: %d\n",
+  nlattr->nla_type);
 }
 return 0;
 }
@@ -741,7 +747,8 @@ static abi_long host_to_target_data_inet6_nlattr(struct 
nlattr *nlattr,
 }
 break;
 default:
-gemu_log("Unknown host AF_INET6 type: %d\n", nlattr->nla_type);
+qemu_log_mask(LOG_UNIMP, "Unknown host AF_INET6 type: %d\n",
+  nlattr->nla_type);
 }
 return 0;
 }
@@ -759,7 +766,8 @@ static abi_long host_to_target_data_spec_nlattr(struct 
nlattr *nlattr,
   NULL,
  host_to_target_data_inet6_nlattr);
 default:
-gemu_log("Unknown host AF_SPEC type: %d\n", nlattr->

[PATCH v3 0/4] migration: Replace gemu_log with qemu_log

2020-02-03 Thread Josh Kunz

Summary of v2->v3 changes:
  * Removed assert for CMSG handling, replaced with LOG_UNIMP. Will
switch to assert in follow-up patch.
  * Fixed BSD-user build (dangling references to qemu_add_log), and
verified the user-mode build works.

Summary of v1->v2 changes:
  * Removed backwards-compatibility code for non-strace log statements.
  * Removed new qemu_log interface for adding or removing fields from
the log mask.
  * Removed LOG_USER and converted all uses (except one) to LOG_UNIMP.
* One gemu_log statement was converted to an assert.
  * Some style cleanup.

The linux-user and bsd-user trees both widely use a function called
`gemu_log` (notice the 'g') for miscellaneous and strace logging. This
function predates the newer `qemu_log` function, and has a few drawbacks
compared to `qemu_log`:

  1. Always logs to `stderr`, no logging redirection.
  2. "Miscellaneous" logging cannot be disabled, so it may mix with guest
 logging.
  3. Inconsistency with other parts of the QEMU codebase, and a
 confusing name.

The second issue is especially troubling because it can interfere with
programs that expect to communicate via stderr.

This change introduces one new logging mask to the `qemu_log` subsystem
to support its use for user-mode logging: the `LOG_STRACE` mask for
strace-specific logging. Further, it replaces all existing uses of
`gemu_log` with the appropriate `qemu_log_mask(LOG_{UNIMP,STRACE}, ...)`
based on the log message.

Backwards incompatibility:
  * Log messages for unimplemented user-mode features are no longer
logged by default. They have to be enabled by setting the LOG_UNIMP
mask.
  * Log messages for strace/unimplemented user-mode features may be
redirected based on `-D`, instead of always logging to stderr.
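
The two behaviour changes listed above (messages are emitted only when their mask bit is enabled, and output may go somewhere other than stderr) can be illustrated with a minimal mask-gated logger. This is a sketch under stated assumptions, not QEMU's implementation: every `demo_`/`DEMO_` name and the bit values are made up for the example.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical mask bits for this sketch; QEMU defines its own values. */
#define DEMO_LOG_UNIMP  (1 << 0)
#define DEMO_LOG_STRACE (1 << 1)

static FILE *demo_log_file;  /* NULL means "use stderr" */
static int demo_log_level;   /* bits enabled by the user */

/* Select which message classes are enabled and where they are written. */
static void demo_set_log(int mask, FILE *dest)
{
    demo_log_level = mask;
    demo_log_file = dest;
}

/* Print only when one of the requested bits is enabled; returns 1 if printed. */
static int demo_log_mask(int mask, const char *fmt, ...)
{
    va_list ap;

    if (!(demo_log_level & mask)) {
        return 0;
    }
    va_start(ap, fmt);
    vfprintf(demo_log_file ? demo_log_file : stderr, fmt, ap);
    va_end(ap);
    return 1;
}
```

With only `DEMO_LOG_STRACE` enabled, a call tagged `DEMO_LOG_UNIMP` prints nothing, which corresponds to the first incompatibility noted above.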

Tested:
  * Built with clang 9 and g++ 8.3
  * `make check` run with clang 9 build 
  * Verified:
* QEMU_STRACE/-strace still works for linux-user
  * `make vm-build-netbsd EXTRA_CONFIGURE_OPTS="--disable-system" \
 BUILD_TARGET="all"` passed.

Josh Kunz (4):
  linux-user: Use `qemu_log' for non-strace logging
  linux-user: Use `qemu_log' for strace
  linux-user: remove gemu_log from the linux-user tree
  bsd-user: Replace gemu_log with qemu_log

 bsd-user/main.c   |  29 ++-
 bsd-user/qemu.h   |   2 -
 bsd-user/strace.c |  32 ++-
 bsd-user/syscall.c|  31 ++-
 include/qemu/log.h|   2 +
 linux-user/arm/cpu_loop.c |   5 +-
 linux-user/fd-trans.c |  55 +++--
 linux-user/main.c |  39 ++--
 linux-user/qemu.h |   2 -
 linux-user/signal.c   |   2 +-
 linux-user/strace.c   | 479 +++---
 linux-user/syscall.c  |  48 ++--
 linux-user/vm86.c |   3 +-
 util/log.c|   2 +
 14 files changed, 387 insertions(+), 344 deletions(-)

-- 
2.25.0.341.g760bfbb309-goog

[PATCH v2] pl031: add finalize function to avoid memleaks

2020-02-03 Thread pannengyuan

From: Pan Nengyuan 

There is a memory leak when we call 'device_list_properties' with
typename = pl031. It's easy to reproduce as follows:

  virsh qemu-monitor-command vm1 --pretty '{"execute": "device-list-properties", "arguments": {"typename": "pl031"}}'

The memory leak stack:
  Direct leak of 48 byte(s) in 1 object(s) allocated from:
#0 0x7f6e0925a970 in __interceptor_calloc (/lib64/libasan.so.5+0xef970)
#1 0x7f6e06f4d49d in g_malloc0 (/lib64/libglib-2.0.so.0+0x5249d)
#2 0x564a0f7654ea in timer_new_full /mnt/sdb/qemu/include/qemu/timer.h:530
#3 0x564a0f76555d in timer_new /mnt/sdb/qemu/include/qemu/timer.h:551
#4 0x564a0f765589 in timer_new_ns /mnt/sdb/qemu/include/qemu/timer.h:569
#5 0x564a0f76747d in pl031_init /mnt/sdb/qemu/hw/rtc/pl031.c:198
#6 0x564a0fd4a19d in object_init_with_type /mnt/sdb/qemu/qom/object.c:360
#7 0x564a0fd4b166 in object_initialize_with_type 
/mnt/sdb/qemu/qom/object.c:467
#8 0x564a0fd4c8e6 in object_new_with_type /mnt/sdb/qemu/qom/object.c:636
#9 0x564a0fd4c98e in object_new /mnt/sdb/qemu/qom/object.c:646
#10 0x564a0fc69d43 in qmp_device_list_properties 
/mnt/sdb/qemu/qom/qom-qmp-cmds.c:204
#11 0x564a0ef18e64 in qdev_device_help /mnt/sdb/qemu/qdev-monitor.c:278

Reported-by: Euler Robot 
Signed-off-by: Pan Nengyuan 
---
Changes V1 to V2:
- Delay timer_new until realize instead of calling it in instance_init,
  since the pl031 can't be hotplugged (suggested by Peter Maydell).
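
The idea behind this change can be sketched with a toy object lifecycle. None of the names below are QEMU APIs; `ToyDevice`, `toy_init`, `toy_realize` and the allocation counter are hypothetical and exist only to show why allocating in the init step leaks when an object is created just to list its properties, while allocating in the realize step does not.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-ins; none of these names are QEMU APIs. */
static int live_allocations;

typedef struct ToyTimer {
    int armed;
} ToyTimer;

typedef struct ToyDevice {
    long tick_offset;
    ToyTimer *timer;
} ToyDevice;

static ToyTimer *toy_timer_new(void)
{
    ToyTimer *t = calloc(1, sizeof(*t));

    if (t) {
        live_allocations++;
    }
    return t;
}

static void toy_timer_free(ToyTimer *t)
{
    if (t) {
        live_allocations--;
        free(t);
    }
}

/* Init step: runs for every object, even ones created only for inspection. */
static void toy_init(ToyDevice *d)
{
    d->tick_offset = 0;
    d->timer = NULL;   /* allocate nothing here */
}

/* Realize step: runs only when the device is actually put to use. */
static int toy_realize(ToyDevice *d)
{
    d->timer = toy_timer_new();
    return d->timer ? 0 : -1;
}

/* Introspection path: init, look at the object, drop it; no realize. */
static void toy_list_properties(void)
{
    ToyDevice d;

    toy_init(&d);
    (void)d;
}
```

In this toy model the introspection path leaves the allocation counter at zero, which is the property the patch is after.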
---
 hw/rtc/pl031.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/hw/rtc/pl031.c b/hw/rtc/pl031.c
index ae47f09635..0b9253eb30 100644
--- a/hw/rtc/pl031.c
+++ b/hw/rtc/pl031.c
@@ -190,7 +190,11 @@ static void pl031_init(Object *obj)
 qemu_get_timedate(&tm, 0);
 s->tick_offset = mktimegm(&tm) -
 qemu_clock_get_ns(rtc_clock) / NANOSECONDS_PER_SECOND;
+}
 
+static void pl031_realize(DeviceState *dev, Error **errp)
+{
+PL031State *s = PL031(dev);
 s->timer = timer_new_ns(rtc_clock, pl031_interrupt, s);
 }
 
@@ -321,6 +325,7 @@ static void pl031_class_init(ObjectClass *klass, void *data)
 DeviceClass *dc = DEVICE_CLASS(klass);
 
 dc->vmsd = &vmstate_pl031;
+dc->realize = pl031_realize;
 device_class_set_props(dc, pl031_properties);
 }
 
-- 
2.21.0.windows.1

[PATCH v3 1/7] bios-tables-test: prepare to change ARM virt ACPI DSDT

2020-02-03 Thread Heyi Guo

We are going to change the ARM virt ACPI DSDT table, which will cause
make check to fail, so temporarily add the related golden masters to
the ignore list.

Signed-off-by: Heyi Guo 
Reviewed-by: Michael S. Tsirkin 

---
Cc: Peter Maydell 
Cc: "Michael S. Tsirkin" 
Cc: Igor Mammedov 
Cc: Shannon Zhao 
Cc: qemu-...@nongnu.org
Cc: qemu-devel@nongnu.org
---
 tests/qtest/bios-tables-test-allowed-diff.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tests/qtest/bios-tables-test-allowed-diff.h 
b/tests/qtest/bios-tables-test-allowed-diff.h
index dfb8523c8b..32a401ae35 100644
--- a/tests/qtest/bios-tables-test-allowed-diff.h
+++ b/tests/qtest/bios-tables-test-allowed-diff.h
@@ -1 +1,4 @@
 /* List of comma-separated changed AML files to ignore */
+"tests/data/acpi/virt/DSDT",
+"tests/data/acpi/virt/DSDT.memhp",
+"tests/data/acpi/virt/DSDT.numamem",
-- 
2.19.1

[PATCH v3 2/7] arm/virt/acpi: remove meaningless sub device "RP0" from PCI0

2020-02-03 Thread Heyi Guo

The sub device "RP0" under PCI0 in ACPI/DSDT does not contain any
method or property other than "_ADR", so it is safe to remove it.

Signed-off-by: Heyi Guo 
Acked-by: "Michael S. Tsirkin" 

---
Cc: Peter Maydell 
Cc: "Michael S. Tsirkin" 
Cc: Igor Mammedov 
Cc: Shannon Zhao 
Cc: qemu-...@nongnu.org
Cc: qemu-devel@nongnu.org
---
 hw/arm/virt-acpi-build.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index bd5f771e9b..9f4c7d1889 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -317,10 +317,6 @@ static void acpi_dsdt_add_pci(Aml *scope, const 
MemMapEntry *memmap,
 aml_append(method, aml_return(buf));
 aml_append(dev, method);
 
-Aml *dev_rp0 = aml_device("%s", "RP0");
-aml_append(dev_rp0, aml_name_decl("_ADR", aml_int(0)));
-aml_append(dev, dev_rp0);
-
 Aml *dev_res0 = aml_device("%s", "RES0");
 aml_append(dev_res0, aml_name_decl("_HID", aml_string("PNP0C02")));
 crs = aml_resource_template();
-- 
2.19.1

[PATCH v3 3/7] arm/virt/acpi: remove _ADR from devices identified by _HID

2020-02-03 Thread Heyi Guo

According to the ACPI spec, _ADR should be used for a device on a bus
that has a standard enumeration algorithm, but not for a device that is
on the system bus and must be enumerated by OSPM. It is also not
recommended for a single device to contain both _HID and _ADR.

See ACPI 6.3, section 6.1, top of page 343:

A device object must contain either an _HID object or an _ADR object,
but should not contain both.

(https://uefi.org/sites/default/files/resources/ACPI_6_3_May16.pdf)

Signed-off-by: Heyi Guo 
Acked-by: Igor Mammedov 
Acked-by: Michael S. Tsirkin 

---
Cc: Shannon Zhao 
Cc: Peter Maydell 
Cc: "Michael S. Tsirkin" 
Cc: Igor Mammedov 
Cc: qemu-...@nongnu.org
Cc: qemu-devel@nongnu.org
---
 hw/arm/virt-acpi-build.c | 8 
 1 file changed, 8 deletions(-)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 9f4c7d1889..be752c0ad8 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -78,11 +78,6 @@ static void acpi_dsdt_add_uart(Aml *scope, const MemMapEntry 
*uart_memmap,
  AML_EXCLUSIVE, &uart_irq, 1));
 aml_append(dev, aml_name_decl("_CRS", crs));
 
-/* The _ADR entry is used to link this device to the UART described
- * in the SPCR table, i.e. SPCR.base_address.address == _ADR.
- */
-aml_append(dev, aml_name_decl("_ADR", aml_int(uart_memmap->base)));
-
 aml_append(scope, dev);
 }
 
@@ -170,7 +165,6 @@ static void acpi_dsdt_add_pci(Aml *scope, const MemMapEntry 
*memmap,
 aml_append(dev, aml_name_decl("_CID", aml_string("PNP0A03")));
 aml_append(dev, aml_name_decl("_SEG", aml_int(0)));
 aml_append(dev, aml_name_decl("_BBN", aml_int(0)));
-aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
 aml_append(dev, aml_name_decl("_UID", aml_string("PCI0")));
 aml_append(dev, aml_name_decl("_STR", aml_unicode("PCIe 0 Device")));
 aml_append(dev, aml_name_decl("_CCA", aml_int(1)));
@@ -334,7 +328,6 @@ static void acpi_dsdt_add_gpio(Aml *scope, const 
MemMapEntry *gpio_memmap,
 {
 Aml *dev = aml_device("GPO0");
 aml_append(dev, aml_name_decl("_HID", aml_string("ARMH0061")));
-aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
 aml_append(dev, aml_name_decl("_UID", aml_int(0)));
 
 Aml *crs = aml_resource_template();
@@ -364,7 +357,6 @@ static void acpi_dsdt_add_power_button(Aml *scope)
 {
 Aml *dev = aml_device(ACPI_POWER_BUTTON_DEVICE);
 aml_append(dev, aml_name_decl("_HID", aml_string("PNP0C0C")));
-aml_append(dev, aml_name_decl("_ADR", aml_int(0)));
 aml_append(dev, aml_name_decl("_UID", aml_int(0)));
 aml_append(scope, dev);
 }
-- 
2.19.1

[PATCH v3 6/7] arm/acpi: simplify the description of PCI _CRS

2020-02-03 Thread Heyi Guo

The original code defines a named object for the resource template but
then returns the resource template object itself; the resulting output
looks like this:

Method (_CRS, 0, NotSerialized)  // _CRS: Current Resource Settings
{
Name (RBUF, ResourceTemplate ()
{
WordBusNumber (ResourceProducer, MinFixed, MaxFixed, PosDecode,
0x, // Granularity
0x, // Range Minimum
0x00FF, // Range Maximum
0x, // Translation Offset
0x0100, // Length
,, )
..
})
Return (ResourceTemplate ()
{
WordBusNumber (ResourceProducer, MinFixed, MaxFixed, PosDecode,
0x, // Granularity
0x, // Range Minimum
0x00FF, // Range Maximum
0x, // Translation Offset
0x0100, // Length
,, )
..
})
}

So the named object "RBUF" is actually useless. The more natural way
is to return RBUF instead, or simply to drop the RBUF definition.

Choose the latter one to simplify the code.

Signed-off-by: Heyi Guo 
Reviewed-by: Michael S. Tsirkin 

---
Cc: Peter Maydell 
Cc: "Michael S. Tsirkin" 
Cc: Igor Mammedov 
Cc: Shannon Zhao 
Cc: qemu-...@nongnu.org
Cc: qemu-devel@nongnu.org
---
 hw/arm/virt-acpi-build.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index f3e340b172..fb4b166f82 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -236,7 +236,6 @@ static void acpi_dsdt_add_pci(Aml *scope, const MemMapEntry 
*memmap,
  size_mmio_high));
 }
 
-aml_append(method, aml_name_decl("RBUF", rbuf));
 aml_append(method, aml_return(rbuf));
 aml_append(dev, method);
 
-- 
2.19.1

[PATCH v3 4/7] arm/acpi: fix PCI _PRT definition

2020-02-03 Thread Heyi Guo

The address field in each _PRT mapping package should be constructed
with the high word for device# and the low word for function#, so it is
wrong to use bus_no as the high word. The existing code adds a bunch of
useless entries with device #s above 31. Enumerate all possible slots
(i.e. PCI_SLOT_MAX) instead.
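
The table layout described above can be sketched outside of the AML builder. This is an illustration only: `PrtEntry`, `build_prt` and the `DEMO_` constants are hypothetical names, with 32 slots and 4 pins assumed to mirror PCI_SLOT_MAX and PCI_NUM_PINS. The low word 0xFFFF (meaning "any function" in a _PRT address) is filled in from the ACPI definition of _PRT, since the constant is truncated in the archived diff.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed local constants: 32 device slots per bus, 4 pins (INTA..INTD). */
#define DEMO_PCI_SLOT_MAX 32
#define DEMO_PCI_NUM_PINS 4

typedef struct PrtEntry {
    uint32_t address; /* high word: device number, low word: 0xFFFF */
    uint8_t pin;      /* 0..3 for INTA..INTD */
    uint8_t gsi;      /* index n of the GSIn link device */
} PrtEntry;

/*
 * Fill `table`, which must hold DEMO_PCI_SLOT_MAX * DEMO_PCI_NUM_PINS
 * entries, and return the number of entries written.
 */
static int build_prt(PrtEntry *table)
{
    int n = 0;

    for (int slot = 0; slot < DEMO_PCI_SLOT_MAX; slot++) {
        for (int pin = 0; pin < DEMO_PCI_NUM_PINS; pin++) {
            table[n].address = ((uint32_t)slot << 16) | 0xFFFF;
            table[n].pin = (uint8_t)pin;
            /* Rotate the pin-to-GSI mapping by slot number. */
            table[n].gsi = (uint8_t)((pin + slot) % DEMO_PCI_NUM_PINS);
            n++;
        }
    }
    return n;
}
```

Under these assumptions the table has 32 * 4 = 128 entries, regardless of how many buses the ECAM window covers.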

Signed-off-by: Heyi Guo 
Reviewed-by: Michael S. Tsirkin 

---
Cc: Peter Maydell 
Cc: "Michael S. Tsirkin" 
Cc: Igor Mammedov 
Cc: Shannon Zhao 
Cc: qemu-...@nongnu.org
Cc: qemu-devel@nongnu.org
---
 hw/arm/virt-acpi-build.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index be752c0ad8..5d157a9dd5 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -151,7 +151,7 @@ static void acpi_dsdt_add_pci(Aml *scope, const MemMapEntry 
*memmap,
 {
 int ecam_id = VIRT_ECAM_ID(highmem_ecam);
 Aml *method, *crs, *ifctx, *UUID, *ifctx1, *elsectx, *buf;
-int i, bus_no;
+int i, slot_no;
 hwaddr base_mmio = memmap[VIRT_PCIE_MMIO].base;
 hwaddr size_mmio = memmap[VIRT_PCIE_MMIO].size;
 hwaddr base_pio = memmap[VIRT_PCIE_PIO].base;
@@ -170,12 +170,12 @@ static void acpi_dsdt_add_pci(Aml *scope, const 
MemMapEntry *memmap,
 aml_append(dev, aml_name_decl("_CCA", aml_int(1)));
 
 /* Declare the PCI Routing Table. */
-Aml *rt_pkg = aml_varpackage(nr_pcie_buses * PCI_NUM_PINS);
-for (bus_no = 0; bus_no < nr_pcie_buses; bus_no++) {
+Aml *rt_pkg = aml_varpackage(PCI_SLOT_MAX * PCI_NUM_PINS);
+for (slot_no = 0; slot_no < PCI_SLOT_MAX; slot_no++) {
 for (i = 0; i < PCI_NUM_PINS; i++) {
-int gsi = (i + bus_no) % PCI_NUM_PINS;
+int gsi = (i + slot_no) % PCI_NUM_PINS;
 Aml *pkg = aml_package(4);
-aml_append(pkg, aml_int((bus_no << 16) | 0x));
+aml_append(pkg, aml_int((slot_no << 16) | 0x));
 aml_append(pkg, aml_int(i));
 aml_append(pkg, aml_name("GSI%d", gsi));
 aml_append(pkg, aml_int(0));
-- 
2.19.1

[PATCH v3 7/7] virt/acpi: update golden masters for DSDT update

2020-02-03 Thread Heyi Guo

Differences between disassembled ASL files:

@@ -5,13 +5,13 @@
  *
  * Disassembling to symbolic ASL+ operators
  *
- * Disassembly of DSDT, Thu Jan 23 16:00:04 2020
+ * Disassembly of DSDT.new, Thu Jan 23 16:47:12 2020
  *
  * Original Table Header:
  * Signature"DSDT"
- * Length   0x481E (18462)
+ * Length   0x14BB (5307)
  * Revision 0x02
- * Checksum 0x60
+ * Checksum 0xD1
  * OEM ID   "BOCHS "
  * OEM Table ID "BXPCDSDT"
  * OEM Revision 0x0001 (1)
@@ -43,7 +43,6 @@ DefinitionBlock ("", "DSDT", 2, "BOCHS ", "BXPCDSDT", 
0x0001)
 0x0021,
 }
 })
-Name (_ADR, 0x0900)  // _ADR: Address
 }

 Device (FLS0)
@@ -668,11 +667,10 @@ DefinitionBlock ("", "DSDT", 2, "BOCHS ", "BXPCDSDT", 
0x0001)
 Name (_CID, "PNP0A03" /* PCI Bus */)  // _CID: Compatible ID
 Name (_SEG, Zero)  // _SEG: PCI Segment
 Name (_BBN, Zero)  // _BBN: BIOS Bus Number
-Name (_ADR, Zero)  // _ADR: Address
 Name (_UID, "PCI0")  // _UID: Unique ID
 Name (_STR, Unicode ("PCIe 0 Device"))  // _STR: Description String
 Name (_CCA, One)  // _CCA: Cache Coherency Attribute
-Name (_PRT, Package (0x0400)  // _PRT: PCI Routing Table
+Name (_PRT, Package (0x80)  // _PRT: PCI Routing Table
 {
 Package (0x04)
 {
@@ -1696,7174 +1694,6 @@ DefinitionBlock ("", "DSDT", 2, "BOCHS ", "BXPCDSDT", 
0x0001)
 0x03,
 GSI2,
 Zero
-},
-
-Package (0x04)
-{
-0x0020,
-Zero,
-GSI0,
-Zero
-},
-
-*Omit the other (4 * (256 - 32) - 2) packages*
-
-Package (0x04)
-{
-0x00FF,
-0x03,
-GSI2,
-Zero
 }
 })
 Device (GSI0)
@@ -8892,7 +1722,7 @@ DefinitionBlock ("", "DSDT", 2, "BOCHS ", "BXPCDSDT", 
0x0001)
 Device (GSI1)
 {
 Name (_HID, "PNP0C0F" /* PCI Interrupt Link Device */)  // 
_HID: Hardware ID
-Name (_UID, Zero)  // _UID: Unique ID
+Name (_UID, One)  // _UID: Unique ID
 Name (_PRS, ResourceTemplate ()  // _PRS: Possible Resource 
Settings
 {
 Interrupt (ResourceConsumer, Level, ActiveHigh, Exclusive, 
,, )
@@ -8915,7 +1745,7 @@ DefinitionBlock ("", "DSDT", 2, "BOCHS ", "BXPCDSDT", 
0x0001)
 Device (GSI2)
 {
 Name (_HID, "PNP0C0F" /* PCI Interrupt Link Device */)  // 
_HID: Hardware ID
-Name (_UID, Zero)  // _UID: Unique ID
+Name (_UID, 0x02)  // _UID: Unique ID
 Name (_PRS, ResourceTemplate ()  // _PRS: Possible Resource 
Settings
 {
 Interrupt (ResourceConsumer, Level, ActiveHigh, Exclusive, 
,, )
@@ -8938,7 +1768,7 @@ DefinitionBlock ("", "DSDT", 2, "BOCHS ", "BXPCDSDT", 
0x0001)
 Device (GSI3)
 {
 Name (_HID, "PNP0C0F" /* PCI Interrupt Link Device */)  // 
_HID: Hardware ID
-Name (_UID, Zero)  // _UID: Unique ID
+Name (_UID, 0x03)  // _UID: Unique ID
 Name (_PRS, ResourceTemplate ()  // _PRS: Possible Resource 
Settings
 {
 Interrupt (ResourceConsumer, Level, ActiveHigh, Exclusive, 
,, )
@@ -8965,37 +1795,6 @@ DefinitionBlock ("", "DSDT", 2, "BOCHS ", "BXPCDSDT", 
0x0001)

 Method (_CRS, 0, NotSerialized)  // _CRS: Current Resource Settings
 {
-Name (RBUF, ResourceTemplate ()
-{
-WordBusNumber (ResourceProducer, MinFixed, MaxFixed, 
PosDecode,
-0x, // Granularity
-0x, // Range Minimum
-0x00FF, // Range Maximum
-0x, // Translation Offset
-0x0100, // Length
-,, )
-DWordMemory (ResourceProducer, PosDecode, MinFixed, 
MaxFixed, NonCacheable, ReadWrite,
-0x, // Granularity
-0x1000, // Range Minimum
-0x3EFE, // Range Maximum
-0x, // Translation Offset
-0x2EFF, // Length
-,, , AddressRangeMemory, TypeStatic)
-DWordIO (ResourceProducer, Min

[PATCH v3 5/7] arm/acpi: fix duplicated _UID of PCI interrupt link devices

2020-02-03 Thread Heyi Guo

Using a _UID of 0 for all PCI interrupt link devices violates the spec,
since devices sharing the same _HID need distinct _UIDs. Use the link
index as the _UID instead.
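
The constraint being fixed here can be expressed as a small check over (_HID, _UID) pairs. The sketch below is only illustrative; `DemoDev` and `has_duplicate_id` are hypothetical names and are not part of QEMU or of any ACPI tooling.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, minimal description of a device's identification objects. */
typedef struct DemoDev {
    const char *hid;
    unsigned uid;
} DemoDev;

/* Return true if two devices share both the same _HID and the same _UID. */
static bool has_duplicate_id(const DemoDev *devs, int n)
{
    for (int i = 0; i < n; i++) {
        for (int j = i + 1; j < n; j++) {
            if (strcmp(devs[i].hid, devs[j].hid) == 0 &&
                devs[i].uid == devs[j].uid) {
                return true;
            }
        }
    }
    return false;
}
```

Four link devices with _HID "PNP0C0F" and _UID 0 fail this check, while the same devices with _UIDs 0 to 3 pass it.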

Signed-off-by: Heyi Guo 
Reviewed-by: Michael S. Tsirkin 

---
Cc: Peter Maydell 
Cc: "Michael S. Tsirkin" 
Cc: Igor Mammedov 
Cc: Shannon Zhao 
Cc: qemu-...@nongnu.org
Cc: qemu-devel@nongnu.org
---
 hw/arm/virt-acpi-build.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 5d157a9dd5..f3e340b172 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -189,7 +189,7 @@ static void acpi_dsdt_add_pci(Aml *scope, const MemMapEntry 
*memmap,
 uint32_t irqs =  irq + i;
 Aml *dev_gsi = aml_device("GSI%d", i);
 aml_append(dev_gsi, aml_name_decl("_HID", aml_string("PNP0C0F")));
-aml_append(dev_gsi, aml_name_decl("_UID", aml_int(0)));
+aml_append(dev_gsi, aml_name_decl("_UID", aml_int(i)));
 crs = aml_resource_template();
 aml_append(crs,
aml_interrupt(AML_CONSUMER, AML_LEVEL, AML_ACTIVE_HIGH,
-- 
2.19.1

[PATCH v3 0/7] Some cleanup in arm/virt/acpi

2020-02-03 Thread Heyi Guo

Remove conflicting _ADR objects, and fix and refine the PCI device
definition in ACPI/DSDT.

History:

v2 -> v3:
- update commit message for patch 4/7.
- remove diff keywords in commit message of patch 7/7 to avoid failures
  when applying the patch.

v1 -> v2:
- follow the workflow in tests/qtest/bios-tables-test.c to post ACPI related
  patches.
- update commit messages for removing "RP0" and "_ADR".
- add 3 more cleanup patches.

Cc: Peter Maydell 
Cc: "Michael S. Tsirkin" 
Cc: Igor Mammedov 
Cc: Shannon Zhao 
Cc: qemu-...@nongnu.org
Cc: qemu-devel@nongnu.org

Heyi Guo (7):
  bios-tables-test: prepare to change ARM virt ACPI DSDT
  arm/virt/acpi: remove meaningless sub device "RP0" from PCI0
  arm/virt/acpi: remove _ADR from devices identified by _HID
  arm/acpi: fix PCI _PRT definition
  arm/acpi: fix duplicated _UID of PCI interrupt link devices
  arm/acpi: simplify the description of PCI _CRS
  virt/acpi: update golden masters for DSDT update

 hw/arm/virt-acpi-build.c  |  25 ++---
 tests/data/acpi/virt/DSDT | Bin 18462 -> 5307 bytes
 tests/data/acpi/virt/DSDT.memhp   | Bin 19799 -> 6644 bytes
 tests/data/acpi/virt/DSDT.numamem | Bin 18462 -> 5307 bytes
 4 files changed, 6 insertions(+), 19 deletions(-)

-- 
2.19.1

Re: [PATCH v2 0/7] Some cleanup in arm/virt/acpi

2020-02-03 Thread Heyi Guo




On 2020/2/3 22:03, Peter Maydell wrote:

On Mon, 3 Feb 2020 at 13:33, Heyi Guo  wrote:


On 2020/2/3 14:43, Michael S. Tsirkin wrote:

On Mon, Feb 03, 2020 at 08:14:58AM +0800, Heyi Guo wrote:

Remove conflict _ADR objects, and fix and refine PCI device definition in
ACPI/DSDT.

Cc: Peter Maydell 
Cc: "Michael S. Tsirkin" 
Cc: Igor Mammedov 
Cc: Shannon Zhao 
Cc: qemu-...@nongnu.org
Cc: qemu-devel@nongnu.org

Series

Reviewed-by: Michael S. Tsirkin 

merge through ARM tree pls.

Thanks, Michael :)


Hi Peter,

Do I need to send v3 to update the commit message of patch 4/7 as
Michael suggested?

This patchset seems to be corrupt somehow:

e104462:bionic:qemu$ patches apply -s
id:20200203001505.52573-1-guoh...@huawei.com
Applying: bios-tables-test: prepare to change ARM virt ACPI DSDT
Applying: arm/virt/acpi: remove meaningless sub device "RP0" from PCI0
Applying: arm/virt/acpi: remove _ADR from devices identified by _HID
Applying: arm/acpi: fix PCI _PRT definition
Applying: arm/acpi: fix duplicated _UID of PCI interrupt link devices
Applying: arm/acpi: simplify the description of PCI _CRS
Applying: virt/acpi: update golden masters for DSDT update
error: corrupt patch at line 68
error: could not build fake ancestor
hint: Use 'git am --show-current-patch' to see the failed patch
Patch failed at 0007 virt/acpi: update golden masters for DSDT update
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

Patchew didn't like it either:
https://patchew.org/QEMU/20200203001505.52573-1-guoh...@huawei.com/

I think the problem here is the quoting of the diff in
the commit message of patch 7: git am and friends think
that is part of the actual patch body and get confused.
You might be able to avoid that by not putting the
   diff --git a/DSDT.dsl.orig b/DSDT.dsl
   index ed3e5f0fa9..10cf70c886 100644
   --- a/DSDT.dsl.orig
   +++ b/DSDT.dsl

part in the commit message, but I haven't tested that.

So resending a v4 would probably be a good idea anyway.


Sorry for the trouble. I should have tried it locally before sending the patch.

Yes, removing the above 4 lines makes git am succeed.

I'll send a v3.

Thanks,

Heyi



thanks
-- PMM


Re: [PATCH] pl031: add finalize function to avoid memleaks

2020-02-03 Thread Pan Nengyuan




On 2/3/2020 5:58 PM, Peter Maydell wrote:
> On Mon, 3 Feb 2020 at 07:47,  wrote:
>>
>> From: Pan Nengyuan 
>>
>> There is a memory leak when we call 'device_list_properties' with
>> typename = pl031. It's easy to reproduce as follows:
>>
>>   virsh qemu-monitor-command vm1 --pretty '{"execute": "device-list-properties", "arguments": {"typename": "pl031"}}'
>>
>> The memory leak stack:
>>   Direct leak of 48 byte(s) in 1 object(s) allocated from:
>> #0 0x7f6e0925a970 in __interceptor_calloc (/lib64/libasan.so.5+0xef970)
>> #1 0x7f6e06f4d49d in g_malloc0 (/lib64/libglib-2.0.so.0+0x5249d)
>> #2 0x564a0f7654ea in timer_new_full /mnt/sdb/qemu/include/qemu/timer.h:530
>> #3 0x564a0f76555d in timer_new /mnt/sdb/qemu/include/qemu/timer.h:551
>> #4 0x564a0f765589 in timer_new_ns /mnt/sdb/qemu/include/qemu/timer.h:569
>> #5 0x564a0f76747d in pl031_init /mnt/sdb/qemu/hw/rtc/pl031.c:198
>> #6 0x564a0fd4a19d in object_init_with_type /mnt/sdb/qemu/qom/object.c:360
>> #7 0x564a0fd4b166 in object_initialize_with_type /mnt/sdb/qemu/qom/object.c:467
>> #8 0x564a0fd4c8e6 in object_new_with_type /mnt/sdb/qemu/qom/object.c:636
>> #9 0x564a0fd4c98e in object_new /mnt/sdb/qemu/qom/object.c:646
>> #10 0x564a0fc69d43 in qmp_device_list_properties /mnt/sdb/qemu/qom/qom-qmp-cmds.c:204
>> #11 0x564a0ef18e64 in qdev_device_help /mnt/sdb/qemu/qdev-monitor.c:278
>>
>> Reported-by: Euler Robot 
>> Signed-off-by: Pan Nengyuan 
>> ---
>>  hw/rtc/pl031.c | 10 ++
>>  1 file changed, 10 insertions(+)
>>
>> diff --git a/hw/rtc/pl031.c b/hw/rtc/pl031.c
>> index ae47f09635..50664ca000 100644
>> --- a/hw/rtc/pl031.c
>> +++ b/hw/rtc/pl031.c
>> @@ -194,6 +194,15 @@ static void pl031_init(Object *obj)
>>  s->timer = timer_new_ns(rtc_clock, pl031_interrupt, s);
>>  }
>>
>> +static void pl031_finalize(Object *obj)
>> +{
>> +PL031State *s = PL031(obj);
>> +if (s->timer) {
>> +timer_del(s->timer);
>> +timer_free(s->timer);
>> +}
>> +}
>> +
>>  static int pl031_pre_save(void *opaque)
>>  {
>>  PL031State *s = opaque;
>> @@ -329,6 +338,7 @@ static const TypeInfo pl031_info = {
>>  .parent= TYPE_SYS_BUS_DEVICE,
>>  .instance_size = sizeof(PL031State),
>>  .instance_init = pl031_init,
>> +.instance_finalize = pl031_finalize,
>>  .class_init= pl031_class_init,
>>  };
> 
> The more usual way to fix this I think is to delay
> the timer_new until realize rather than putting it
> into instance_init, since the pl031 can't be
> hotplugged.

Hmm, you are right, I will change it in the next version.

Thanks.

> 
> thanks
> -- PMM
> .
>
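
For illustration only, the suggestion above (create the timer in realize
instead of instance_init) can be modelled with a small standalone sketch.
This is not QEMU code: mock_timer_new()/mock_timer_free(), MockDevice and
the live_timers counter are hypothetical stand-ins for timer_new_ns(),
timer_free() and the QOM object. The point it shows is that an object
which is created only for introspection and never realized cannot strand
a timer if the allocation happens in realize.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for timer_new_ns()/timer_free(); they only count. */
static int live_timers;

typedef struct MockTimer {
    int placeholder;
} MockTimer;

static MockTimer *mock_timer_new(void)
{
    MockTimer *t = calloc(1, sizeof(*t));

    assert(t != NULL);
    live_timers++;
    return t;
}

static void mock_timer_free(MockTimer *t)
{
    if (t) {
        free(t);
        live_timers--;
    }
}

typedef struct MockDevice {
    MockTimer *timer;
} MockDevice;

/* Runs for every object creation, introspection included. */
static void mock_instance_init(MockDevice *d, int alloc_in_init)
{
    d->timer = alloc_in_init ? mock_timer_new() : NULL;
}

/* Runs only when the device is really going to be used. */
static void mock_realize(MockDevice *d)
{
    if (!d->timer) {
        d->timer = mock_timer_new();
    }
}

/* Timers left behind by an object that is created and dropped unrealized. */
int leaked_by_introspection(int alloc_in_init)
{
    MockDevice d;
    int before = live_timers;
    int leaked;

    mock_instance_init(&d, alloc_in_init);
    leaked = live_timers - before;   /* nothing else would free d.timer */
    mock_timer_free(d.timer);        /* keep the sketch itself leak-free */
    return leaked;
}

/* Timers owned by a device that went through init and realize. */
int timers_after_realize(int alloc_in_init)
{
    MockDevice d;
    int before = live_timers;
    int owned;

    mock_instance_init(&d, alloc_in_init);
    mock_realize(&d);
    owned = live_timers - before;
    mock_timer_free(d.timer);
    return owned;
}
```

In this model, allocating in init strands one timer on the
introspection-only path, allocating in realize strands none, and a
realized device owns exactly one timer either way.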

Re: [PATCH 0/4] linux-user: fix use of SIGRTMIN

2020-02-03 Thread Josh Kunz

On Sat, Feb 1, 2020 at 4:27 AM Laurent Vivier  wrote:
> This has been tested with Go (golang 1.10.1 linux/arm64, bionic) on x86_64
> fedora 31. We can avoid the failure in this case allowing the unsupported
> signals when we don't provide the "act" parameters to sigaction, only the
> "oldact" one. I have also run the LTP suite with several target and debian
> based distros.

This breaks with go1.13+ (I also verified at hash 753d56d364)[1]. I
tested using an aarch64 guest on an x86 system, but this should
manifest on any architecture when the guest/host have the same number
of signals (and glibc reserves some host signals). From the traceback,
you can see it dies in `initsig` which is called at startup. Any Go
program should fail.

Since go does not use a libc, it assumes that all signals from
[1.._NSIG) are available[2], and will panic if it cannot do an
rt_sigaction for all of them. Go already has some special handling for
QEMU where it silently discards failing rt_sigaction calls to signals
32, 33, and 64 [3]. Since this patch reserves an extra signal for
__SIGRTMIN+1 as well, it blocks out guest signal 63 and Go fails to
initialize.

While I personally support this patch series (current handling of
guest glibc signals is broken), it *will* break Go binaries. I don't
know of a way to avoid this while supporting guest __SIGRTMIN+1,
without either doing true signal multiplexing, or patching Go.

[1] https://gist.github.com/joshkunz/b6c80724072cc1dce79a6253d40b016f
[2] 
https://github.com/golang/go/blob/67f0f83216930e053441500e2b28c3fa2b667581/src/runtime/signal_unix.go#L123
[3] https://github.com/golang/go/blob/master/src/runtime/os_linux.go#L466-L473
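
To make the arithmetic concrete, here is a standalone sketch of the
table construction from patch 4/4. The values are assumptions, not taken
from a build: host SIGRTMIN = 34, host SIGRTMAX = 64, _NSIG = 65,
TARGET_SIGRTMIN = 32 and TARGET_NSIG = 64. The HOST_*/GUEST_* names and
count_unmapped_guest_signals() are made up for illustration, and the
static entries for the non-RT signals are left out since they do not
touch the RT range.

```c
#include <assert.h>
#include <stdint.h>

enum {
    HOST_NSIG      = 65,  /* assumed _NSIG (SIGRTMAX + 1) */
    HOST_SIGRTMIN  = 34,  /* assumed: glibc keeps 32 and 33 for itself */
    HOST_SIGRTMAX  = 64,  /* assumed */
    GUEST_SIGRTMIN = 32,  /* assumed TARGET_SIGRTMIN */
    GUEST_NSIG     = 64   /* assumed TARGET_NSIG (== TARGET_SIGRTMAX) */
};

static uint8_t host_to_guest[HOST_NSIG];
static uint8_t guest_to_host[GUEST_NSIG + 1];

/*
 * Rebuild both tables the way the patch does and return how many guest
 * signals end up without a host signal. *first_unmapped receives the
 * lowest such guest signal, or 0 if there is none.
 */
int count_unmapped_guest_signals(int *first_unmapped)
{
    int i, j, missing = 0;

    for (i = 0; i < HOST_NSIG; i++) {
        host_to_guest[i] = 0;
    }

    /* Map host [SIGRTMIN..SIGRTMAX] onto guest [TARGET_SIGRTMIN..]. */
    for (i = HOST_SIGRTMIN; i <= HOST_SIGRTMAX; i++) {
        j = i - HOST_SIGRTMIN + GUEST_SIGRTMIN;
        if (j <= GUEST_NSIG) {
            host_to_guest[i] = (uint8_t)j;
        }
    }

    /* Poison every guest entry, then fill in what the host can back. */
    for (i = 1; i <= GUEST_NSIG; i++) {
        guest_to_host[i] = HOST_NSIG;
    }
    for (i = 1; i < HOST_NSIG; i++) {
        if (host_to_guest[i] == 0) {
            host_to_guest[i] = (uint8_t)i;
        }
        j = host_to_guest[i];
        if (j <= GUEST_NSIG) {
            guest_to_host[j] = (uint8_t)i;
        }
    }

    *first_unmapped = 0;
    for (i = 1; i <= GUEST_NSIG; i++) {
        if (guest_to_host[i] == HOST_NSIG) {
            if (missing == 0) {
                *first_unmapped = i;
            }
            missing++;
        }
    }
    return missing;
}
```

Under those assumptions two guest signals stay poisoned, 63 and 64. Go
already tolerates a failing rt_sigaction on 64 per [3], but not on 63.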

>
> Laurent Vivier (4):
>   linux-user: add missing TARGET_SIGRTMIN for hppa
>   linux-user: cleanup signal.c
>   linux-user: fix TARGET_NSIG and _NSIG uses
>   linux-user: fix use of SIGRTMIN
>
>  linux-user/hppa/target_signal.h |   1 +
>  linux-user/signal.c | 110 +++-
>  linux-user/trace-events |   3 +
>  3 files changed, 85 insertions(+), 29 deletions(-)
>
> --
> 2.24.1
>

Re: [PATCH] configure: Fix typo of the have_afalg variable

2020-02-03 Thread Longpeng (Mike)

On 2020/2/4 0:00, Thomas Huth wrote:
> The variable is called 'have_afalg' and not 'hava_afalg'.
> 
> Fixes: f0d92b56d88 ('introduce some common functions for af_alg backend')
> Signed-off-by: Thomas Huth 
> ---
>  configure | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/configure b/configure
> index 5095f01728..115dc38085 100755
> --- a/configure
> +++ b/configure
> @@ -5843,7 +5843,7 @@ fi
>  
>  ##
>  # check for usable AF_ALG environment
> -hava_afalg=no
> +have_afalg=no
>  cat > $TMPC << EOF
>  #include 
>  #include 
> 
Reviewed-by: Longpeng(Mike) 

-- 
Regards,
Longpeng(Mike)

Re: VW ELF loader

2020-02-03 Thread Paolo Bonzini

On Tue, 4 Feb 2020, 00:20 Alexey Kardashevskiy wrote:

>
>
> Speaking seriously, what would I put into the guest?
>

Only things that would be considered drivers. Ignore the partitions issue
for now so that you can just pass the device tree services to QEMU with
hypercalls.

> Netboot's dhcp/tftp/ip/ipv6 client? It is going to be another SLOF,
> smaller but adhoc with only a couple of people knowing it.
>

You can generalize and reuse the s390 code. All you have to write is the
PCI scan and virtio-pci setup.

Paolo

Re: VW ELF loader

2020-02-03 Thread Alexey Kardashevskiy




On 04/02/2020 09:56, Paolo Bonzini wrote:
> 
> 
> On Mon, 3 Feb 2020, 23:36 Alexey Kardashevskiy wrote:
> 
> 
> > What partition formats would have to be supported?
> 
> MBR, GPT, is there anything else? "Support" is limited to converting a
> number after the comma to a [start, size] pair. I am not going for file
> systems.
> 
> > But honestly I'm
> > more worried about the networking part.
> 
> Fair enough.
> 
> > Yes, SLOF is big and slow.  petitboot is not petit at all either, and
> > has the disadvantage that you have to find a way to run GRUB
> afterwards.
> >  But would a similarly minimal OF implementation (no network,
> almost no
> > interpret so no Forth, device tree built entirely in the host, etc.)
> > be just as big and slow?
> 
> I doubt. We will be getting rid of unnecessary drivers, bus scanning
> code (SCSI, PCI), device tree synchronization.
> 
> 
> What I mean is, if you write a firmware that exposes a minimal OF device
> interface but runs it in the guest, and does a hypercall for everything
> else, would it be as big and slow as SLOF?

I just did almost that - 20 bytes, fast as a bullet, runs in the guest ;)

Speaking seriously, what would I put into the guest?

The device tree? This is the core problem of the current design - we
need to keep it in sync with QEMU.

Netboot's dhcp/tftp/ip/ipv6 client? It is going to be another SLOF,
smaller but adhoc with only a couple of people knowing it. Other
packages - disk-label, deblocker - I do not see any user for these
except SLOF itself.


-- 
Alexey

RE: [PATCH 4/4] linux-user: fix use of SIGRTMIN

2020-02-03 Thread Taylor Simpson




> -----Original Message-----
> From: Laurent Vivier 
> Sent: Saturday, February 1, 2020 6:28 AM
> To: qemu-devel@nongnu.org
> Cc: Josh Kunz; milos.stojano...@rt-rk.com; Matus Kysel;
> Aleksandar Markovic; Marlies Ruck; Laurent Vivier;
> Peter Maydell; Taylor Simpson; Riku Voipio
> Subject: [PATCH 4/4] linux-user: fix use of SIGRTMIN
>
> Some RT signals can be in use by glibc,
> it's why SIGRTMIN (34) is generally greater than __SIGRTMIN (32).
>
> So SIGRTMIN cannot be mapped to TARGET_SIGRTMIN.
>
> Instead of swapping only SIGRTMIN and SIGRTMAX, map all the range
> [TARGET_SIGRTMIN ... TARGET_SIGRTMAX - X] to
>   [__SIGRTMIN + X ... SIGRTMAX ]
> (SIGRTMIN is __SIGRTMIN + X).
>
> Signed-off-by: Laurent Vivier 
> ---
>  linux-user/signal.c | 34 --
>  linux-user/trace-events |  3 +++
>  2 files changed, 31 insertions(+), 6 deletions(-)
>
> diff --git a/linux-user/signal.c b/linux-user/signal.c
> index 3491f0a7ecb1..c4abacde49a0 100644
> --- a/linux-user/signal.c
> +++ b/linux-user/signal.c
> @@ -501,15 +501,20 @@ static void signal_table_init(void)
>  int i, j;
>
>  /*
> - * Nasty hack: Reverse SIGRTMIN and SIGRTMAX to avoid overlap with
> - * host libpthread signals.  This assumes no one actually uses SIGRTMAX :-/
> - * To fix this properly we need to do manual signal delivery multiplexed
> - * over a single host signal.
> + * some RT signals can be in use by glibc,
> + * it's why SIGRTMIN (34) is generally greater than __SIGRTMIN (32)
>   */
> -host_to_target_signal_table[__SIGRTMIN] = __SIGRTMAX;
> -host_to_target_signal_table[__SIGRTMAX] = __SIGRTMIN;
> +for (i = SIGRTMIN; i <= SIGRTMAX; i++) {
> +j = i - SIGRTMIN + TARGET_SIGRTMIN;
> +if (j <= TARGET_NSIG) {
> +host_to_target_signal_table[i] = j;
> +}
> +}
>
>  /* generate signal conversion tables */
> +for (i = 1; i <= TARGET_NSIG; i++) {
> +target_to_host_signal_table[i] = _NSIG; /* poison */
> +}
>  for (i = 1; i < _NSIG; i++) {
>  if (host_to_target_signal_table[i] == 0) {
>  host_to_target_signal_table[i] = i;
> @@ -519,6 +524,15 @@ static void signal_table_init(void)
>  target_to_host_signal_table[j] = i;
>  }
>  }
> +
> +if (TRACE_SIGNAL_TABLE_INIT_BACKEND_DSTATE()) {
> +for (i = 1, j = 0; i <= TARGET_NSIG; i++) {
> +if (target_to_host_signal_table[i] == _NSIG) {
> +j++;
> +}
> +}
> +trace_signal_table_init(j);

It looks like j will have a count of the number of poison entries, but the 
message in trace_signal_table_init is "missing signal number".  Is that what 
you intend?

> +}
>  }
>
>  void signal_init(void)
> @@ -817,6 +831,8 @@ int do_sigaction(int sig, const struct target_sigaction *act,
>  int host_sig;
>  int ret = 0;
>
> +trace_signal_do_sigaction_guest(sig, TARGET_NSIG);

Shouldn't this be _NSIG, not TARGET_NSIG?

> +
>  if (sig < 1 || sig > TARGET_NSIG || sig == TARGET_SIGKILL || sig == TARGET_SIGSTOP) {
>  return -TARGET_EINVAL;
>  }
> @@ -847,6 +863,12 @@ int do_sigaction(int sig, const struct target_sigaction *act,
>
>  /* we update the host linux signal state */
>  host_sig = target_to_host_signal(sig);
> +trace_signal_do_sigaction_host(host_sig, TARGET_NSIG);
> +if (host_sig > SIGRTMAX) {
> +/* we don't have enough host signals to map all target signals */
> +qemu_log_mask(LOG_UNIMP, "Unsupported target signal #%d\n", sig);
> +return -TARGET_EINVAL;
> +}
>  if (host_sig != SIGSEGV && host_sig != SIGBUS) {
>  sigfillset(&act1.sa_mask);
>  act1.sa_flags = SA_SIGINFO;
> diff --git a/linux-user/trace-events b/linux-user/trace-events
> index f6de1b8befc0..eb4b7701c400 100644
> --- a/linux-user/trace-events
> +++ b/linux-user/trace-events
> @@ -1,6 +1,9 @@
>  # See docs/devel/tracing.txt for syntax documentation.
>
>  # signal.c
> +signal_table_init(int i) "missing signal number: %d"
> +signal_do_sigaction_guest(int sig, int max) "target signal %d (MAX %d)"
> +signal_do_sigaction_host(int sig, int max) "host signal %d (MAX %d)"
>  # */signal.c
>  user_setup_frame(void *env, uint64_t frame_addr) "env=%p frame_addr=0x%"PRIx64
>  user_setup_rt_frame(void *env, uint64_t frame_addr) "env=%p frame_addr=0x%"PRIx64
> --
> 2.24.1
>

RE: [PATCH 3/4] linux-user: fix TARGET_NSIG and _NSIG uses

2020-02-03 Thread Taylor Simpson




> -----Original Message-----
> From: Laurent Vivier 
> Sent: Saturday, February 1, 2020 6:28 AM
> To: qemu-devel@nongnu.org
> Cc: Josh Kunz; milos.stojano...@rt-rk.com; Matus Kysel;
> Aleksandar Markovic; Marlies Ruck; Laurent Vivier;
> Peter Maydell; Taylor Simpson; Riku Voipio
> Subject: [PATCH 3/4] linux-user: fix TARGET_NSIG and _NSIG uses
>
> Valid signal numbers are between 1 (SIGHUP) and SIGRTMAX.
>
> System includes define _NSIG to SIGRTMAX + 1, but QEMU (like kernel)
> defines TARGET_NSIG to TARGET_SIGRTMAX.
>
> Fix all the checks involving the signal range.
>
> Signed-off-by: Laurent Vivier 
> ---
>  linux-user/signal.c | 51 -
>  1 file changed, 37 insertions(+), 14 deletions(-)
>
> diff --git a/linux-user/signal.c b/linux-user/signal.c
> index f42a2e1a82a5..3491f0a7ecb1 100644
> --- a/linux-user/signal.c
> +++ b/linux-user/signal.c
> @@ -30,6 +30,15 @@ static struct target_sigaction sigact_table[TARGET_NSIG];
>  static void host_signal_handler(int host_signum, siginfo_t *info,
>  void *puc);
>
> +
> +/*
> + * System includes define _NSIG as SIGRTMAX + 1,
> + * but qemu (like the kernel) defines TARGET_NSIG as TARGET_SIGRTMAX
> + * and the first signal is SIGHUP defined as 1
> + * Signal number 0 is reserved for use as kill(pid, 0), to test whether
> + * a process exists without sending it a signal.
> + */
> +QEMU_BUILD_BUG_ON(__SIGRTMAX + 1 != _NSIG);
>  static uint8_t host_to_target_signal_table[_NSIG] = {
>  [SIGHUP] = TARGET_SIGHUP,
>  [SIGINT] = TARGET_SIGINT,
> @@ -67,19 +76,24 @@ static uint8_t host_to_target_signal_table[_NSIG] = {
>  [SIGSYS] = TARGET_SIGSYS,
>  /* next signals stay the same */
>  };
> -static uint8_t target_to_host_signal_table[_NSIG];
>
> +static uint8_t target_to_host_signal_table[TARGET_NSIG + 1];
> +
> +/* valid sig is between 1 and _NSIG - 1 */
>  int host_to_target_signal(int sig)
>  {
> -if (sig < 0 || sig >= _NSIG)
> +if (sig < 1 || sig >= _NSIG) {
>  return sig;
> +}
>  return host_to_target_signal_table[sig];
>  }
>
> +/* valid sig is between 1 and TARGET_NSIG */
>  int target_to_host_signal(int sig)
>  {
> -if (sig < 0 || sig >= _NSIG)
> +if (sig < 1 || sig > TARGET_NSIG) {
>  return sig;
> +}
>  return target_to_host_signal_table[sig];
>  }
>
> @@ -100,11 +114,15 @@ static inline int target_sigismember(const target_sigset_t *set, int signum)
>  void host_to_target_sigset_internal(target_sigset_t *d,
>  const sigset_t *s)
>  {
> -int i;
> +int i, j;
>  target_sigemptyset(d);
> -for (i = 1; i <= TARGET_NSIG; i++) {
> +for (i = 1; i < _NSIG; i++) {
> +j = host_to_target_signal(i);

More descriptive name - target_sig

> +if (j < 1 || j > TARGET_NSIG) {
> +continue;
> +}
>  if (sigismember(s, i)) {
> -target_sigaddset(d, host_to_target_signal(i));
> +target_sigaddset(d, j);
>  }
>  }
>  }
> @@ -122,11 +140,15 @@ void host_to_target_sigset(target_sigset_t *d, const sigset_t *s)
>  void target_to_host_sigset_internal(sigset_t *d,
>  const target_sigset_t *s)
>  {
> -int i;
> +int i, j;
>  sigemptyset(d);
>  for (i = 1; i <= TARGET_NSIG; i++) {
> +j = target_to_host_signal(i);

More descriptive name - host_sig

> +if (j < 1 || j >= _NSIG) {
> +continue;
> +}
>  if (target_sigismember(s, i)) {
> -sigaddset(d, target_to_host_signal(i));
> +sigaddset(d, j);
>  }
>  }
>  }
> @@ -488,13 +510,14 @@ static void signal_table_init(void)
>  host_to_target_signal_table[__SIGRTMAX] = __SIGRTMIN;
>
>  /* generate signal conversion tables */
> -for(i = 1; i < _NSIG; i++) {
> -if (host_to_target_signal_table[i] == 0)
> +for (i = 1; i < _NSIG; i++) {
> +if (host_to_target_signal_table[i] == 0) {
>  host_to_target_signal_table[i] = i;
> -}
> -for(i = 1; i < _NSIG; i++) {
> +}
>  j = host_to_target_signal_table[i];

More descriptive name - target_sig

> -target_to_host_signal_table[j] = i;
> +if (j <= TARGET_NSIG) {
> +target_to_host_signal_table[j] = i;
> +}
>  }
>  }
>
> @@ -517,7 +540,7 @@ void signal_init(void)
>  act.sa_sigaction = host_signal_handler;
>  for(i = 1; i <= TARGET_NSIG; i++) {
>  #ifdef TARGET_GPROF
> -if (i == SIGPROF) {
> +if (i == TARGET_SIGPROF) {
>  continue;
>  }
>  #endif
> --
> 2.24.1
>
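
As a side note on the off-by-one this patch addresses, the two range
checks can be written out as a tiny standalone sketch. The SKETCH_*
constants are assumed values (65 for _NSIG and 64 for TARGET_NSIG),
and host_sig_valid()/target_sig_valid() are hypothetical helper names
that do not exist in linux-user/signal.c.

```c
#include <assert.h>
#include <stdbool.h>

enum {
    SKETCH_HOST_NSIG   = 65, /* assumed _NSIG: SIGRTMAX (64) + 1 */
    SKETCH_TARGET_NSIG = 64  /* assumed TARGET_NSIG: TARGET_SIGRTMAX itself */
};

/* Same spirit as the QEMU_BUILD_BUG_ON() in the patch. */
static_assert(SKETCH_HOST_NSIG == SKETCH_TARGET_NSIG + 1,
              "host count is exclusive, target count is inclusive");

/* Host side: _NSIG is one past the last signal, so the bound is exclusive. */
bool host_sig_valid(int sig)
{
    return sig >= 1 && sig < SKETCH_HOST_NSIG;
}

/* Target side: TARGET_NSIG is the last signal, so the bound is inclusive. */
bool target_sig_valid(int sig)
{
    return sig >= 1 && sig <= SKETCH_TARGET_NSIG;
}
```

With these values signal 64 is valid on both sides, 0 and 65 on neither.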

RE: [PATCH 2/4] linux-user: cleanup signal.c

2020-02-03 Thread Taylor Simpson




> -----Original Message-----
> From: Laurent Vivier 
> Sent: Saturday, February 1, 2020 6:28 AM
> To: qemu-devel@nongnu.org
> Cc: Josh Kunz; milos.stojano...@rt-rk.com; Matus Kysel;
> Aleksandar Markovic; Marlies Ruck; Laurent Vivier;
> Peter Maydell; Taylor Simpson; Riku Voipio
> Subject: [PATCH 2/4] linux-user: cleanup signal.c
>
> No functionnal changes. Prepare the field for future fixes.


Spelling error: "functionnal" should be "functional".

>
> Remove memset(.., 0, ...) that is useless on a static array
>
> Signed-off-by: Laurent Vivier 
> ---
>  linux-user/signal.c | 37 ++---
>  1 file changed, 22 insertions(+), 15 deletions(-)
>
> diff --git a/linux-user/signal.c b/linux-user/signal.c
> index 5ca6d62b15d3..f42a2e1a82a5 100644
> --- a/linux-user/signal.c
> +++ b/linux-user/signal.c
> @@ -66,12 +66,6 @@ static uint8_t host_to_target_signal_table[_NSIG] = {
>  [SIGPWR] = TARGET_SIGPWR,
>  [SIGSYS] = TARGET_SIGSYS,
>  /* next signals stay the same */
> -/* Nasty hack: Reverse SIGRTMIN and SIGRTMAX to avoid overlap with
> -   host libpthread signals.  This assumes no one actually uses SIGRTMAX :-/
> -   To fix this properly we need to do manual signal delivery multiplexed
> -   over a single host signal.  */
> -[__SIGRTMIN] = __SIGRTMAX,
> -[__SIGRTMAX] = __SIGRTMIN,
>  };
>  static uint8_t target_to_host_signal_table[_NSIG];
>
> @@ -480,13 +474,18 @@ static int core_dump_signal(int sig)
>  }
>  }
>
> -void signal_init(void)
> +static void signal_table_init(void)
>  {
> -TaskState *ts = (TaskState *)thread_cpu->opaque;
> -struct sigaction act;
> -struct sigaction oact;
>  int i, j;
> -int host_sig;
> +
> +/*
> + * Nasty hack: Reverse SIGRTMIN and SIGRTMAX to avoid overlap with
> + * host libpthread signals.  This assumes no one actually uses SIGRTMAX :-/
> + * To fix this properly we need to do manual signal delivery multiplexed
> + * over a single host signal.
> + */
> +host_to_target_signal_table[__SIGRTMIN] = __SIGRTMAX;
> +host_to_target_signal_table[__SIGRTMAX] = __SIGRTMIN;
>
>  /* generate signal conversion tables */
>  for(i = 1; i < _NSIG; i++) {
> @@ -497,14 +496,22 @@ void signal_init(void)
>  j = host_to_target_signal_table[i];

Since you are cleaning up this code, let's give this a more descriptive name - 
target_sig would be consistent with host_sig used elsewhere.

>  target_to_host_signal_table[j] = i;
>  }
> +}
> +
> +void signal_init(void)
> +{
> +TaskState *ts = (TaskState *)thread_cpu->opaque;
> +struct sigaction act;
> +struct sigaction oact;
> +int i;
> +int host_sig;
> +
> +/* initialize signal conversion tables */
> +signal_table_init();
>
>  /* Set the signal mask from the host mask. */
>  sigprocmask(0, 0, &ts->signal_mask);
>
> -/* set all host signal handlers. ALL signals are blocked during
> -   the handlers to serialize them. */
> -memset(sigact_table, 0, sizeof(sigact_table));
> -
>  sigfillset(&act.sa_mask);
>  act.sa_flags = SA_SIGINFO;
>  act.sa_sigaction = host_signal_handler;
> --
> 2.24.1
>

Re: VW ELF loader

2020-02-03 Thread Paolo Bonzini

On Mon, 3 Feb 2020, 23:36 Alexey Kardashevskiy wrote:

>
> > What partition formats would have to be supported?
>
> MBR, GPT, is there anything else? "Support" is limited to converting a
> number after the comma to a [start, size] pair. I am not going for file
> systems.
>
> > But honestly I'm
> > more worried about the networking part.
>
> Fair enough.
>
> > Yes, SLOF is big and slow.  petitboot is not petit at all either, and
> > has the disadvantage that you have to find a way to run GRUB afterwards.
> >  But would a similarly minimal OF implementation (no network, almost no
> > interpret so no Forth, device tree built entirely in the host, etc.)
> > be just as big and slow?
>
> I doubt. We will be getting rid of unnecessary drivers, bus scanning
> code (SCSI, PCI), device tree synchronization.
>

What I mean is, if you write a firmware that exposes a minimal OF device
interface but runs it in the guest, and does a hypercall for everything
else, would it be as big and slow as SLOF?

Paolo

>
>
> --
> Alexey
>
>

RE: [PATCH 0/4] linux-user: fix use of SIGRTMIN

2020-02-03 Thread Taylor Simpson

FWIW, this removes the need for the target-specific code for Hexagon in 
signal.c.

Thanks,
Taylor

PS  Stay tuned for a Hexagon target patch series once this is merged.

> -----Original Message-----
> From: Laurent Vivier 
> Sent: Saturday, February 1, 2020 6:28 AM
> To: qemu-devel@nongnu.org
> Cc: Josh Kunz; milos.stojano...@rt-rk.com; Matus Kysel;
> Aleksandar Markovic; Marlies Ruck; Laurent Vivier;
> Peter Maydell; Taylor Simpson; Riku Voipio
> Subject: [PATCH 0/4] linux-user: fix use of SIGRTMIN
>
> This series fixes the problem of the first real-time signals already in use by
> the glibc that are not available for the target glibc.
>
> Instead of reverting the first and last real-time signals we rely on the value
> provided by the glibc (SIGRTMIN) to know the first available signal and we
> map all the signals from this value to SIGRTMAX on top of
> TARGET_SIGRTMIN. So the consequence is we have less available signals in
> the target (generally 2) but all seems fine as at least 30 signals are still
> available.
>
> This has been tested with Go (golang 1.10.1 linux/arm64, bionic) on x86_64
> fedora 31. We can avoid the failure in this case allowing the unsupported
> signals when we don't provide the "act" parameters to sigaction, only the
> "oldact" one. I have also run the LTP suite with several target and debian
> based distros.
>
> Laurent Vivier (4):
>   linux-user: add missing TARGET_SIGRTMIN for hppa
>   linux-user: cleanup signal.c
>   linux-user: fix TARGET_NSIG and _NSIG uses
>   linux-user: fix use of SIGRTMIN
>
>  linux-user/hppa/target_signal.h |   1 +
>  linux-user/signal.c | 110 +++-
>  linux-user/trace-events |   3 +
>  3 files changed, 85 insertions(+), 29 deletions(-)
>
> --
> 2.24.1
>

[PATCH 1/3] spapr: Don't use spapr_drc_needed() in CAS code

2020-02-03 Thread Greg Kurz

We currently don't support hotplug of devices between boot and CAS. If
this happens a CAS reboot is triggered. We detect this during CAS using
the spapr_drc_needed() function which is essentially a VMStateDescription
.needed callback. Even if the condition for CAS reboot happens to be the
same as for DRC migration, it looks wrong to use a migration related helper
for this.

Introduce a helper with more explicit semantics (ie. the device attached
to this DRC is ready or not) and use it in both CAS and DRC migration code.

This doesn't change any behaviour.

Signed-off-by: Greg Kurz 
---
 hw/ppc/spapr_drc.c |5 ++---
 hw/ppc/spapr_hcall.c   |   12 +---
 include/hw/ppc/spapr_drc.h |8 +++-
 3 files changed, 18 insertions(+), 7 deletions(-)

diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
index 17aeac38016d..d512ac6e1e7f 100644
--- a/hw/ppc/spapr_drc.c
+++ b/hw/ppc/spapr_drc.c
@@ -455,10 +455,9 @@ void spapr_drc_reset(SpaprDrc *drc)
 }
 }
 
-bool spapr_drc_needed(void *opaque)
+static bool spapr_drc_needed(void *opaque)
 {
 SpaprDrc *drc = (SpaprDrc *)opaque;
-SpaprDrcClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
 
 /* If no dev is plugged in there is no need to migrate the DRC state */
 if (!drc->dev) {
@@ -469,7 +468,7 @@ bool spapr_drc_needed(void *opaque)
  * We need to migrate the state if it's not equal to the expected
  * long-term state, which is the same as the coldplugged initial
  * state */
-return (drc->state != drck->ready_state);
+return !spapr_drc_device_ready(drc);
 }
 
 static const VMStateDescription vmstate_spapr_drc = {
diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index b8bb66b5c0d4..7a33d79bbae9 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -1642,18 +1642,24 @@ static uint32_t cas_check_pvr(SpaprMachineState *spapr, PowerPCCPU *cpu,
 
 static bool spapr_hotplugged_dev_before_cas(void)
 {
-Object *drc_container, *obj;
+Object *drc_container;
 ObjectProperty *prop;
 ObjectPropertyIterator iter;
 
 drc_container = container_get(object_get_root(), "/dr-connector");
 object_property_iter_init(&iter, drc_container);
 while ((prop = object_property_iter_next(&iter))) {
+SpaprDrc *drc;
+
 if (!strstart(prop->type, "link<", NULL)) {
 continue;
 }
-obj = object_property_get_link(drc_container, prop->name, NULL);
-if (spapr_drc_needed(obj)) {
+drc = SPAPR_DR_CONNECTOR(object_property_get_link(drc_container,
+  prop->name, NULL));
+if (!drc->dev) {
+continue;
+}
+if (!spapr_drc_device_ready(drc)) {
 return true;
 }
 }
diff --git a/include/hw/ppc/spapr_drc.h b/include/hw/ppc/spapr_drc.h
index 83f03cc5773c..8e8bbedb21b7 100644
--- a/include/hw/ppc/spapr_drc.h
+++ b/include/hw/ppc/spapr_drc.h
@@ -269,7 +269,13 @@ int spapr_dt_drc(void *fdt, int offset, Object *owner, uint32_t drc_type_mask);
 
 void spapr_drc_attach(SpaprDrc *drc, DeviceState *d, Error **errp);
 void spapr_drc_detach(SpaprDrc *drc);
-bool spapr_drc_needed(void *opaque);
+
+static inline bool spapr_drc_device_ready(SpaprDrc *drc)
+{
+SpaprDrcClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
+
+return drc->state == drck->ready_state;
+}
 
 static inline bool spapr_drc_unplug_requested(SpaprDrc *drc)
 {

[PATCH 3/3] spapr: Migrate SpaprDrc::unplug_requested

2020-02-03 Thread Greg Kurz

Hot unplugging a device is an asynchronous operation. If the guest is
migrated after the event was sent but before it could release the
device with RTAS, the destination QEMU doesn't know about the pending
unplug operation and doesn't actually remove the device when the guest
finally releases it. The device

Migrate SpaprDrc::unplug_requested to fix the inconsistency. This is
done with a subsection that is only sent if an unplug request is
pending. This preserves migration with older guests in the
case of a pending hotplug request. This will cause migration to fail
if the destination can't handle the subsection, but this is better
than ending up with an inconsistency.

Signed-off-by: Greg Kurz 
---
 hw/ppc/spapr_drc.c |   27 +--
 1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
index d512ac6e1e7f..6f5cab70fc6b 100644
--- a/hw/ppc/spapr_drc.c
+++ b/hw/ppc/spapr_drc.c
@@ -455,6 +455,22 @@ void spapr_drc_reset(SpaprDrc *drc)
 }
 }
 
+static bool spapr_drc_unplug_requested_needed(void *opaque)
+{
+return spapr_drc_unplug_requested(opaque);
+}
+
+static const VMStateDescription vmstate_spapr_drc_unplug_requested = {
+.name = "spapr_drc/unplug_requested",
+.version_id = 1,
+.minimum_version_id = 1,
+.needed = spapr_drc_unplug_requested_needed,
+.fields  = (VMStateField []) {
+VMSTATE_BOOL(unplug_requested, SpaprDrc),
+VMSTATE_END_OF_LIST()
+}
+};
+
 static bool spapr_drc_needed(void *opaque)
 {
 SpaprDrc *drc = (SpaprDrc *)opaque;
@@ -467,8 +483,11 @@ static bool spapr_drc_needed(void *opaque)
 /*
  * We need to migrate the state if it's not equal to the expected
  * long-term state, which is the same as the coldplugged initial
- * state */
-return !spapr_drc_device_ready(drc);
+ * state, or if an unplug request is pending.
+ */
+return
+spapr_drc_unplug_requested_needed(drc) ||
+!spapr_drc_device_ready(drc);
 }
 
 static const VMStateDescription vmstate_spapr_drc = {
@@ -479,6 +498,10 @@ static const VMStateDescription vmstate_spapr_drc = {
 .fields  = (VMStateField []) {
 VMSTATE_UINT32(state, SpaprDrc),
 VMSTATE_END_OF_LIST()
+},
+.subsections = (const VMStateDescription * []) {
+&vmstate_spapr_drc_unplug_requested,
+NULL
 }
 };
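
For readers who want the decision logic in one place, here is a
standalone sketch of what ends up in the migration stream for one DRC
with this patch applied. It is a simplification with made-up names
(SketchPayload, sketch_drc_payload): the three booleans stand for
"drc->dev is set", "state == ready_state" and "unplug_requested", and
the real code expresses the same thing through the two .needed
callbacks above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of what is sent for one DRC. */
typedef enum {
    SKETCH_SEND_NOTHING,          /* section not needed at all */
    SKETCH_SEND_STATE,            /* main section only */
    SKETCH_SEND_STATE_AND_UNPLUG  /* main section plus the new subsection */
} SketchPayload;

SketchPayload sketch_drc_payload(bool has_dev, bool ready,
                                 bool unplug_requested)
{
    if (!has_dev) {
        /* Nothing plugged in: no DRC state to migrate. */
        return SKETCH_SEND_NOTHING;
    }
    if (unplug_requested) {
        /* Pending unplug: the subsection travels with the section. */
        return SKETCH_SEND_STATE_AND_UNPLUG;
    }
    if (!ready) {
        /* Transient state, e.g. a hotplug in progress: old format. */
        return SKETCH_SEND_STATE;
    }
    /* Device present and in its long-term state. */
    return SKETCH_SEND_NOTHING;
}
```

Only the third case produces a stream that a destination without the
subsection cannot load, which is the trade-off described in the commit
message.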

[PATCH 0/3] spapr: Fix device unplug vs CAS or migration

2020-02-03 Thread Greg Kurz

While working on getting rid of CAS reboot, I realized that we currently
don't handle device hot unplug properly in the following situations:

1) if the device is unplugged between boot and CAS, SLOF doesn't handle
   the event, which is a known limitation. The device hence stays around
   forever (specifically, until some other event is emitted and the guest
   eventually completes the unplug or a reboot). Until we can teach SLOF
   to correctly process the full FDT at CAS, we should trigger a CAS reboot,
   like we already do for hotplug.

2) if the guest is migrated after the event was emitted but before the
   guest could process it, the destination is unaware of the pending
   unplug operation and doesn't remove the device when the guest
   releases it. The 'unplug_requested' field of the DRC is actually state
   that should be migrated.

--
Greg

---

Greg Kurz (3):
  spapr: Don't use spapr_drc_needed() in CAS code
  spapr: Detect hot unplugged devices during CAS
  spapr: Migrate SpaprDrc::unplug_requested


 hw/ppc/spapr_drc.c |   30 ++
 hw/ppc/spapr_hcall.c   |   12 +---
 include/hw/ppc/spapr_drc.h |8 +++-
 3 files changed, 42 insertions(+), 8 deletions(-)

[PATCH 2/3] spapr: Detect hot unplugged devices during CAS

2020-02-03 Thread Greg Kurz

We can't properly handle hotplug of a device as long as the guest kernel isn't
fully booted. We detect this at CAS and potentially trigger a CAS reboot to
complete the hotplug sequence.

The same actually goes for hot unplug, but we currently don't detect it.
Doing device_del before CAS hence seems to be ignored: when the guest
is booted, it still sees the _unplugged_ device in the DT and configures
it. But if some other hotplug event happens later, then the unplug request
is finally processed by the guest and the device goes away.

This doesn't seem to cause any crash but it is still very confusing. Detect
device unplug at CAS and request a CAS reboot.

Hopefully, once SLOF knows how to handle device addition and deletion
in its DT according to the FDT provided by QEMU at CAS, we'll be able to
address this differently (ie, coldplugging the new devices and removing the
ones with a pending unplug request).

Signed-off-by: Greg Kurz 
---
 hw/ppc/spapr_hcall.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index 7a33d79bbae9..84690cc2c1ce 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -1659,7 +1659,7 @@ static bool spapr_hotplugged_dev_before_cas(void)
 if (!drc->dev) {
 continue;
 }
-if (!spapr_drc_device_ready(drc)) {
+if (spapr_drc_unplug_requested(drc) || !spapr_drc_device_ready(drc)) {
 return true;
 }
 }
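
The resulting check can be summarised with a standalone sketch. SketchDrc
and sketch_cas_reboot_needed() are hypothetical names: the struct keeps
only the three facts the loop looks at, and the array replaces the walk
over the "/dr-connector" container.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, stripped-down view of one DRC. */
typedef struct SketchDrc {
    bool has_dev;           /* stands for drc->dev != NULL */
    bool ready;             /* stands for state == ready_state */
    bool unplug_requested;  /* stands for drc->unplug_requested */
} SketchDrc;

/* True if any attached device was hotplugged or hot unplugged before CAS. */
bool sketch_cas_reboot_needed(const SketchDrc *drcs, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (!drcs[i].has_dev) {
            continue;  /* empty connector, nothing to check */
        }
        if (drcs[i].unplug_requested || !drcs[i].ready) {
            return true;
        }
    }
    return false;
}
```

A coldplugged device with no pending request does not trigger a reboot;
a device that is still being plugged, or one with a pending unplug, does.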

Re: VW ELF loader

2020-02-03 Thread Alexey Kardashevskiy




On 04/02/2020 02:08, Paolo Bonzini wrote:
> On 03/02/20 11:58, Alexey Kardashevskiy wrote:
>>>> So really, the question isn't whether we implement things in firmware
>>>> or in qemu.  It's whether we implement the firmware functionality as
>>>> guest cpu code, which needs to be coded to work with a limited
>>>> environment, built with a special toolchain, then emulated with TCG.
>>>> Or, do we just implement it in normal C code, with a full C library,
>>>> and existing device and backend abstractions inside qemu.
>>>
>>> ... which is adding almost 2000 lines of new code to the host despite
>>> the following limitations:
>>>
>>>> 4. no networking in OF CI at all;
>>>> 5. no vga;
>>>> 6. no disk partitions in CI, i.e. no commas to select a partition -
>>>> this relies on a bootloader accessing the disk as a whole;
>>
>> This is not going to be a lot really, especially supporting partitions -
>> the code is practically there already as I needed it to find GRUB, and
>> GRUB does the rest asking very little from the firmware to work.
> 
> What partition formats would have to be supported? 

MBR, GPT, is there anything else? "Support" is limited to converting a
number after the comma to a [start, size] pair. I am not going for file
systems.
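
For the MBR half, that conversion is small. A rough standalone sketch,
not taken from any existing code: sketch_mbr_partition() is a made-up
name, it handles only the four primary entries (no extended partitions,
no GPT), and start and size are in sectors. It relies on the classic MBR
layout: four 16-byte entries at offset 446, type byte at entry offset 4,
little-endian start LBA at offset 8 and sector count at offset 12, and
the 0x55 0xAA signature in the last two bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Read a little-endian 32-bit value regardless of host endianness. */
static uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Look up primary partition 'part' (1..4) in a 512-byte MBR sector.
 * On success store the start LBA and length in sectors and return true.
 */
bool sketch_mbr_partition(const uint8_t sector[512], int part,
                          uint32_t *start_lba, uint32_t *num_sectors)
{
    const uint8_t *entry;

    if (part < 1 || part > 4) {
        return false;
    }
    if (sector[510] != 0x55 || sector[511] != 0xAA) {
        return false;  /* no boot signature, not an MBR */
    }
    entry = sector + 446 + (part - 1) * 16;
    if (entry[4] == 0) {
        return false;  /* partition type 0 marks an unused slot */
    }
    *start_lba = le32(entry + 8);
    *num_sectors = le32(entry + 12);
    return true;
}
```

A caller would read sector 0 of the disk, pass the number found after
the comma as 'part', and hand the resulting range to whatever loads the
bootloader.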

> But honestly I'm
> more worried about the networking part.

Fair enough.

>> btw what is the common way of netbooting in x86? NIC ROM or GRUB (but
>> this would be a disk anyway)? Can we consider having a precompiled GRUB
>> image somewhere in pc-bios/ to use for netboot? Or Uboot would do (it is
>> already in pc-bios/, no?), I suppose?
> 
> GRUB netboot support is almost never used. 

Huh. We use yaboot here in Ozlabs for netbooting quite a lot.

> There are three cases:
> 
> - QEMU BIOS: the NIC ROM contain iPXE, which is both the driver code and
> the boot loader (which chains into GRUB).
> 
> - Bare metal BIOS: same, but the boot loader is minimal so most of the
> time iPXE is loaded via TFTP and reuses the NIC ROM's driver code.
> 
> - UEFI: the NIC ROM contains driver code only and the firmware does the
> rest.

Well, we never really had this luxury of NIC ROM, there were a couple of
NICs with fcode which never really worked in SLOF.

Oh well, this is probably the time to look into netbooting then.


>>> In other words you're not dropping SLOF, you're really dropping
>>> OpenFirmware completely.
>>
>> What is the exact benefit of having OpenFirmware's "interpret"?
> 
> None, besides being able to play space invaders written in Forth.  I'm
> not against dropping most OpenFirmware capabilities, I'm against adding
> a limited (or broken depending on what you're trying to do) version that
> runs in the host.
> 
> Yes, SLOF is big and slow.  petitboot is not petit at all either, and
> has the disadvantage that you have to find a way to run GRUB afterwards.
>  But would a similarly minimal OF implementation (no network, almost no
> interpret so no Forth, device tree built entirely in the host, etc.)

The device tree is almost completely built in QEMU these days anyway,
twice during normal boot.

> be just as big and slow?

I doubt it. We will be getting rid of unnecessary drivers, bus scanning
code (SCSI, PCI), device tree synchronization.


-- 
Alexey

Disabling PCI "hot-unplug" for a guest (and/or a single PCI device)

2020-02-03 Thread Laine Stump

Although I've never experienced it, due to not running Windows guests, 
I've recently learned that a Windows guest permits a user (hopefully 
only one with local admin privileges??!) to "hot-unplug" any PCI device. 
I've also learned that some hypervisor admins don't want to permit 
admins of the virtual machines they're managing to unplug PCI devices. I 
believe this is impossible to prevent on an i440fx-based machinetype, 
and can only be done on a q35-based machinetype by assigning the devices 
to the root bus (so that they are seen as integrated devices) rather 
than to a pcie-root-port. But when libvirt is assigning PCI addresses to 
devices in a q35-based guest, it will *always* assign a PCIe device to a 
pcie-root-port specifically so that hotplug is possible (this was done 
to maintain functional parity with i440fx guests, where all PCI slots 
support hotplug).



To make the above-mentioned admins happy, we need to make it possible to 
(easily) create guest configurations for q35-based virtual machines 
where the PCI devices can't be hot-unplugged by the guest OS.



Thinking in the context of a management platform (e.g. OpenStack or 
ovirt) that goes through libvirt to use QEMU (and forgetting about 
i440fx, concentrating only on q35), I can think of a few different ways 
this could be done:



1) Rather than leaving the task of assigning the PCI addresses of 
devices to libvirt (which is what essentially *all* management apps that 
use libvirt currently do), the management application could itself 
directly assign the PCI addresses of all devices to be slots on pcie.0.



This is problematic because once a management application has taken over 
the PCI address assignment of a single device, it must learn the rules 
of what type of device can be plugged into what type of PCI controller 
(including plugging in new controllers when necessary), and keep track 
of which slots on which PCI controllers are already in use - effectively 
tossing that part of libvirt's functionality / embedded knowledge / 
usefulness to management applications out the window. It's even more of 
a problem for management applications that have no provision for 
manually assigning PCI addresses - virt-manager for example only 
supports this by using "XML mode" where the froopy point-click UI is 
swapped out for an edit window where the user is simply presented with 
the full XML for a device and allowed to tweak it around as they see fit 
(including duplicate addresses, plugging the wrong kind of device into 
the wrong slot, referencing non-existent controllers, etc). (NB: you 
could argue that management could just take over PCI address assignment 
in the case of wanting hotplug disabled, and only care about / support 
pcie.0, which makes the task much easier, since you just ignore the 
existence of any other PCI controllers, leaving you with a homogeneous 
array of 32 slots x 8 functions. But this becomes much more complicated 
if you want to allow a mix of hotpluggable and non-hotpluggable devices, 
and you *know* someone will.)



2) libvirt could gain a knob "somewhere" in the domain XML to force a 
single device, or all devices, to be assigned to a PCI address on pcie.0 
rather than on a pcie-root-port. This could be thought of as a "hint" 
about device placement, as well as extra validation in the case that a 
PCI address has been manually assigned. So, for example, let's say a 
"hotplug='disable'" option is added somewhere at the top level of the 
domain (maybe "" inside  or something 
like that); when PCI addresses are assigned by libvirt, it would attempt 
to find a slot on a controller that didn't support hotplug. And/or a 
similar knob could be added to each device. In both cases, the setting 
would be used both when assigning PCI addresses and also to validate 
user-provided PCI addresses to assure that the desired criterion was met 
(otherwise someone would manually select a PCI address on a controller 
that supported hotplug, but then set "hotplug='disabled'" and expect 
hotplug to be magically disabled on the slot).



Some of you will remember that I proposed such a knob for libvirt a few 
years ago when we were first fleshing out support for QEMU's PCI Express 
controllers and the Q35 machinetype, and it was rejected as "libvirt 
dictating policy". Of course at that time there weren't actual users 
demanding the functionality, and now there are. Aside from that, all I 
can say is that it isn't libvirt dictating this policy, it's the user of 
libvirt, and libvirt is just following directions :-) (and that I really 
really dislike the idea of a forced handover of the entire task of 
assigning/managing device PCI addresses to management apps just because 
they decide they want to disable guest-initiated hotplug).



3) qemu could add a "hotpluggable=no" commandline option to all PCI 
devices (including vfio-pci) and then do whatever is necessary to make 
sure this is honored in the emulated hardware (is it possible to se

Re: [PATCH v3 07/18] machine: Add a new function init_apicid_fn in MachineClass

2020-02-03 Thread Babu Moger




On 2/3/20 9:17 AM, Igor Mammedov wrote:
> On Wed, 29 Jan 2020 10:17:11 -0600
> Babu Moger  wrote:
> 
>> On 1/29/20 3:14 AM, Igor Mammedov wrote:
>>> On Tue, 28 Jan 2020 13:45:31 -0600
>>> Babu Moger  wrote:
>>>   
 On 1/28/20 10:29 AM, Igor Mammedov wrote:  
> On Tue, 03 Dec 2019 18:37:42 -0600
> Babu Moger  wrote:
> 
>> Add a new function init_apicid_fn in MachineClass to initialize the mode
>> specific handlers to decode the apic ids.
>>
>> Signed-off-by: Babu Moger 
>> ---
>>  include/hw/boards.h |1 +
>>  vl.c|3 +++
>>  2 files changed, 4 insertions(+)
>>
>> diff --git a/include/hw/boards.h b/include/hw/boards.h
>> index d4fab218e6..ce5aa365cb 100644
>> --- a/include/hw/boards.h
>> +++ b/include/hw/boards.h
>> @@ -238,6 +238,7 @@ struct MachineClass {
>>   unsigned 
>> cpu_index);
>>  const CPUArchIdList *(*possible_cpu_arch_ids)(MachineState 
>> *machine);
>>  int64_t (*get_default_cpu_node_id)(const MachineState *ms, int idx);
>> +void (*init_apicid_fn)(MachineState *ms);
> it's x86 specific, so why wasn't it put into PCMachineClass?

 Yes. It is x86 specific for now. I tried to make it a generic function so
 other OSes can use it if required (like we have done in
 possible_cpu_arch_ids). It initializes the functions required to build the
 apicid for each CPU. We need these functions much earlier in the
 initialization. It should be initialized before parse_numa_opts or
 machine_run_board_init (in vl.c), which are called from generic context. We
 cannot use PCMachineClass at this time.  
>>>
>>> could you point to specific patches in this series that require
>>> apic ids being initialized before parse_numa_opts and elaborate why?
>>>
>>> we already have possible_cpu_arch_ids() which could be called very
>>> early and calculates APIC IDs in x86 case, so why not reuse it?  
>>
>>
>> The current code(before this series) parses the numa information and then
>> sequentially builds the apicid. Both are done together.
>>
>> But this series separates the numa parsing and apicid generation. Numa
>> parsing is done first and after that the apicid is generated. Reason is we
>> need to know the number of numa nodes in advance to decode the apicid.
>>
>> Look at this patch.
>> https://lore.kernel.org/qemu-devel/157541988471.46157.6587693720990965800.stgit@naples-babu.amd.com/
>>
>> static inline apic_id_t apicid_from_topo_ids_epyc(X86CPUTopoInfo *topo_info,
>> +  const X86CPUTopoIDs *topo_ids)
>> +{
>> +return (topo_ids->pkg_id  << apicid_pkg_offset_epyc(topo_info)) |
>> +   (topo_ids->llc_id << apicid_llc_offset_epyc(topo_info)) |
>> +   (topo_ids->die_id  << apicid_die_offset(topo_info)) |
>> +   (topo_ids->core_id << apicid_core_offset(topo_info)) |
>> +   topo_ids->smt_id;
>> +}
>>
>>
>> The function apicid_from_topo_ids_epyc builds the apicid. The new decode
>> adds llc_id (which is the numa id here) to the current decoding. The other
>> fields mostly remain the same.
> 
> If llc_id is the same as numa id, why not reuse CpuInstanceProperties::node-id
> instead of llc_id you are adding in previous patch 6/18?
> 
I tried to use that earlier. But dropped the idea as it required some
changes. Don't remember exactly now. I am going to investigate again if we
can use the node_id for our purpose here. Will let you know if I have any
issues.
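
For anyone following the thread without the series applied, the layout under
discussion can be sketched in isolation as below. This is a simplified
stand-in, not the code from the patches: the struct names and the explicit
per-level bit widths are assumptions made for the example, whereas the real
code derives the offsets from X86CPUTopoInfo.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field widths, in bits, for each topology level. */
struct topo_widths {
    unsigned smt, core, die, llc;
};

/* IDs of one CPU at each topology level. */
struct topo_ids {
    unsigned pkg_id, llc_id, die_id, core_id, smt_id;
};

/*
 * Pack the IDs into one APIC ID.  Each field sits directly above the
 * previous one, so the offset of a level is the sum of the widths of all
 * lower levels.  The llc (node) field is the extra level between die and
 * package.
 */
static uint32_t apicid_from_ids(const struct topo_widths *w,
                                const struct topo_ids *ids)
{
    unsigned core_off = w->smt;
    unsigned die_off = core_off + w->core;
    unsigned llc_off = die_off + w->die;
    unsigned pkg_off = llc_off + w->llc;

    assert(pkg_off < 32);
    return ((uint32_t)ids->pkg_id << pkg_off) |
           ((uint32_t)ids->llc_id << llc_off) |
           ((uint32_t)ids->die_id << die_off) |
           ((uint32_t)ids->core_id << core_off) |
           ids->smt_id;
}
```

The sketch makes the ordering problem visible: the width of the llc field
depends on the number of nodes, so the package offset cannot be computed
until the numa configuration is known.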

Re: Need help understanding assertion fail.

2020-02-03 Thread Wayne Li

I see.  So you're saying that it might be possible that my guest could be
generating TCG ops that can't be translated into PPC instructions because
the displacement value is too big.  While the same TCG ops can be translated
into x86 instructions because x86 allows for a bigger displacement value.
But on the other hand it could be some other problem causing me to have a
large displacement value.

In that case, I think it'd be super helpful if I print out this
displacement value in the TCG ops when running on PPC versus x86 because
they should be the same right?  What option in QEMU -d allows me to see
generated TCG ops?  Doing a -d --help shows the following options:

out_asm    show generated host assembly code for each compiled TB
in_asm     show target assembly code for each compiled TB
op         show micro ops for each compiled TB
op_opt     show micro ops (x86 only: before eflags optimization) and
           after liveness analysis
int        show interrupts/exceptions in short format
exec       show trace before each executed TB (lots of logs)
cpu        show CPU state before block translation
mmu        log MMU-related activities
pcall      x86 only: show protected mode far calls/returns/exceptions
cpu_reset  show CPU state before CPU resets
ioport     show all i/o ports accesses
unimp      log unimplemented functionality
guest_errors log when the guest OS does something invalid (eg accessing a
           non-existent register)

There doesn't seem to be any option to print out the TCG ops specifically?
Maybe I'll have to go into the code to add print statements that print out
the TCG ops?

-Thanks!, Wayne Li

On Mon, Feb 3, 2020 at 10:56 AM Peter Maydell 
wrote:

> On Mon, 3 Feb 2020 at 16:39, Wayne Li  wrote:
> > Anyway that's the background.  The specific problem I'm having right now
> is I get the following assertion error during some of the setup stuff our
> OS does post boot-up (the OS is also custom-made):
> >
> > qemu_programs/qemu/tcg/ppc/tcg-target.inc.c:224: reloc_pc14_val:
> Assertion `disp == (int16_t) disp' failed.
> >
> > Looking at the QEMU code, "disp" is the difference between two pointers
> named "target" and "pc".  I'm not sure exactly what either of those names
> mean.  And it looks like since the assertion is checking if casting "disp"
> as a short changes the value, it's checking if the "disp" value is too
> big?  I'm just not very sure what this assertion means.
>
> This assertion is checking that we're not trying to fit too
> large a value into the host PPC branch instruction we just emitted.
> That is, tcg_out_bc() emits a PPC conditional branch instruction,
> which has a 14 bit field for the offset (it's a relative branch),
> and we know the bottom 2 bits of the target will be 0 (PPC insns
> being 4-aligned), so the distance between the current host PC
> and the target of the branch must fit in a signed 16-bit field.
>
> "disp" here stands for "displacement".
>
> The PPC TCG backend only uses this for the TCG 'brcond' and
> 'brcond2' TCG intermediate-representation ops. It seems likely
> that the code for your target is generating TCG ops which have
> too large a gap between a brcond/brcond2 and the destination label.
> You could try using the various QEMU -d options to print out the
> guest instructions and the generated TCG ops to pin down what
> part of your target is trying to generate branches over too
> much code like this.
>
> > Anyway, the thing is this problem has to be somehow related to
> > the transfer of the code from a little-endian platform to a
> > big-endian platform as our project works without any problem on
> > little-endian platforms.
>
> In this case it isn't necessarily directly an endianness issue.
> The x86 instruction set provides conditional branch instructions
> which allow a 32-bit displacement value, so you're basically never
> going to overflow a conditional-branch there. PPC, being RISC,
> has more limited branch insns. You might also run into this
> if you tried to use aarch64 (64-bit) arm hosts, which are
> little-endian but have a 19-bit branch displacement limit,
> depending on just how big you've managed to make your jumps.
> On the other hand, a 16-bit displacement is a jump over
> 64K of generated code, which is huge for a single TCG
> generated translation block, so it could well be that you
> have an endianness bug in your TCG frontend which is causing
> you to generate an enormous TB by accident.
>
> thanks
> -- PMM
>
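
As a rough illustration of the range check described in the quoted reply,
here is a standalone sketch. It is not the code in
tcg/ppc/tcg-target.inc.c: the names fits_pc14 and encode_pc14 are made up
for this example, and the alignment test is an extra check that the quoted
assertion does not perform.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * A PPC conditional branch carries a 14-bit word offset.  Instructions are
 * 4-byte aligned, so the byte displacement is that field with two zero
 * bits appended: a multiple of 4 that fits in a signed 16-bit value.
 */
static bool fits_pc14(intptr_t disp)
{
    return (disp & 3) == 0 && disp >= INT16_MIN && disp <= INT16_MAX;
}

/* Keep bits 2..15 of the displacement, the part stored in the branch. */
static uint32_t encode_pc14(intptr_t disp)
{
    assert(fits_pc14(disp));
    return (uint32_t)disp & 0xfffc;
}
```

With this in hand, printing the displacement just before the assertion fires
would show whether the value is slightly out of range or wildly so, which
helps to tell a large but legitimate TB apart from a corrupted one.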

[PATCH 1/1] hw/net/can: Introduce Xlnx ZynqMP CAN controller for QEMU

2020-02-03 Thread Vikram Garhwal

XlnxCAN is built on SocketCAN and the QEMU CAN bus implementation.
The bus connection and the SocketCAN connection for each CAN module can be
set on the command line.

Signed-off-by: Vikram Garhwal 
---
 hw/net/can/Makefile.objs |1 +
 hw/net/can/xlnx-zynqmp-can.c | 1106 ++
 include/hw/net/xlnx-zynqmp-can.h |   77 +++
 3 files changed, 1184 insertions(+)
 create mode 100644 hw/net/can/xlnx-zynqmp-can.c
 create mode 100644 include/hw/net/xlnx-zynqmp-can.h

diff --git a/hw/net/can/Makefile.objs b/hw/net/can/Makefile.objs
index 9f0c4ee..0fe87dd 100644
--- a/hw/net/can/Makefile.objs
+++ b/hw/net/can/Makefile.objs
@@ -2,3 +2,4 @@ common-obj-$(CONFIG_CAN_SJA1000) += can_sja1000.o
 common-obj-$(CONFIG_CAN_PCI) += can_kvaser_pci.o
 common-obj-$(CONFIG_CAN_PCI) += can_pcm3680_pci.o
 common-obj-$(CONFIG_CAN_PCI) += can_mioe3680_pci.o
+common-obj-$(CONFIG_XLNX_ZYNQMP) += xlnx-zynqmp-can.o
diff --git a/hw/net/can/xlnx-zynqmp-can.c b/hw/net/can/xlnx-zynqmp-can.c
new file mode 100644
index 000..e14013c
--- /dev/null
+++ b/hw/net/can/xlnx-zynqmp-can.c
@@ -0,0 +1,1106 @@
+/*
+ * QEMU model of the Xilinx CAN device.
+ *
+ * Copyright (c) 2019 Xilinx Inc.
+ * Partially autogenerated by xregqemu.py 2019-06-20.
+ *
+ * Written-by: Vikram Garhwal
+ *
+ * Based on QEMU CAN Device emulation implemented by Jin Yang, Deniz Eren and
+ * Pavel Pisa
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "qemu/osdep.h"
+#include "hw/sysbus.h"
+#include "hw/register.h"
+#include "hw/irq.h"
+#include "qapi/error.h"
+#include "qemu/bitops.h"
+#include "qemu/log.h"
+#include "qemu/cutils.h"
+#include "sysemu/sysemu.h"
+#include "migration/vmstate.h"
+#include "hw/qdev-properties.h"
+#include "net/can_emu.h"
+#include "net/can_host.h"
+#include "qemu/event_notifier.h"
+#include "qom/object_interfaces.h"
+#include "hw/net/xlnx-zynqmp-can.h"
+
+#ifndef XLNX_ZYNQMP_CAN_ERR_DEBUG
+#define XLNX_ZYNQMP_CAN_ERR_DEBUG 0
+#endif
+
+#define DB_PRINT(...) do { \
+if (XLNX_ZYNQMP_CAN_ERR_DEBUG) { \
+qemu_log(__VA_ARGS__); \
+} \
+} while (0)
+
+#define MAX_DLC 8
+#undef ERROR
+
+REG32(SOFTWARE_RESET_REGISTER, 0x0)
+FIELD(SOFTWARE_RESET_REGISTER, CEN, 1, 1)
+FIELD(SOFTWARE_RESET_REGISTER, SRST, 0, 1)
+REG32(MODE_SELECT_REGISTER, 0x4)
+FIELD(MODE_SELECT_REGISTER, SNOOP, 2, 1)
+FIELD(MODE_SELECT_REGISTER, LBACK, 1, 1)
+FIELD(MODE_SELECT_REGISTER, SLEEP, 0, 1)
+REG32(ARBITRATION_PHASE_BAUD_RATE_PRESCALER_REGISTER, 0x8)
+FIELD(ARBITRATION_PHASE_BAUD_RATE_PRESCALER_REGISTER, BRP, 0, 8)
+REG32(ARBITRATION_PHASE_BIT_TIMING_REGISTER, 0xc)
+FIELD(ARBITRATION_PHASE_BIT_TIMING_REGISTER, SJW, 7, 2)
+FIELD(ARBITRATION_PHASE_BIT_TIMING_REGISTER, TS2, 4, 3)
+FIELD(ARBITRATION_PHASE_BIT_TIMING_REGISTER, TS1, 0, 4)
+REG32(ERROR_COUNTER_REGISTER, 0x10)
+FIELD(ERROR_COUNTER_REGISTER, REC, 8, 8)
+FIELD(ERROR_COUNTER_REGISTER, TEC, 0, 8)
+REG32(ERROR_STATUS_REGISTER, 0x14)
+FIELD(ERROR_STATUS_REGISTER, ACKER, 4, 1)
+FIELD(ERROR_STATUS_REGISTER, BERR, 3, 1)
+FIELD(ERROR_STATUS_REGISTER, STER, 2, 1)
+FIELD(ERROR_STATUS_REGISTER, FMER, 1, 1)
+FIELD(ERROR_STATUS_REGISTER, CRCER, 0, 1)
+REG32(STATUS_REGISTER, 0x18)
+FIELD(STATUS_REGISTER, SNOOP, 12, 1)
+FIELD(STATUS_REGISTER, ACFBSY, 11, 1)
+FIELD(STATUS_REGISTER, TXFLL, 10, 1)
+FIELD(STATUS_REGISTER, TXBFLL, 9, 1)
+FIELD(STATUS_REGISTER, ESTAT, 7, 2)
+FIELD(STATUS_REGISTER, ERRWRN, 6, 1)
+FIELD(STATUS_REGISTER, BBSY, 5, 1)
+FIELD(STATUS_REGISTER, BIDLE, 4, 1)
+FIELD(STATUS_REGISTER, NORMAL, 3, 1)
+FIELD(STATUS_REGISTER, SLEEP, 2, 1)
+FIELD(STATUS_REGISTER, LBACK, 1, 1)
+FIELD(STATUS_REGISTER, CONFIG, 0, 1)
+REG32(INTERRUPT_STATUS_REGISTER, 0x1c)
+FIELD(INTERRUPT_STATUS_REGISTER, TXFEMP, 14, 1)
+FIELD(INTERRUPT_STATUS_REGISTER, TXFWMEMP, 13, 1)
+FIELD(INTERRUPT_STATUS_REGISTER, RXFWMFLL, 12,

[PATCH 0/1] Introduce Xlnx ZynqMP CAN controller for QEMU

2020-02-03 Thread Vikram Garhwal

Example for single CAN:
-object can-bus,id=canbus0 \
-global driver=xlnx.zynqmp-can,property=canbus0,value=canbus0 \
-object can-host-socketcan,id=socketcan0,if=vcan0,canbus=canbus0

Example for connecting both CAN:
-object can-bus,id=canbus0 -object can-bus,id=canbus1 \
-global driver=xlnx.zynqmp-can,property=canbus0,value=canbus0 \
-global driver=xlnx.zynqmp-can,property=canbus1,value=canbus1 \
-object can-host-socketcan,id=socketcan0,if=vcan0,canbus=canbus0 \
-object can-host-socketcan,id=socketcan1,if=vcan0,canbus=canbus1

Vikram Garhwal (1):
  hw/net/can: Introduce Xlnx ZynqMP CAN controller for QEMU

 hw/net/can/Makefile.objs |1 +
 hw/net/can/xlnx-zynqmp-can.c | 1106 ++
 include/hw/net/xlnx-zynqmp-can.h |   77 +++
 3 files changed, 1184 insertions(+)
 create mode 100644 hw/net/can/xlnx-zynqmp-can.c
 create mode 100644 include/hw/net/xlnx-zynqmp-can.h

-- 
2.7.4

Re: [PATCH qemu] spapr/rtas: Print message from "ibm,os-term"

2020-02-03 Thread Daniel Henrique Barboza





On 2/3/20 12:20 AM, Alexey Kardashevskiy wrote:

The "ibm,os-term" RTAS call has a single parameter which is a pointer to
a message from the guest kernel about the termination cause; this prints
it.

Signed-off-by: Alexey Kardashevskiy 
---
  hw/ppc/spapr_rtas.c | 7 +++
  1 file changed, 7 insertions(+)



Reviewed-by: Daniel Henrique Barboza 



diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
index 883fe28465e6..656fdd221665 100644
--- a/hw/ppc/spapr_rtas.c
+++ b/hw/ppc/spapr_rtas.c
@@ -345,6 +345,13 @@ static void rtas_ibm_os_term(PowerPCCPU *cpu,
  target_ulong args,
  uint32_t nret, target_ulong rets)
  {
+target_ulong msgaddr = rtas_ld(args, 0);
+char msg[512];
+
+cpu_physical_memory_read(msgaddr, msg, sizeof(msg) - 1);
+msg[sizeof(msg) - 1] = 0;
+
+error_report("OS terminated: %s", msg);
  qemu_system_guest_panicked(NULL);
  
  rtas_st(rets, 0, RTAS_OUT_SUCCESS);

Re: [RFC PATCH] audio: proper support for float samples in mixeng

2020-02-03 Thread Zoltán Kővágó


On 2020-02-03 11:00, Peter Maydell wrote:

On Sun, 2 Feb 2020 at 19:39, Kővágó, Zoltán  wrote:


This adds proper support for float samples in mixeng by adding a new
audio format for it.

Limitations: only native endianness is supported.


Could you explain a bit more what this limitation means, please?
In general QEMU behaviour shouldn't depend on the endianness
of the host, ie we should byteswap where necessary.


None of the virtual sound cards support float samples (it looks like 
most of them only support 8 and 16 bit, only hda supports 32 bit), it is 
only used for the audio backends (i.e. host side).  In 
audiodev_to_audsettings we set endianness to AUDIO_HOST_ENDIANNESS, so 
audio backends should always use native endian.


So this limitation should only cause problems when an audio backend 
overrides the endian setting.  Wavcapture does it, but it does not 
support float.  Alsa, sdl, pulseaudio and oss can also do it if for some 
weird reason they acquire a stream with a different endianness than 
requested.
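
If a backend ever did hand back a float stream in the opposite byte order,
the missing piece would be a swap along the lines of the sketch below. This
is not part of the patch: swap_float_samples and bswap32_sketch are
hypothetical names, and the sketch assumes 32-bit IEEE float samples.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reverse the byte order of a 32-bit value. */
static uint32_t bswap32_sketch(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/*
 * Swap the byte order of n float samples in place.  The bits are moved
 * through a uint32_t with memcpy so that no float arithmetic ever touches
 * the swapped (and possibly invalid) bit patterns.
 */
static void swap_float_samples(float *buf, size_t n)
{
    size_t i;

    assert(sizeof(float) == sizeof(uint32_t));
    for (i = 0; i < n; i++) {
        uint32_t bits;

        memcpy(&bits, &buf[i], sizeof(bits));
        bits = bswap32_sketch(bits);
        memcpy(&buf[i], &bits, sizeof(bits));
    }
}
```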


Regards,
Zoltan

Re: Making QEMU easier for management tools and applications

2020-02-03 Thread Andrea Bolognani

On Fri, 2020-01-31 at 07:50 +0100, Markus Armbruster wrote:
> Kevin Wolf  writes:
> > Much of this thread plays with the thought that maybe we don't need any
> > compatibility and makes the radical conclusion that we don't need any
> > human-friendly interface at all. Keeping full compatibility is the other
> > extreme.
> > 
> > There might be some middle ground where we break compatibility where the
> > old way can't easily be maintained with the new infrastructure, but
> > don't give up on the idea of being used by humans.
> 
> I'm not sure the connection between maintaining compatibility and
> supporting human use is as strong as you seem to imply.
> 
> As far as I can tell, the "maybe we don't need any compatibility"
> discussion is about the CLI.  I'd rephrase it as "maybe we need a
> machine-friendly CLI on par with QMP more than we need compatibility to
> the current CLI".
> 
> "We don't need any human-friendly interface at all" comes in not because
> machine-friendly necessarily precludes human-friendly, but only if we're
> unwilling (unable?) to do the extra work for it.
> 
> Compare the monitor:
> 
> * QMP is primarily for machines.  We promise stability: no incompatible
>   changes without clear communication of intent and a grace period.  We
>   provide machine clients tools to deal with the interface evolution,
>   e.g. query-qmp-schema.
> 
> * HMP is exclusively for humans.  It may change at any time.
> 
> For the CLI, we don't have such a separation, and our offerings for
> dealing with interface evolution are wholly inadequate.  We *need* to do
> better for machines.
> 
> Now, the monitor also informs us about the cost of providing a
> completely separate interface for humans.
> 
> Elsewhere in this thread, we discussed layering (a replacement for) HMP
> on top of QMP cleanly, possibly in a separate process, possibly written
> in a high-level language like Python.
> 
> HMP predates QMP.  We reworked it so the HMP commands are implemented on
> top of the QMP commands, or at least on top of common helpers.  But this
> is not quite the same as layering HMP on top of QMP.
> 
> If we decide to radically break the CLI so we can start over, we get to
> decide whether and how to do a human-friendly CLI, in particular how it
> relates to the machine-friendly CLI.

Does a machine-friendly CLI need to exist at all? Once you decide
that throwing away the current one is acceptable, you might as well
reduce the maintenance burden by requiring that software only
communicates with QEMU via QMP.

Does a human-friendly CLI need to be part of QEMU? We have built so
much amazing infrastructure on top of QEMU, and as of today none of
that work is benefiting people who run it directly.

As a proof of concept, I have spent a couple of hours writing the
attached shell script, which I hope will illustrate my point.

Usage is extremely simple: just do something like

  $ ./virt-run debian-10-openstack-amd64.qcow2

and after a few seconds the guest display will appear on your screen.

Behind the scenes, it uses a number of existing high-level tools:

  * virt-inspector, to figure out what guest OS is installed in the
image;

  * virt-install, to produce a domain XML tailored to that specific
guest OS and to create the corresponding libvirt domain;

  * virt-viewer, to provide the UI.

All these tools use libvirt under the hood, and additionally
virt-install uses libosinfo to obtain information about the guest
OS, such as whether or not it supports Virtio devices and how much
memory it needs to run smoothly.

The result is that, if you run

  $ qemu-system-x86_64 -hda debian-10-openstack-amd64.qcow2

you will get

  * a single CPU emulated with TCG;

  * 128 MiB of memory;

  * emulated I/O devices;

whereas the script will give you

  * 2 CPUs accelerated with KVM;

  * 1 GiB of memory;

  * Virtio devices for pretty much everything, including a
virtio-rng device that will for example speed up the first boot
significantly if SSH keys need to be (re)created.

Unsurprisingly, performance is different: when QEMU is invoked
directly, the login prompt for this specific image shows up after
~40 seconds, whereas when we use the script it only takes ~13 seconds
to get there. And the command line is just as simple, if not more so!

All of the above was obtained by hastily cobbling together existing
tools with <100 lines of shell scripting. Imagine how much better it
could be if we actually put some serious work in!

With my argument hopefully demonstrated: I think an architecture akin
to the one Dan has outlined earlier[1] would be a great direction to
take. QEMU can continue to focus on its core competency, that is,
virtual hardware, and leave most of the user interaction up to the
software interacting with its JSON-based API.

Obviously QEMU developers, for their own use, could still benefit
from having access to a user interface that doesn't require either
rigging up libvirt support or messing with JSON

Re: [PATCH] ui/cocoa: Drop workarounds for pre-10.12 OSX

2020-02-03 Thread G 3

> Date: Sat,  1 Feb 2020 17:05:34 +
> From: Peter Maydell 
> To: qemu-devel@nongnu.org
> Cc: Gerd Hoffmann 
> Subject: [PATCH] ui/cocoa: Drop workarounds for pre-10.12 OSX
> Message-ID: <20200201170534.22123-1-peter.mayd...@linaro.org>
>
> Our official OSX support policy covers the last two released versions.
> Currently that is 10.14 and 10.15.  We also may work on older versions, but
> don't guarantee it.
>
> In commit 50290c002c045280f8d in mid-2019 we introduced some uses of
> CLOCK_MONOTONIC which incidentally broke compilation for pre-10.12 OSX
> versions (see LP:1861551). We don't intend to fix that, so we might
> as well drop the code in ui/cocoa.m which caters for pre-10.12
> versions as well. (For reference, 10.11 fell out of Apple extended
> security support in September 2018.)
>
> Signed-off-by: Peter Maydell 
> ---
> The bug report is recent, but this was also pointed out on
> the mailing list back in June 2019. Since nobody has cared
> to try to fix it, we clearly don't care about 10.11 in
> practice as well as in theory.]
> ---
>  ui/cocoa.m | 59 --
>  1 file changed, 59 deletions(-)
>
> diff --git a/ui/cocoa.m b/ui/cocoa.m
> index fbb5b1b45f..f9945bc712 100644
> --- a/ui/cocoa.m
> +++ b/ui/cocoa.m
> @@ -42,60 +42,10 @@
>  #include 
>  #include "hw/core/cpu.h"
>
> -#ifndef MAC_OS_X_VERSION_10_5
> -#define MAC_OS_X_VERSION_10_5 1050
> -#endif
> -#ifndef MAC_OS_X_VERSION_10_6
> -#define MAC_OS_X_VERSION_10_6 1060
> -#endif
> -#ifndef MAC_OS_X_VERSION_10_9
> -#define MAC_OS_X_VERSION_10_9 1090
> -#endif
> -#ifndef MAC_OS_X_VERSION_10_10
> -#define MAC_OS_X_VERSION_10_10 101000
> -#endif
> -#ifndef MAC_OS_X_VERSION_10_12
> -#define MAC_OS_X_VERSION_10_12 101200
> -#endif
>  #ifndef MAC_OS_X_VERSION_10_13
>  #define MAC_OS_X_VERSION_10_13 101300
>  #endif
>
> -/* macOS 10.12 deprecated many constants, #define the new names for older
> SDKs */
> -#if MAC_OS_X_VERSION_MAX_ALLOWED < MAC_OS_X_VERSION_10_12
> -#define NSEventMaskAny                  NSAnyEventMask
> -#define NSEventModifierFlagCapsLock     NSAlphaShiftKeyMask
> -#define NSEventModifierFlagShift        NSShiftKeyMask
> -#define NSEventModifierFlagCommand      NSCommandKeyMask
> -#define NSEventModifierFlagControl      NSControlKeyMask
> -#define NSEventModifierFlagOption       NSAlternateKeyMask
> -#define NSEventTypeFlagsChanged         NSFlagsChanged
> -#define NSEventTypeKeyUp                NSKeyUp
> -#define NSEventTypeKeyDown              NSKeyDown
> -#define NSEventTypeMouseMoved           NSMouseMoved
> -#define NSEventTypeLeftMouseDown        NSLeftMouseDown
> -#define NSEventTypeRightMouseDown       NSRightMouseDown
> -#define NSEventTypeOtherMouseDown       NSOtherMouseDown
> -#define NSEventTypeLeftMouseDragged     NSLeftMouseDragged
> -#define NSEventTypeRightMouseDragged    NSRightMouseDragged
> -#define NSEventTypeOtherMouseDragged    NSOtherMouseDragged
> -#define NSEventTypeLeftMouseUp          NSLeftMouseUp
> -#define NSEventTypeRightMouseUp         NSRightMouseUp
> -#define NSEventTypeOtherMouseUp         NSOtherMouseUp
> -#define NSEventTypeScrollWheel          NSScrollWheel
> -#define NSTextAlignmentCenter           NSCenterTextAlignment
> -#define NSWindowStyleMaskBorderless     NSBorderlessWindowMask
> -#define NSWindowStyleMaskClosable       NSClosableWindowMask
> -#define NSWindowStyleMaskMiniaturizable NSMiniaturizableWindowMask
> -#define NSWindowStyleMaskTitled         NSTitledWindowMask
> -#endif
> -#endif
> -/* 10.13 deprecates NSFileHandlingPanelOKButton in favour of
> - * NSModalResponseOK, which was introduced in 10.9. Define
> - * it for older versions.
> - */
> -#if MAC_OS_X_VERSION_MAX_ALLOWED < MAC_OS_X_VERSION_10_9
> -#define NSModalResponseOK NSFileHandlingPanelOKButton
> -#endif
>  /* 10.14 deprecates NSOnState and NSOffState in favor of
>   * NSControlStateValueOn/Off, which were introduced in 10.13.
>   * Define for older versions
> @@ -465,11 +415,7 @@ - (void) drawRect:(NSRect) rect
>  COCOA_DEBUG("QemuCocoaView: drawRect\n");
>
>  // get CoreGraphic context
> -#if MAC_OS_X_VERSION_MAX_ALLOWED < MAC_OS_X_VERSION_10_10
> -CGContextRef viewContextRef = [[NSGraphicsContext currentContext]
> graphicsPort];
> -#else
>  CGContextRef viewContextRef = [[NSGraphicsContext currentContext]
> CGContext];
> -#endif
>
>  CGContextSetInterpolationQuality (viewContextRef,
> kCGInterpolationNone);
>  CGContextSetShouldAntialias (viewContextRef, NO);
> @@ -1075,9 +1021,7 @@ - (void) raiseAllKeys
>   --
>  */
>  @interface QemuCocoaAppController : NSObject
> -#if (MAC_OS_X_VERSION_MAX_ALLOWED >= MAC_OS_X_VERSION_10_6)
>                                         <NSApplicationDelegate>
> -#endif
>  {
>  }
>  - (void)doToggleFullScreen:(id)sender;
> @@ -1126,9 +1070,6 @@ - (id) init
>  [normalWindow setAcceptsMouseMovedEvents:YES];
>  [normalWindow setTitle:@"QEMU"];
>  [

Re: [PATCH v4 00/11] RFC: [for 5.0]: HMP monitor handlers refactoring

2020-02-03 Thread Dr. David Alan Gilbert

* Maxim Levitsky (mlevi...@redhat.com) wrote:
> This patch series is a bunch of cleanups to the hmp monitor code.
> It mostly moves the blockdev related hmp handlers to their own file,
> and does some minor refactoring.
> 
> No functional changes expected.

You've still got the title marked as RFC - are you actually ready for
this log?

Dave

> 
> Changes from V1:
>* move the handlers to block/monitor/block-hmp-cmds.c
>* tiny cleanup for the commit messages
> 
> Changes from V2:
>* Moved all the function prototypes to new header (blockdev-hmp-cmds.h)
>* Set the license of blockdev-hmp-cmds.c to GPLv2+
>* Moved hmp_snapshot_* functions to blockdev-hmp-cmds.c
>* Moved hmp_drive_add_node to blockdev-hmp-cmds.c
>  (this change needed some new exports, thus in separate new patch)
>* Moved hmp_qemu_io and hmp_eject to blockdev-hmp-cmds.c
>* Added 'error:' prefix to vreport, and updated the iotests
>  This is invasive change, but really feels like the right one
>* Added minor refactoring patch that drops an unused #include
> 
> Changes from V3:
>* Dropped the error prefix patches for now due to fact that it seems
>  that libvirt doesn't need that after all. Oh well...
>  I'll send them in a separate series.
> 
>* Hopefully correctly merged the copyright info the new files
>  Both files are GPLv2 now (due to code from hmp.h/hmp-cmds.c)
> 
>* Addressed review feedback
>* Renamed the added header to block-hmp-cmds.h
> 
>* Got rid of checkpatch.pl warnings in the moved code
>  (cosmetic code changes only)
> 
>* I kept the reviewed-by tags, since the changes I did are minor.
>  I hope that this is right thing to do.
> 
> Best regards,
>   Maxim Levitsky
> 
> Maxim Levitsky (11):
>   usb/dev-storage: remove unused include
>   monitor/hmp: uninline add_init_drive
>   monitor/hmp: rename device-hotplug.c to block/monitor/block-hmp-cmds.c
>   monitor/hmp: move hmp_drive_del and hmp_commit to block-hmp-cmds.c
>   monitor/hmp: move hmp_drive_mirror and hmp_drive_backup to
> block-hmp-cmds.c Moved code was added after 2012-01-13, thus under
> GPLv2+
>   monitor/hmp: move hmp_block_job* to block-hmp-cmds.c
>   monitor/hmp: move hmp_snapshot_* to block-hmp-cmds.c
> hmp_snapshot_blkdev is from GPLv2 version of the hmp-cmds.c thus
> have to change the licence to GPLv2
>   monitor/hmp: move hmp_nbd_server* to block-hmp-cmds.c
>   monitor/hmp: move remaining hmp_block* functions to block-hmp-cmds.c
>   monitor/hmp: move hmp_info_block* to block-hmp-cmds.c
>   monitor/hmp: Move hmp_drive_add_node to block-hmp-cmds.c
> 
>  MAINTAINERS                    |    1 +
>  Makefile.objs                  |    2 +-
>  block/Makefile.objs            |    1 +
>  block/monitor/Makefile.objs    |    1 +
>  block/monitor/block-hmp-cmds.c | 1002
>  blockdev.c                     |  137 +
>  device-hotplug.c               |   91 ---
>  hw/usb/dev-storage.c           |    1 -
>  include/block/block-hmp-cmds.h |   54 ++
>  include/block/block_int.h      |    5 +-
>  include/monitor/hmp.h          |   24 -
>  include/sysemu/blockdev.h      |    4 -
>  include/sysemu/sysemu.h        |    3 -
>  monitor/hmp-cmds.c             |  769
>  monitor/misc.c                 |    1 +
>  15 files changed, 1072 insertions(+), 1024 deletions(-)
>  create mode 100644 block/monitor/Makefile.objs
>  create mode 100644 block/monitor/block-hmp-cmds.c
>  delete mode 100644 device-hotplug.c
>  create mode 100644 include/block/block-hmp-cmds.h
> 
> -- 
> 2.17.2
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [PATCH v3 00/18] APIC ID fixes for AMD EPYC CPU models

2020-02-03 Thread Babu Moger




On 2/3/20 8:59 AM, Igor Mammedov wrote:
> On Tue, 03 Dec 2019 18:36:54 -0600
> Babu Moger  wrote:
> 
>> This series fixes APIC ID encoding problems on AMD EPYC CPUs.
>> https://bugzilla.redhat.com/show_bug.cgi?id=1728166
>>
>> Currently, the APIC ID is decoded based on the sequence
>> sockets->dies->cores->threads. This works for most standard AMD and other
>> vendors' configurations, but this decoding sequence does not follow that of
>> AMD's APIC ID enumeration strictly. In some cases this can cause CPU topology
>> inconsistency.  When booting a guest VM, the kernel tries to validate the
>> topology, and finds it inconsistent with the enumeration of EPYC cpu models.
>>
>> To fix the problem we need to build the topology as per the Processor
>> Programming Reference (PPR) for AMD Family 17h Model 01h, Revision B1
>> Processors. It is available at 
>> https://www.amd.com/system/files/TechDocs/55570-B1_PUB.zip
>>
>> Here is the text from the PPR.
>> Operating systems are expected to use Core::X86::Cpuid::SizeId[ApicIdSize],
>> the number of least significant bits in the Initial APIC ID that indicate
>> core ID within a processor, in constructing per-core CPUID masks.
>> Core::X86::Cpuid::SizeId[ApicIdSize] determines the maximum number of cores
>> (MNC) that the processor could theoretically support, not the actual number
>> of cores that are actually implemented or enabled on the processor, as
>> indicated by Core::X86::Cpuid::SizeId[NC].
>> Each Core::X86::Apic::ApicId[ApicId] register is preset as follows:
>> • ApicId[6] = Socket ID.
>> • ApicId[5:4] = Node ID.
>> • ApicId[3] = Logical CCX L3 complex ID
>> • ApicId[2:0]= (SMT) ? {LogicalCoreID[1:0],ThreadId} :
>>   {1'b0,LogicalCoreID[1:0]}
> 
> 
> After checking out all the patches and some pondering, the approach used
> here looks too intrusive to me for the task at hand, especially where it
> comes to generic code.
> 
> (Ignore till  to see suggestion how to simplify without reading
> reasoning behind it first)
> 
> Let's look for a way to simplify it a little bit.
> 
> So the problem we are trying to solve:
>  1: calculate APIC IDs based on cpu type (to be more specific: for EPYC
>     based CPUs)
>  2: it depends on knowing the total number of numa nodes.
>
> Externally the workflow looks like the following:
>   1. user provides -smp x,sockets,cores,...,maxcpus
>   that's used by the possible_cpu_arch_ids() singleton to build the list of
>   possible CPUs (which is available to the user via the command
>   'hotpluggable-cpus')
>
>   Hook could be called very early and possible_cpus data might be
>   not complete. It builds a list of possible CPUs which the user could
>   modify later.
>
>   2.1 user uses "-numa cpu,node-id=x,..." or legacy
>   "-numa node,node_id=x,cpus=" options to assign cpus to nodes, which is
>   one way or another calling machine_set_cpu_numa_node(). The latter updates
>   the 'possible_cpus' list with node information. It happens early, when
>   the total number of nodes is not available.
>
>   2.2 user does not provide explicit node mappings for CPUs.
>   QEMU steps in and assigns possible cpus to nodes in
>   machine_numa_finish_cpu_init() (using the same
>   machine_set_cpu_numa_node()) right before calling the board's specific
>   machine init(). At that time the total number of nodes is known.
>
> In the 1 -- 2.1 cases, 'arch_id' in the 'possible_cpus' list doesn't have
> to be defined before the board's init() is run.
>
> In the 2.2 case it calls get_default_cpu_node_id() ->
> x86_get_default_cpu_node_id(), which uses arch_id to calculate the numa node.
> But then the question is: does it have to use the APIC id, or could it infer
> the 'pkg_id' it's after from ms->possible_cpus->cpus[i].props data?

Not sure if I got the question right. In this case because the numa
information is not provided all the cpus are assigned to only one node.
The apic id is used here to get the correct pkg_id.
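
To make the pkg_id point concrete, here is a rough Python sketch of the bit layout from the PPR text quoted earlier in this mail. It only illustrates that quoted layout (one socket bit, two node bits, one CCX bit, three core/thread bits); the function and parameter names are made up for the example and are not QEMU APIs.

```python
# Illustration only: field positions follow the PPR excerpt quoted above.
# All names here are hypothetical and do not exist in QEMU.

def epyc_apic_id(socket_id, node_id, ccx_id, core_id, thread_id=0, smt=True):
    """Pack topology IDs into an APIC ID using the quoted bit layout."""
    if smt:
        # ApicId[2:0] = {LogicalCoreID[1:0], ThreadId}
        low = ((core_id & 0x3) << 1) | (thread_id & 0x1)
    else:
        # ApicId[2:0] = {1'b0, LogicalCoreID[1:0]}
        low = core_id & 0x3
    return (((socket_id & 0x1) << 6) |   # ApicId[6]   = Socket ID
            ((node_id & 0x3) << 4) |     # ApicId[5:4] = Node ID
            ((ccx_id & 0x1) << 3) |      # ApicId[3]   = Logical CCX L3 complex ID
            low)


def epyc_pkg_id(apic_id):
    """Recover the socket (package) ID, i.e. ApicId[6]."""
    return (apic_id >> 6) & 0x1


def epyc_node_id(apic_id):
    """Recover the node ID, i.e. ApicId[5:4]."""
    return (apic_id >> 4) & 0x3
```

With this layout the package ID falls out of the APIC ID with a shift and a mask, which is the property the default node assignment relies on.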

>   
> With that out of the way APIC ID will be used only during board's init(),
> so board could update possible_cpus with valid APIC IDs at the start of
> x86_cpus_init().
> 
> 
> In a nutshell, it would be much easier to do the following:
>
>  1. make x86_get_default_cpu_node_id() APIC ID independent or,
>     if that is impossible, as an alternative recompute APIC IDs there if the
>     cpu type is EPYC based (since the number of nodes is already known)
>  2. recompute APIC IDs in x86_cpus_init() if the cpu type is EPYC based
> 
> this way one doesn't need to touc

Re: [PATCH v2 03/29] python/qemu: Add binutils::binary_get_version()

2020-02-03 Thread Wainer dos Santos Moschetta




On 1/29/20 7:23 PM, Philippe Mathieu-Daudé wrote:

Add a helper to query the version of a QEMU binary.
We simply send the 'query-version' command over a QMP
socket.
Introduce the PythonQemuCoreScripts class to test our
new helper.

Signed-off-by: Philippe Mathieu-Daudé 
---
  python/qemu/binutils.py  | 38 


I'm not sure about creating the file with that name, it reminds me of
GNU Binutils rather than the QEMU binary. :)


Perhaps it could be named qemu_bin.py?

Another suggestion is to encapsulate the methods you propose in this 
series in an object. For example:


class QEMUBin:
    def __init__(self, bin_path):
        # Check bin exists.
        self.bin_path = bin_path

    def get_version(self):
        # binutils.binary_get_version() goes here.
        pass

    def get_arch(self):
        # binutils.binary_get_arch() goes here.
        pass

    def list_accel(self):
        # move accel.list_accel() method to here.
        pass

    def get_vm(self, args):
        # Return a QEMUMachine object...
        return QEMUMachine(self.bin_path, *args)

    def get_build_config_host(self):
        # Detect if self.bin_path is in a build directory,
        # attempt to read the host-config.mak and return
        # as hash. Or fail...
        pass



  tests/acceptance/core_scripts.py | 31 ++
  2 files changed, 69 insertions(+)
  create mode 100644 python/qemu/binutils.py
  create mode 100644 tests/acceptance/core_scripts.py

diff --git a/python/qemu/binutils.py b/python/qemu/binutils.py
new file mode 100644
index 00..96b200eef4
--- /dev/null
+++ b/python/qemu/binutils.py
@@ -0,0 +1,38 @@
+"""
+QEMU binary utility module:
+
+The binary utility module provides helpers to query QEMU binary for
+build-dependent configuration options at runtime.
+"""
+#
+# Copyright (c) 2020 Red Hat, Inc.
+#
+# Author:
+#  Philippe Mathieu-Daudé 
+#
+# This work is licensed under the terms of the GNU GPL, version 2 or later.
+# See the COPYING file in the top-level directory.
+#
+# SPDX-License-Identifier: GPL-2.0-or-later
+
+import logging
+
+from .machine import QEMUMachine
+
+LOG = logging.getLogger(__name__)
+
+
+def binary_get_version(qemu_bin):
+    '''
+    Get QEMU binary version
+
+    @param qemu_bin (str): path to the QEMU binary


It could check that qemu_bin file exists, otherwise raise an exception 
or return None.



+    @return binary version (dictionary with major/minor/micro keys)
+    '''
+    with QEMUMachine(qemu_bin) as vm:
+        vm.set_machine('none')
+        vm.launch()
+        res = vm.command('query-version')
+        LOG.info(res)
+        vm.shutdown()


Don't need this, the vm will be shutdown anyway (see QEMUMachine.__exit__())

Thanks!

- Wainer



+        return res['qemu']
diff --git a/tests/acceptance/core_scripts.py b/tests/acceptance/core_scripts.py
new file mode 100644
index 00..3f253337cd
--- /dev/null
+++ b/tests/acceptance/core_scripts.py
@@ -0,0 +1,31 @@
+# Tests covering various python/qemu/ scripts
+#
+# Copyright (c) 2020 Red Hat, Inc.
+#
+# Author:
+#  Philippe Mathieu-Daudé 
+#
+# This work is licensed under the terms of the GNU GPL, version 2 or later.
+# See the COPYING file in the top-level directory.
+#
+# SPDX-License-Identifier: GPL-2.0-or-later
+
+import sys
+import os
+import logging
+
+sys.path.append(os.path.join(os.path.dirname(__file__), '..', '..', 'python'))
+from avocado_qemu import Test
+from qemu.binutils import binary_get_version
+
+
+class PythonQemuCoreScripts(Test):
+
+    def test_get_version(self):
+        logger = logging.getLogger('core')
+        version = binary_get_version(self.qemu_bin)
+        logger.debug('version: {}'.format(version))
+        # QMP 'query-version' introduced with QEMU v0.14
+        self.assertGreaterEqual(version['major'], 0)
+        if version['major'] == 0:
+            self.assertGreaterEqual(version['minor'], 14)
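
On the review note that the explicit vm.shutdown() is redundant inside the with block: a minimal sketch of why, using a stand-in class. FakeMachine below is hypothetical and is not the real qemu.machine.QEMUMachine; it only mimics the context-manager behaviour referred to in the comment (shutdown on __exit__), and the version dictionary is a placeholder, not a real QMP reply.

```python
# Stand-in for illustration only; NOT the real QEMUMachine class.
class FakeMachine:
    def __init__(self):
        self.running = False
        self.shutdown_calls = 0

    def launch(self):
        self.running = True

    def shutdown(self):
        self.shutdown_calls += 1
        self.running = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        # Leaving the 'with' block shuts the machine down, so callers
        # do not need to call shutdown() themselves.
        self.shutdown()
        return False


def query_version(vm_factory):
    """Launch a machine, fetch a (placeholder) version, rely on __exit__."""
    with vm_factory() as vm:
        vm.launch()
        version = {'major': 0, 'minor': 14, 'micro': 0}  # placeholder data
    return vm, version
```

After query_version() returns, the machine has been shut down exactly once even though the body never called shutdown() itself.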

Re: [PATCH v3 02/17] tests/docker: better handle symlinked libs

2020-02-03 Thread Robert Foley

On Mon, 3 Feb 2020 at 04:09, Alex Bennée  wrote:
> Subject: [PATCH v3 02/17] tests/docker: better handle symlinked libs
>
> When we are copying we want to ensure we grab the first
> resolution (the found in path section). However even that binary might
> be a symlink so let's make sure we chase the symlinks to copy the right
> binary to where it can be found.
>
> Signed-off-by: Alex Bennée 
> Reviewed-by: Philippe Mathieu-Daudé 
> Tested-by: Philippe Mathieu-Daudé 

Reviewed-by: Robert Foley

Re: [PATCH v2 1/2] qemu-img: Add --target-is-zero to convert

2020-02-03 Thread Eric Blake


On 2/3/20 12:20 PM, Vladimir Sementsov-Ogievskiy wrote:

24.01.2020 13:34, David Edmondson wrote:

In many cases the target of a convert operation is a newly provisioned
target that the user knows is blank (filled with zeroes). In this
situation there is no requirement for qemu-img to wastefully zero out
the entire device.

Add a new option, --target-is-zero, allowing the user to indicate that
an existing target device is already zero filled.


Hi! The qemu-img.c part looks OK to me, but the rest doesn't apply on master now.


Correct. Patch 2/2 is now obsolete and no longer necessary, and patch 1 
needs some tweaks now that we don't have qemu-img.texi.




I like this thing, and I'd like to make similar option for backup job.


My followup patches to add an all-zero bit to qcow2 are also useful in 
this regard.


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Re: [PATCH 1/3] m25p80: Convert to support tracing

2020-02-03 Thread Alistair Francis

On Mon, Feb 3, 2020 at 10:10 AM Guenter Roeck  wrote:
>
> While at it, add some trace messages to help debug problems
> seen when running the latest Linux kernel.
>
> Signed-off-by: Guenter Roeck 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  hw/block/m25p80.c | 48 ---
>  hw/block/trace-events | 16 +++
>  2 files changed, 38 insertions(+), 26 deletions(-)
>
> diff --git a/hw/block/m25p80.c b/hw/block/m25p80.c
> index 11ff5b9ad7..63e050d7d3 100644
> --- a/hw/block/m25p80.c
> +++ b/hw/block/m25p80.c
> @@ -32,17 +32,7 @@
>  #include "qemu/module.h"
>  #include "qemu/error-report.h"
>  #include "qapi/error.h"
> -
> -#ifndef M25P80_ERR_DEBUG
> -#define M25P80_ERR_DEBUG 0
> -#endif
> -
> -#define DB_PRINT_L(level, ...) do { \
> -if (M25P80_ERR_DEBUG > (level)) { \
> -fprintf(stderr,  ": %s: ", __func__); \
> -fprintf(stderr, ## __VA_ARGS__); \
> -} \
> -} while (0)
> +#include "trace.h"
>
>  /* Fields for FlashPartInfo->flags */
>
> @@ -574,7 +564,8 @@ static void flash_erase(Flash *s, int offset, FlashCMD cmd)
>  abort();
>  }
>
> -DB_PRINT_L(0, "offset = %#x, len = %d\n", offset, len);
> +trace_m25p80_flash_erase(offset, len);
> +
>  if ((s->pi->flags & capa_to_assert) != capa_to_assert) {
>  qemu_log_mask(LOG_GUEST_ERROR, "M25P80: %d erase size not supported by"
>  " device\n", len);
> @@ -607,8 +598,7 @@ void flash_write8(Flash *s, uint32_t addr, uint8_t data)
>  }
>
>  if ((prev ^ data) & data) {
> -DB_PRINT_L(1, "programming zero to one! addr=%" PRIx32 "  %" PRIx8
> -   " -> %" PRIx8 "\n", addr, prev, data);
> +trace_m25p80_programming_zero_to_one(addr, prev, data);
>  }
>
>  if (s->pi->flags & EEPROM) {
> @@ -662,6 +652,9 @@ static void complete_collecting_data(Flash *s)
>
>  s->state = STATE_IDLE;
>
> +trace_m25p80_complete_collecting(s->cmd_in_progress, n, s->ear,
> + s->cur_addr);
> +
>  switch (s->cmd_in_progress) {
>  case DPP:
>  case QPP:
> @@ -825,7 +818,7 @@ static void reset_memory(Flash *s)
>  break;
>  }
>
> -DB_PRINT_L(0, "Reset done.\n");
> +trace_m25p80_reset_done();
>  }
>
>  static void decode_fast_read_cmd(Flash *s)
> @@ -941,9 +934,10 @@ static void decode_qio_read_cmd(Flash *s)
>
>  static void decode_new_cmd(Flash *s, uint32_t value)
>  {
> -s->cmd_in_progress = value;
>  int i;
> -DB_PRINT_L(0, "decoded new command:%x\n", value);
> +
> +s->cmd_in_progress = value;
> +trace_m25p80_command_decoded(value);
>
>  if (value != RESET_MEMORY) {
>  s->reset_enable = false;
> @@ -1042,7 +1036,7 @@ static void decode_new_cmd(Flash *s, uint32_t value)
>  break;
>
>  case JEDEC_READ:
> -DB_PRINT_L(0, "populated jedec code\n");
> +trace_m25p80_populated_jedec();
>  for (i = 0; i < s->pi->id_len; i++) {
>  s->data[i] = s->pi->id[i];
>  }
> @@ -1063,7 +1057,7 @@ static void decode_new_cmd(Flash *s, uint32_t value)
>  case BULK_ERASE_60:
>  case BULK_ERASE:
>  if (s->write_enable) {
> -DB_PRINT_L(0, "chip erase\n");
> +trace_m25p80_chip_erase();
>  flash_erase(s, 0, BULK_ERASE);
>  } else {
>  qemu_log_mask(LOG_GUEST_ERROR, "M25P80: chip erase with write "
> @@ -1184,7 +1178,7 @@ static int m25p80_cs(SSISlave *ss, bool select)
>  s->data_read_loop = false;
>  }
>
> -DB_PRINT_L(0, "%sselect\n", select ? "de" : "");
> +trace_m25p80_select(select ? "de" : "");
>
>  return 0;
>  }
> @@ -1194,19 +1188,20 @@ static uint32_t m25p80_transfer8(SSISlave *ss, uint32_t tx)
>  Flash *s = M25P80(ss);
>  uint32_t r = 0;
>
> +trace_m25p80_transfer(s->state, s->len, s->needed_bytes, s->pos,
> +  s->cur_addr, (uint8_t)tx);
> +
>  switch (s->state) {
>
>  case STATE_PAGE_PROGRAM:
> -DB_PRINT_L(1, "page program cur_addr=%#" PRIx32 " data=%" PRIx8 "\n",
> -   s->cur_addr, (uint8_t)tx);
> +trace_m25p80_page_program(s->cur_addr, (uint8_t)tx);
>  flash_write8(s, s->cur_addr, (uint8_t)tx);
>  s->cur_addr = (s->cur_addr + 1) & (s->size - 1);
>  break;
>
>  case STATE_READ:
>  r = s->storage[s->cur_addr];
> -DB_PRINT_L(1, "READ 0x%" PRIx32 "=%" PRIx8 "\n", s->cur_addr,
> -   (uint8_t)r);
> +trace_m25p80_read_byte(s->cur_addr, (uint8_t)r);
>  s->cur_addr = (s->cur_addr + 1) & (s->size - 1);
>  break;
>
> @@ -1244,6 +1239,7 @@ static uint32_t m25p80_transfer8(SSISlave *ss, uint32_t tx)
>  }
>
>  r = s->data[s->pos];
> +trace_m25p80_read_data(s->pos, (uint8_t)r);
>  s->pos++;
>  if (s->pos == s->len) {
>  s->pos = 0;
> @@ -1281,7 +1277,7 @@ static void m25p80_realize(SSI

Re: [PATCH 1/6] hw/arm/raspi: Use BCM2708 machine type with pre Device Tree kernels

2020-02-03 Thread Alistair Francis

On Mon, Feb 3, 2020 at 12:28 AM Philippe Mathieu-Daudé  wrote:
>
> When booting without device tree, the Linux kernel uses the $R1
> register to determine the machine type. The list of values is
> registered at [1].
>
> There are two entries for the Raspberry Pi:
>
> - https://www.arm.linux.org.uk/developer/machines/list.php?mid=3138
>   name: MACH_TYPE_BCM2708
>   value: 0xc42 (3138)
>   status: Active, not mainlined
>   date: 15 Oct 2010
>
> - https://www.arm.linux.org.uk/developer/machines/list.php?mid=4828
>   name: MACH_TYPE_BCM2835
>   value: 4828
>   status: Active, mainlined
>   date: 6 Dec 2013
>
> QEMU always used the non-mainlined type MACH_TYPE_BCM2708.
> The value 0xc43 is registered to 'MX51_GGC' (processor i.MX51), and
> 0xc44 to 'Western Digital Sharespace NAS' (processor Marvell 88F5182).
>
> The Raspberry Pi foundation bootloader only sets the BCM2708 machine
> type, see [2] or [3]:
>
>  133 9:
>  134 mov r0, #0
>  135 ldr r1, =3138   @ BCM2708 machine id
>  136 ldr r2, atags   @ ATAGS
>  137 bx  r4
>
> U-Boot only uses MACH_TYPE_BCM2708 (see [4]):
>
>  25 /*
>  26  * 2835 is a SKU in a series for which the 2708 is the first or primary SoC,
>  27  * so 2708 has historically been used rather than a dedicated 2835 ID.
>  28  *
>  29  * We don't define a machine type for bcm2709/bcm2836 since the RPi Foundation
>  30  * chose to use someone else's previously registered machine ID (3139, MX51_GGC)
>  31  * rather than obtaining a valid ID:-/
>  32  *
>  33  * For the bcm2837, hopefully a machine type is not needed, since everything
>  34  * is DT.
>  35  */
>
> While the definition MACH_BCM2709 with value 0xc43 was introduced in
> a commit described "Add 2709 platform for Raspberry Pi 2" out of the
> mainline Linux kernel, it does not seem used, and the platform is
> introduced with Device Tree support anyway (see [5] and [6]).
>
> Remove the unused values (0xc43 introduced in commit 1df7d1f9303aef
> "raspi: add raspberry pi 2 machine" and 0xc44 in commit bade58166f4
> "raspi: Raspberry Pi 3 support"), keeping only MACH_TYPE_BCM2708.
>
> [1] https://www.arm.linux.org.uk/developer/machines/
> [2] 
> https://github.com/raspberrypi/tools/blob/920c7ed2e/armstubs/armstub7.S#L135
> [3] 
> https://github.com/raspberrypi/tools/blob/49719d554/armstubs/armstub7.S#L64
> [4] 
> https://gitlab.denx.de/u-boot/u-boot/blob/v2015.04/include/configs/rpi-common.h#L18
> [5] 
> https://github.com/raspberrypi/linux/commit/d9fac63adac#diff-6722037d79570df5b392a49e0e006573R526
> [6] 
> http://lists.infradead.org/pipermail/linux-rpi-kernel/2015-February/001268.html
>
> Signed-off-by: Philippe Mathieu-Daudé 

Reviewed-by: Alistair Francis 

Alistair

> ---
> Cc: Zoltán Baldaszti 
> Cc: Pekka Enberg 
> Cc: Stephen Warren 
> Cc: Kshitij Soni 
> Cc: Michael Chan 
> Cc: Andrew Baumann 
> Cc: Jeremy Linton 
> Cc: Pete Batard 
> ---
>  hw/arm/raspi.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/hw/arm/raspi.c b/hw/arm/raspi.c
> index 3996f6c63a..ef76a27f33 100644
> --- a/hw/arm/raspi.c
> +++ b/hw/arm/raspi.c
> @@ -29,8 +29,7 @@
>  #define FIRMWARE_ADDR_3 0x8 /* Pi 3 loads kernel.img here by default */
>  #define SPINTABLE_ADDR  0xd8 /* Pi 3 bootloader spintable */
>
> -/* Table of Linux board IDs for different Pi versions */
> -static const int raspi_boardid[] = {[1] = 0xc42, [2] = 0xc43, [3] = 0xc44};
> +#define MACH_TYPE_BCM2708   3138 /* Linux board IDs */
>
>  typedef struct RasPiState {
>  BCM283XState soc;
> @@ -116,7 +115,7 @@ static void setup_boot(MachineState *machine, int version, size_t ram_size)
>  static struct arm_boot_info binfo;
>  int r;
>
> -binfo.board_id = raspi_boardid[version];
> +binfo.board_id = MACH_TYPE_BCM2708;
>  binfo.ram_size = ram_size;
>  binfo.nb_cpus = machine->smp.cpus;
>
> --
> 2.21.1
>
>
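
As a quick sanity check of the IDs discussed in the commit message, the hex and decimal spellings can be compared with a few lines of Python. The values are copied from the text above; the dictionary and function names are only for this example and do not correspond to anything in QEMU.

```python
# Machine IDs as quoted in the commit message (names are illustrative).
MACH_TYPES = {
    'BCM2708': 0xc42,
    'MX51_GGC': 0xc43,
    'Western Digital Sharespace NAS': 0xc44,
}

# Board ID table removed by the patch, keyed by Pi version.
OLD_BOARD_IDS = {1: 0xc42, 2: 0xc43, 3: 0xc44}


def new_board_id(version):
    """After the patch every Pi version reports MACH_TYPE_BCM2708."""
    return MACH_TYPES['BCM2708']


for name, value in MACH_TYPES.items():
    print(f"{name}: {value:#x} = {value}")
```

This confirms that 0xc42 is 3138 (the value loaded by the armstub) and that the old entries for versions 2 and 3 pointed at IDs registered to other boards.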

[PATCH v1 12/13] util: oslib: Resizable anonymous allocations under POSIX

2020-02-03 Thread David Hildenbrand

Introduce qemu_anon_ram_alloc_resizable() and qemu_anon_ram_resize().
Implement them under POSIX and make them return NULL under WIN32.

Under POSIX, we make use of resizable mmaps. An implementation under
WIN32 is theoretically possible AFAIK and can be added later.

In qemu_anon_ram_free(), rename the size parameter to max_size, to make
it clearer that we have to use the maximum size when freeing resizable
anonymous allocations.

Cc: Richard Henderson 
Cc: Paolo Bonzini 
Cc: "Dr. David Alan Gilbert" 
Cc: Eduardo Habkost 
Cc: Marcel Apfelbaum 
Cc: Stefan Weil 
Signed-off-by: David Hildenbrand 
---
 include/qemu/osdep.h |  6 +-
 util/oslib-posix.c   | 37 ++---
 util/oslib-win32.c   | 14 ++
 util/trace-events    |  4 +++-
 4 files changed, 56 insertions(+), 5 deletions(-)

diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h
index 9bd3dcfd13..57b7f40f56 100644
--- a/include/qemu/osdep.h
+++ b/include/qemu/osdep.h
@@ -311,8 +311,12 @@ int qemu_daemon(int nochdir, int noclose);
 void *qemu_try_memalign(size_t alignment, size_t size);
 void *qemu_memalign(size_t alignment, size_t size);
 void *qemu_anon_ram_alloc(size_t size, uint64_t *align, bool shared);
+void *qemu_anon_ram_alloc_resizable(size_t size, size_t max_size,
+uint64_t *align, bool shared);
+void *qemu_anon_ram_resize(void *ptr, size_t old_size, size_t new_size,
+   bool shared);
 void qemu_vfree(void *ptr);
-void qemu_anon_ram_free(void *ptr, size_t size);
+void qemu_anon_ram_free(void *ptr, size_t max_size);
 
 #define QEMU_MADV_INVALID -1
 
diff --git a/util/oslib-posix.c b/util/oslib-posix.c
index 5a291cc982..e487a0e2c2 100644
--- a/util/oslib-posix.c
+++ b/util/oslib-posix.c
@@ -219,16 +219,47 @@ void *qemu_anon_ram_alloc(size_t size, uint64_t *alignment, bool shared)
 return ptr;
 }
 
+void *qemu_anon_ram_alloc_resizable(size_t size, size_t max_size,
+                                    uint64_t *alignment, bool shared)
+{
+    size_t align = QEMU_VMALLOC_ALIGN;
+    void *ptr = qemu_ram_mmap_resizable(-1, size, max_size, align, shared,
+                                        false);
+
+    if (ptr == MAP_FAILED) {
+        return NULL;
+    }
+
+    if (alignment) {
+        *alignment = align;
+    }
+
+    trace_qemu_anon_ram_alloc_resizable(size, max_size, ptr);
+    return ptr;
+}
+
+void *qemu_anon_ram_resize(void *ptr, size_t old_size, size_t new_size,
+                           bool shared)
+{
+    ptr = qemu_ram_mmap_resize(ptr, -1, old_size, new_size, shared, false);
+    if (ptr == MAP_FAILED) {
+        return NULL;
+    }
+
+    trace_qemu_anon_ram_resize(old_size, new_size, ptr);
+    return ptr;
+}
+
 void qemu_vfree(void *ptr)
 {
 trace_qemu_vfree(ptr);
 free(ptr);
 }
 
-void qemu_anon_ram_free(void *ptr, size_t size)
+void qemu_anon_ram_free(void *ptr, size_t max_size)
 {
-trace_qemu_anon_ram_free(ptr, size);
-qemu_ram_munmap(-1, ptr, size);
+trace_qemu_anon_ram_free(ptr, max_size);
+qemu_ram_munmap(-1, ptr, max_size);
 }
 
 void qemu_set_block(int fd)
diff --git a/util/oslib-win32.c b/util/oslib-win32.c
index e9b14ab178..caec028041 100644
--- a/util/oslib-win32.c
+++ b/util/oslib-win32.c
@@ -90,6 +90,20 @@ void *qemu_anon_ram_alloc(size_t size, uint64_t *align, bool shared)
 return ptr;
 }
 
+void *qemu_anon_ram_alloc_resizable(size_t size, size_t max_size,
+uint64_t *align, bool shared)
+{
+/* resizable ram not implemented yet */
+return NULL;
+}
+
+void *qemu_anon_ram_resize(void *ptr, size_t old_size, size_t new_size,
+   bool shared)
+{
+/* resizable ram not implemented yet */
+return NULL;
+}
+
 void qemu_vfree(void *ptr)
 {
 trace_qemu_vfree(ptr);
diff --git a/util/trace-events b/util/trace-events
index 226f406c46..05ec1eb9f3 100644
--- a/util/trace-events
+++ b/util/trace-events
@@ -46,8 +46,10 @@ qemu_co_mutex_unlock_return(void *mutex, void *self) "mutex %p self %p"
 # oslib-posix.c
 qemu_memalign(size_t alignment, size_t size, void *ptr) "alignment %zu size %zu ptr %p"
 qemu_anon_ram_alloc(size_t size, void *ptr) "size %zu ptr %p"
+qemu_anon_ram_alloc_resizable(size_t size, size_t max_size, void *ptr) "size %zu max_size %zu ptr %p"
+qemu_anon_ram_resize(size_t old_size, size_t new_size, void *ptr) "old_size %zu new_size %zu ptr %p"
 qemu_vfree(void *ptr) "ptr %p"
-qemu_anon_ram_free(void *ptr, size_t size) "ptr %p size %zu"
+qemu_anon_ram_free(void *ptr, size_t max_size) "ptr %p max_size %zu"
 
 # hbitmap.c
 hbitmap_iter_skip_words(const void *hb, void *hbi, uint64_t pos, unsigned long cur) "hb %p hbi %p pos %"PRId64" cur 0x%lx"
-- 
2.24.1
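
For readers skimming the API, the calling convention introduced above can be modelled in a few lines of Python. This is a toy model of the contract only (allocate with size and max_size, resize from old_size to new_size, free with max_size); it does not show how qemu_ram_mmap_resizable() works, and the assumption that a resize beyond max_size fails is mine, not something this patch states.

```python
class ToyResizableAlloc:
    """Toy model of a resizable anonymous allocation (not real memory)."""

    def __init__(self, size, max_size):
        if size > max_size:
            raise ValueError("size must not exceed max_size")
        self.size = size
        self.max_size = max_size
        self.freed = False

    def resize(self, old_size, new_size):
        # Assumption for this sketch: a stale old_size or a request
        # beyond max_size fails; failure is reported as None, like the
        # NULL return in the patch.
        if self.freed or old_size != self.size or new_size > self.max_size:
            return None
        self.size = new_size
        return self

    def free(self, max_size):
        # Freeing always covers the maximum size, not the current size.
        if max_size != self.max_size:
            raise ValueError("free() must be given max_size")
        self.freed = True
```

The point of the renamed parameter in qemu_anon_ram_free() shows up in free(): the caller passes the maximum size, whatever the current size happens to be.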

[PATCH v1 11/13] util: vfio-helpers: Implement ram_block_resized()

2020-02-03 Thread David Hildenbrand

Let's implement ram_block_resized().

Note: Resizing is currently only allowed during reboot or when migration
starts.

Cc: "Dr. David Alan Gilbert" 
Cc: Paolo Bonzini 
Cc: Markus Armbruster 
Cc: Alex Williamson 
Signed-off-by: David Hildenbrand 
---
 util/vfio-helpers.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/util/vfio-helpers.c b/util/vfio-helpers.c
index 71e02e7f35..57d77e9480 100644
--- a/util/vfio-helpers.c
+++ b/util/vfio-helpers.c
@@ -395,11 +395,24 @@ static void qemu_vfio_ram_block_removed(RAMBlockNotifier *n,
 }
 }
 
+static void qemu_vfio_ram_block_resized(RAMBlockNotifier *n, void *host,
+                                        size_t oldsize, size_t newsize)
+{
+    QEMUVFIOState *s = container_of(n, QEMUVFIOState, ram_notifier);
+    if (host) {
+        trace_qemu_vfio_ram_block_resized(s, host, oldsize, newsize);
+        /* Note: Not atomic - we need a new ioctl for that. */
+        qemu_vfio_ram_block_removed(n, host, oldsize);
+        qemu_vfio_ram_block_added(n, host, newsize);
+    }
+}
+
 static void qemu_vfio_open_common(QEMUVFIOState *s)
 {
 qemu_mutex_init(&s->lock);
 s->ram_notifier.ram_block_added = qemu_vfio_ram_block_added;
 s->ram_notifier.ram_block_removed = qemu_vfio_ram_block_removed;
+s->ram_notifier.ram_block_resized = qemu_vfio_ram_block_resized;
 s->low_water_mark = QEMU_VFIO_IOVA_MIN;
 s->high_water_mark = QEMU_VFIO_IOVA_MAX;
 ram_block_notifier_add(&s->ram_notifier);
-- 
2.24.1
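
The resize handling in this patch boils down to "remove the old mapping, then add the new one". A small Python model of that sequence, with a hypothetical ToyVfioState standing in for QEMUVFIOState (nothing here is real VFIO or QEMU code):

```python
class ToyVfioState:
    """Hypothetical stand-in that tracks host address -> mapped size."""

    def __init__(self):
        self.mappings = {}
        self.events = []

    def ram_block_added(self, host, size):
        self.mappings[host] = size
        self.events.append(('added', host, size))

    def ram_block_removed(self, host, size):
        self.mappings.pop(host, None)
        self.events.append(('removed', host, size))

    def ram_block_resized(self, host, oldsize, newsize):
        if host:
            # Not atomic: between the two calls the block is unmapped.
            self.ram_block_removed(host, oldsize)
            self.ram_block_added(host, newsize)
```

The event list makes the non-atomic window visible: a 'removed' entry always precedes the matching 'added' entry for a resize.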

[PATCH v1 13/13] exec: Ram blocks with resizable anonymous allocations under POSIX

2020-02-03 Thread David Hildenbrand

We can now make use of resizable anonymous allocations to implement
actually resizable ram blocks. Resizable anonymous allocations are
not implemented under WIN32 yet and are not available when using
alternative allocators. Fall back to the existing handling.

We also have to fall back to the existing handling in case any ram block
notifier does not support resizing (esp., AMD SEV, HAX) yet. Remember
in RAM_RESIZEABLE_ALLOC if we are using resizable anonymous allocations.

As the mmap()-hackery will invalidate some madvise settings, we have to
re-apply them after resizing. After resizing, notify the ram block
notifiers.

The benefit of actually resizable ram blocks is that e.g., under Linux,
only the actual size will be reserved (even if
"/proc/sys/vm/overcommit_memory" is set to "never"). Additional memory will
be reserved when trying to resize, which allows having ram blocks that
start small but can theoretically grow very large.

Cc: Richard Henderson 
Cc: Paolo Bonzini 
Cc: "Dr. David Alan Gilbert" 
Cc: Eduardo Habkost 
Cc: Marcel Apfelbaum 
Cc: Stefan Weil 
Signed-off-by: David Hildenbrand 
---
 exec.c                    | 68 +++
 hw/core/numa.c            | 10 --
 include/exec/cpu-common.h |  2 ++
 include/exec/memory.h     |  8 +
 4 files changed, 79 insertions(+), 9 deletions(-)

diff --git a/exec.c b/exec.c
index fc65c4f7ca..a59d1efde3 100644
--- a/exec.c
+++ b/exec.c
@@ -2053,6 +2053,16 @@ void qemu_ram_unset_migratable(RAMBlock *rb)
 rb->flags &= ~RAM_MIGRATABLE;
 }
 
+bool qemu_ram_is_resizable(RAMBlock *rb)
+{
+    return rb->flags & RAM_RESIZEABLE;
+}
+
+bool qemu_ram_is_resizable_alloc(RAMBlock *rb)
+{
+    return rb->flags & RAM_RESIZEABLE_ALLOC;
+}
+
 /* Called with iothread lock held.  */
 void qemu_ram_set_idstr(RAMBlock *new_block, const char *name, DeviceState *dev)
 {
@@ -2139,6 +2149,8 @@ static void qemu_ram_apply_settings(void *host, size_t length)
  */
 int qemu_ram_resize(RAMBlock *block, ram_addr_t newsize, Error **errp)
 {
+const uint64_t oldsize = block->used_length;
+
 assert(block);
 
 newsize = HOST_PAGE_ALIGN(newsize);
@@ -2147,7 +2159,7 @@ int qemu_ram_resize(RAMBlock *block, ram_addr_t newsize, Error **errp)
 return 0;
 }
 
-if (!(block->flags & RAM_RESIZEABLE)) {
+if (!qemu_ram_is_resizable(block)) {
 error_setg_errno(errp, EINVAL,
  "Length mismatch: %s: 0x" RAM_ADDR_FMT
  " in != 0x" RAM_ADDR_FMT, block->idstr,
@@ -2163,10 +2175,26 @@ int qemu_ram_resize(RAMBlock *block, ram_addr_t newsize, Error **errp)
 return -EINVAL;
 }
 
+if (qemu_ram_is_resizable_alloc(block)) {
+g_assert(ram_block_notifiers_support_resize());
+if (qemu_anon_ram_resize(block->host, block->used_length,
+ newsize, block->flags & RAM_SHARED) == NULL) {
+error_setg_errno(errp, -ENOMEM,
+ "Could not allocate enough memory.");
+return -ENOMEM;
+}
+}
+
 cpu_physical_memory_clear_dirty_range(block->offset, block->used_length);
 block->used_length = newsize;
 cpu_physical_memory_set_dirty_range(block->offset, block->used_length,
 DIRTY_CLIENTS_ALL);
+if (block->host && qemu_ram_is_resizable_alloc(block)) {
+/* re-apply settings that might have been overridden by the resize */
+qemu_ram_apply_settings(block->host, block->max_length);
+ram_block_notify_resized(block->host, oldsize, block->used_length);
+}
+
 memory_region_set_size(block->mr, newsize);
 if (block->resized) {
 block->resized(block->idstr, newsize, block->host);
@@ -2249,6 +2277,28 @@ static void dirty_memory_extend(ram_addr_t old_ram_size,
 }
 }
 
+static void ram_block_alloc_ram(RAMBlock *rb)
+{
+const bool shared = qemu_ram_is_shared(rb);
+
+/*
+ * If we can, try to allocate actually resizable ram. Will also fail
+ * if qemu_anon_ram_alloc_resizable() is not implemented.
+ */
+if (phys_mem_alloc == qemu_anon_ram_alloc &&
+qemu_ram_is_resizable(rb) &&
+ram_block_notifiers_support_resize()) {
+rb->host = qemu_anon_ram_alloc_resizable(rb->used_length,
+ rb->max_length, 
&rb->mr->align,
+ shared);
+if (rb->host) {
+rb->flags |= RAM_RESIZEABLE_ALLOC;
+return;
+}
+}
+rb->host = phys_mem_alloc(rb->max_length, &rb->mr->align, shared);
+}
+
 static void ram_block_add(RAMBlock *new_block, Error **errp)
 {
 RAMBlock *block;
@@ -2271,9 +2321,7 @@ static void ram_block_add(RAMBlock *new_block, Error 
**errp)
 return;
 }
 } else {
-new_block->host = phys_mem_alloc(new_block->max_length,
- &new_block->mr->al

[PATCH v1 04/13] exec: Drop "shared" parameter from ram_block_add()

2020-02-03 Thread David Hildenbrand

Properly store it in the flags of the ram block instead (the flag
already exists and is in use).

E.g., qemu_ram_is_shared() now properly succeeds on all ram blocks that are
actually shared.

Cc: Richard Henderson 
Cc: Paolo Bonzini 
Signed-off-by: David Hildenbrand 
---
 exec.c | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/exec.c b/exec.c
index f7525867ec..fc65c4f7ca 100644
--- a/exec.c
+++ b/exec.c
@@ -2249,7 +2249,7 @@ static void dirty_memory_extend(ram_addr_t old_ram_size,
 }
 }
 
-static void ram_block_add(RAMBlock *new_block, Error **errp, bool shared)
+static void ram_block_add(RAMBlock *new_block, Error **errp)
 {
 RAMBlock *block;
 RAMBlock *last_block = NULL;
@@ -2272,7 +2272,8 @@ static void ram_block_add(RAMBlock *new_block, Error 
**errp, bool shared)
 }
 } else {
 new_block->host = phys_mem_alloc(new_block->max_length,
- &new_block->mr->align, shared);
+ &new_block->mr->align,
+ qemu_ram_is_shared(new_block));
 if (!new_block->host) {
 error_setg_errno(errp, errno,
  "cannot set up guest memory '%s'",
@@ -2376,7 +2377,7 @@ RAMBlock *qemu_ram_alloc_from_fd(ram_addr_t size, 
MemoryRegion *mr,
 return NULL;
 }
 
-ram_block_add(new_block, &local_err, ram_flags & RAM_SHARED);
+ram_block_add(new_block, &local_err);
 if (local_err) {
 g_free(new_block);
 error_propagate(errp, local_err);
@@ -2438,10 +2439,13 @@ RAMBlock *qemu_ram_alloc_internal(ram_addr_t size, 
ram_addr_t max_size,
 if (host) {
 new_block->flags |= RAM_PREALLOC;
 }
+if (share) {
+new_block->flags |= RAM_SHARED;
+}
 if (resizeable) {
 new_block->flags |= RAM_RESIZEABLE;
 }
-ram_block_add(new_block, &local_err, share);
+ram_block_add(new_block, &local_err);
 if (local_err) {
 g_free(new_block);
 error_propagate(errp, local_err);
-- 
2.24.1

[PATCH v1 06/13] util/mmap-alloc: Factor out reserving of a memory region to mmap_reserve()

2020-02-03 Thread David Hildenbrand

We want to reserve a memory region without actually populating memory.
Let's factor that out.

Cc: "Michael S. Tsirkin" 
Cc: Greg Kurz 
Cc: Murilo Opsfelder Araujo 
Cc: Eduardo Habkost 
Cc: "Dr. David Alan Gilbert" 
Signed-off-by: David Hildenbrand 
---
 util/mmap-alloc.c | 58 +++
 1 file changed, 33 insertions(+), 25 deletions(-)

diff --git a/util/mmap-alloc.c b/util/mmap-alloc.c
index 82f02a2cec..43a26f38a8 100644
--- a/util/mmap-alloc.c
+++ b/util/mmap-alloc.c
@@ -82,6 +82,38 @@ size_t qemu_mempath_getpagesize(const char *mem_path)
 return qemu_real_host_page_size;
 }
 
+/*
+ * Reserve a new memory region of the requested size to be used for mapping
+ * from the given fd (if any).
+ */
+static void *mmap_reserve(size_t size, int fd)
+{
+int flags = MAP_PRIVATE;
+
+#if defined(__powerpc64__) && defined(__linux__)
+/*
+ * On ppc64 mappings in the same segment (aka slice) must share the same
+ * page size. Since we will be re-allocating part of this segment
+ * from the supplied fd, we should make sure to use the same page size, to
+ * this end we mmap the supplied fd.  In this case, set MAP_NORESERVE to
+ * avoid allocating backing store memory.
+ * We do this unless we are using the system page size, in which case
+ * anonymous memory is OK.
+ */
+if (fd == -1 || qemu_fd_getpagesize(fd) == qemu_real_host_page_size) {
+fd = -1;
+flags |= MAP_ANONYMOUS;
+} else {
+flags |= MAP_NORESERVE;
+}
+#else
+fd = -1;
+flags |= MAP_ANONYMOUS;
+#endif
+
+return mmap(0, size, PROT_NONE, flags, fd, 0);
+}
+
 static inline size_t mmap_pagesize(int fd)
 {
 #if defined(__powerpc64__) && defined(__linux__)
@@ -101,7 +133,6 @@ void *qemu_ram_mmap(int fd,
 const size_t pagesize = mmap_pagesize(fd);
 int flags;
 int map_sync_flags = 0;
-int guardfd;
 size_t offset;
 size_t total;
 void *guardptr;
@@ -113,30 +144,7 @@ void *qemu_ram_mmap(int fd,
  */
 total = size + align;
 
-#if defined(__powerpc64__) && defined(__linux__)
-/* On ppc64 mappings in the same segment (aka slice) must share the same
- * page size. Since we will be re-allocating part of this segment
- * from the supplied fd, we should make sure to use the same page size, to
- * this end we mmap the supplied fd.  In this case, set MAP_NORESERVE to
- * avoid allocating backing store memory.
- * We do this unless we are using the system page size, in which case
- * anonymous memory is OK.
- */
-flags = MAP_PRIVATE;
-if (fd == -1 || pagesize == qemu_real_host_page_size) {
-guardfd = -1;
-flags |= MAP_ANONYMOUS;
-} else {
-guardfd = fd;
-flags |= MAP_NORESERVE;
-}
-#else
-guardfd = -1;
-flags = MAP_PRIVATE | MAP_ANONYMOUS;
-#endif
-
-guardptr = mmap(0, total, PROT_NONE, flags, guardfd, 0);
-
+guardptr = mmap_reserve(total, fd);
 if (guardptr == MAP_FAILED) {
 return MAP_FAILED;
 }
-- 
2.24.1

[PATCH v1 10/13] numa: Introduce ram_block_notify_resized() and ram_block_notifiers_support_resize()

2020-02-03 Thread David Hildenbrand

We want to actually resize ram blocks (make everything between
used_length and max_length inaccessible) - however, not all ram block
notifiers will support that.

So introduce a way to detect if any registered notifier does not support
it and add a way to notify all notifiers that support it.

Using ram_block_notifiers_support_resize(), we can keep the existing
handling for these special cases until they implement support (e.g.,
SEV, HAX) to resize.
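
A minimal, standalone sketch of the "all notifiers must support resize"
check described above. It does not use QEMU's QLIST or RAMBlockNotifier;
SketchNotifier and the sketch_* functions are hypothetical stand-ins, and
the optional callback is modelled as a function pointer that may be NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a ram block notifier with an optional callback. */
typedef struct SketchNotifier SketchNotifier;
struct SketchNotifier {
    /* NULL means: this notifier cannot handle resizes. */
    void (*resized)(SketchNotifier *n, void *host,
                    size_t oldsize, size_t newsize);
    size_t last_newsize;
    SketchNotifier *next;
};

/* True only if every registered notifier implements the resize callback. */
static bool sketch_all_support_resize(const SketchNotifier *head)
{
    for (; head; head = head->next) {
        if (!head->resized) {
            return false;
        }
    }
    return true;
}

/* Notify everybody; callers are expected to have checked support first. */
static void sketch_notify_resized(SketchNotifier *head, void *host,
                                  size_t oldsize, size_t newsize)
{
    SketchNotifier *n;

    assert(sketch_all_support_resize(head));
    for (n = head; n; n = n->next) {
        n->resized(n, host, oldsize, newsize);
    }
}

/* Example callback that only remembers the most recent new size. */
static void sketch_record_resize(SketchNotifier *n, void *host,
                                 size_t oldsize, size_t newsize)
{
    (void)host;
    (void)oldsize;
    n->last_newsize = newsize;
}
```

Note that an empty list counts as "supported" here, which matches the
loop in ram_block_notifiers_support_resize().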

Cc: Richard Henderson 
Cc: Paolo Bonzini 
Cc: "Dr. David Alan Gilbert" 
Cc: Eduardo Habkost 
Cc: Marcel Apfelbaum 
Signed-off-by: David Hildenbrand 
---
 hw/core/numa.c | 21 +
 include/exec/ramlist.h |  4 
 util/trace-events  |  1 +
 3 files changed, 26 insertions(+)

diff --git a/hw/core/numa.c b/hw/core/numa.c
index fed4046680..5ccfcbcd41 100644
--- a/hw/core/numa.c
+++ b/hw/core/numa.c
@@ -940,3 +940,24 @@ void ram_block_notify_remove(void *host, size_t size)
 notifier->ram_block_removed(notifier, host, size);
 }
 }
+
+void ram_block_notify_resized(void *host, size_t oldsize, size_t newsize)
+{
+RAMBlockNotifier *notifier;
+
+QLIST_FOREACH(notifier, &ram_list.ramblock_notifiers, next) {
+notifier->ram_block_resized(notifier, host, oldsize, newsize);
+}
+}
+
+bool ram_block_notifiers_support_resize(void)
+{
+RAMBlockNotifier *notifier;
+
+QLIST_FOREACH(notifier, &ram_list.ramblock_notifiers, next) {
+if (!notifier->ram_block_resized) {
+return false;
+}
+}
+return true;
+}
diff --git a/include/exec/ramlist.h b/include/exec/ramlist.h
index bc4faa1b00..33a380cbee 100644
--- a/include/exec/ramlist.h
+++ b/include/exec/ramlist.h
@@ -67,6 +67,8 @@ void qemu_mutex_unlock_ramlist(void);
 struct RAMBlockNotifier {
 void (*ram_block_added)(RAMBlockNotifier *n, void *host, size_t size);
 void (*ram_block_removed)(RAMBlockNotifier *n, void *host, size_t size);
+void (*ram_block_resized)(RAMBlockNotifier *n, void *host, size_t oldsize,
+  size_t newsize);
 QLIST_ENTRY(RAMBlockNotifier) next;
 };
 
@@ -74,6 +76,8 @@ void ram_block_notifier_add(RAMBlockNotifier *n);
 void ram_block_notifier_remove(RAMBlockNotifier *n);
 void ram_block_notify_add(void *host, size_t size);
 void ram_block_notify_remove(void *host, size_t size);
+void ram_block_notify_resized(void *host, size_t oldsize, size_t newsize);
+bool ram_block_notifiers_support_resize(void);
 
 void ram_block_dump(Monitor *mon);
 
diff --git a/util/trace-events b/util/trace-events
index 83b6639018..226f406c46 100644
--- a/util/trace-events
+++ b/util/trace-events
@@ -76,6 +76,7 @@ qemu_mutex_unlock(void *mutex, const char *file, const int 
line) "released mutex
 qemu_vfio_dma_reset_temporary(void *s) "s %p"
 qemu_vfio_ram_block_added(void *s, void *p, size_t size) "s %p host %p size 
0x%zx"
 qemu_vfio_ram_block_removed(void *s, void *p, size_t size) "s %p host %p size 
0x%zx"
+qemu_vfio_ram_block_resized(void *s, void *p, size_t oldsize, size_t newsize) 
"s %p host %p oldsize 0x%zx newsize 0x%zx"
 qemu_vfio_find_mapping(void *s, void *p) "s %p host %p"
 qemu_vfio_new_mapping(void *s, void *host, size_t size, int index, uint64_t 
iova) "s %p host %p size %zu index %d iova 0x%"PRIx64
 qemu_vfio_do_mapping(void *s, void *host, size_t size, uint64_t iova) "s %p 
host %p size %zu iova 0x%"PRIx64
-- 
2.24.1

[PATCH v1 05/13] util/mmap-alloc: Factor out calculation of pagesize to mmap_pagesize()

2020-02-03 Thread David Hildenbrand

Factor it out and add a comment.

Cc: "Michael S. Tsirkin" 
Cc: Murilo Opsfelder Araujo 
Cc: Greg Kurz 
Cc: Eduardo Habkost 
Cc: "Dr. David Alan Gilbert" 
Signed-off-by: David Hildenbrand 
---
 util/mmap-alloc.c | 21 -
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/util/mmap-alloc.c b/util/mmap-alloc.c
index 27dcccd8ec..82f02a2cec 100644
--- a/util/mmap-alloc.c
+++ b/util/mmap-alloc.c
@@ -82,17 +82,27 @@ size_t qemu_mempath_getpagesize(const char *mem_path)
 return qemu_real_host_page_size;
 }
 
+static inline size_t mmap_pagesize(int fd)
+{
+#if defined(__powerpc64__) && defined(__linux__)
+/* Mappings in the same segment must share the same page size */
+return qemu_fd_getpagesize(fd);
+#else
+return qemu_real_host_page_size;
+#endif
+}
+
 void *qemu_ram_mmap(int fd,
 size_t size,
 size_t align,
 bool shared,
 bool is_pmem)
 {
+const size_t pagesize = mmap_pagesize(fd);
 int flags;
 int map_sync_flags = 0;
 int guardfd;
 size_t offset;
-size_t pagesize;
 size_t total;
 void *guardptr;
 void *ptr;
@@ -113,7 +123,6 @@ void *qemu_ram_mmap(int fd,
  * anonymous memory is OK.
  */
 flags = MAP_PRIVATE;
-pagesize = qemu_fd_getpagesize(fd);
 if (fd == -1 || pagesize == qemu_real_host_page_size) {
 guardfd = -1;
 flags |= MAP_ANONYMOUS;
@@ -123,7 +132,6 @@ void *qemu_ram_mmap(int fd,
 }
 #else
 guardfd = -1;
-pagesize = qemu_real_host_page_size;
 flags = MAP_PRIVATE | MAP_ANONYMOUS;
 #endif
 
@@ -198,15 +206,10 @@ void *qemu_ram_mmap(int fd,
 
 void qemu_ram_munmap(int fd, void *ptr, size_t size)
 {
-size_t pagesize;
+const size_t pagesize = mmap_pagesize(fd);
 
 if (ptr) {
 /* Unmap both the RAM block and the guard page */
-#if defined(__powerpc64__) && defined(__linux__)
-pagesize = qemu_fd_getpagesize(fd);
-#else
-pagesize = qemu_real_host_page_size;
-#endif
 munmap(ptr, size + pagesize);
 }
 }
-- 
2.24.1

[PATCH v1 02/13] exec: Factor out setting ram settings (madvise ...) into qemu_ram_apply_settings()

2020-02-03 Thread David Hildenbrand

Factor all settings out into qemu_ram_apply_settings().

For memory_try_enable_merging(), the important bit is that it won't be
called with XEN - which is still the case, as new_block->host will
remain NULL.

Cc: Richard Henderson 
Cc: Paolo Bonzini 
Signed-off-by: David Hildenbrand 
---
 exec.c | 15 ++-
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/exec.c b/exec.c
index 05cfe868ab..31a462a7d3 100644
--- a/exec.c
+++ b/exec.c
@@ -2121,6 +2121,15 @@ static int memory_try_enable_merging(void *addr, size_t 
len)
 return qemu_madvise(addr, len, QEMU_MADV_MERGEABLE);
 }
 
+static void qemu_ram_apply_settings(void *host, size_t length)
+{
+memory_try_enable_merging(host, length);
+qemu_ram_setup_dump(host, length);
+qemu_madvise(host, length, QEMU_MADV_HUGEPAGE);
+/* MADV_DONTFORK is also needed by KVM in absence of synchronous MMU */
+qemu_madvise(host, length, QEMU_MADV_DONTFORK);
+}
+
 /* Only legal before guest might have detected the memory size: e.g. on
  * incoming migration, or right after reset.
  *
@@ -2271,7 +2280,6 @@ static void ram_block_add(RAMBlock *new_block, Error 
**errp, bool shared)
 qemu_mutex_unlock_ramlist();
 return;
 }
-memory_try_enable_merging(new_block->host, new_block->max_length);
 }
 }
 
@@ -2309,10 +2317,7 @@ static void ram_block_add(RAMBlock *new_block, Error 
**errp, bool shared)
 DIRTY_CLIENTS_ALL);
 
 if (new_block->host) {
-qemu_ram_setup_dump(new_block->host, new_block->max_length);
-qemu_madvise(new_block->host, new_block->max_length, 
QEMU_MADV_HUGEPAGE);
-/* MADV_DONTFORK is also needed by KVM in absence of synchronous MMU */
-qemu_madvise(new_block->host, new_block->max_length, 
QEMU_MADV_DONTFORK);
+qemu_ram_apply_settings(new_block->host, new_block->max_length);
 ram_block_notify_add(new_block->host, new_block->max_length);
 }
 }
-- 
2.24.1

[PATCH v1 09/13] util/mmap-alloc: Implement resizable mmaps

2020-02-03 Thread David Hildenbrand

Implement resizable mmaps. For now, the actual resizing is not wired up.
Introduce qemu_ram_mmap_resizable() and qemu_ram_mmap_resize(). Make
qemu_ram_mmap() a wrapper of qemu_ram_mmap_resizable().

Cc: "Michael S. Tsirkin" 
Cc: Greg Kurz 
Cc: Murilo Opsfelder Araujo 
Cc: Eduardo Habkost 
Cc: "Dr. David Alan Gilbert" 
Signed-off-by: David Hildenbrand 
---
 include/qemu/mmap-alloc.h | 21 ---
 util/mmap-alloc.c | 44 ---
 2 files changed, 45 insertions(+), 20 deletions(-)

diff --git a/include/qemu/mmap-alloc.h b/include/qemu/mmap-alloc.h
index e786266b92..70bc8e9637 100644
--- a/include/qemu/mmap-alloc.h
+++ b/include/qemu/mmap-alloc.h
@@ -7,11 +7,13 @@ size_t qemu_fd_getpagesize(int fd);
 size_t qemu_mempath_getpagesize(const char *mem_path);
 
 /**
- * qemu_ram_mmap: mmap the specified file or device.
+ * qemu_ram_mmap_resizable: reserve a memory region of @max_size to mmap the
+ *  specified file or device and mmap @size of it.
  *
  * Parameters:
  *  @fd: the file or the device to mmap
  *  @size: the number of bytes to be mmaped
+ *  @max_size: the number of bytes to be reserved
  *  @align: if not zero, specify the alignment of the starting mapping address;
  *  otherwise, the alignment in use will be determined by QEMU.
  *  @shared: map has RAM_SHARED flag.
@@ -21,12 +23,15 @@ size_t qemu_mempath_getpagesize(const char *mem_path);
  *  On success, return a pointer to the mapped area.
  *  On failure, return MAP_FAILED.
  */
-void *qemu_ram_mmap(int fd,
-size_t size,
-size_t align,
-bool shared,
-bool is_pmem);
-
-void qemu_ram_munmap(int fd, void *ptr, size_t size);
+void *qemu_ram_mmap_resizable(int fd, size_t size, size_t max_size,
+  size_t align, bool shared, bool is_pmem);
+void *qemu_ram_mmap_resize(void *ptr, int fd, size_t old_size, size_t new_size,
+   bool shared, bool is_pmem);
+static inline void *qemu_ram_mmap(int fd, size_t size, size_t align,
+  bool shared, bool is_pmem)
+{
+return qemu_ram_mmap_resizable(fd, size, size, align, shared, is_pmem);
+}
+void qemu_ram_munmap(int fd, void *ptr, size_t max_size);
 
 #endif
diff --git a/util/mmap-alloc.c b/util/mmap-alloc.c
index 63ad6893b7..2d562145e9 100644
--- a/util/mmap-alloc.c
+++ b/util/mmap-alloc.c
@@ -172,11 +172,8 @@ static inline size_t mmap_pagesize(int fd)
 #endif
 }
 
-void *qemu_ram_mmap(int fd,
-size_t size,
-size_t align,
-bool shared,
-bool is_pmem)
+void *qemu_ram_mmap_resizable(int fd, size_t size, size_t max_size,
+  size_t align, bool shared, bool is_pmem)
 {
 const size_t pagesize = mmap_pagesize(fd);
 size_t offset, total;
@@ -184,12 +181,14 @@ void *qemu_ram_mmap(int fd,
 
 /* we can only map whole pages */
 size = QEMU_ALIGN_UP(size, pagesize);
+max_size = QEMU_ALIGN_UP(max_size, pagesize);
 
 /*
  * Note: this always allocates at least one extra page of virtual address
- * space, even if size is already aligned.
+ * space, even if the size is already aligned. We will reserve an area of
+ * at least max_size, but only populate the requested part of it.
  */
-total = size + align;
+total = max_size + align;
 
 guardptr = mmap_reserve(0, total, fd);
 if (guardptr == MAP_FAILED) {
@@ -217,22 +216,43 @@ void *qemu_ram_mmap(int fd,
  * a guard page guarding against potential buffer overflows.
  */
 total -= offset;
-if (total > size + pagesize) {
-munmap(ptr + size + pagesize, total - size - pagesize);
+if (total > max_size + pagesize) {
+munmap(ptr + max_size + pagesize, total - max_size - pagesize);
 }
 
 return ptr;
 }
 
-void qemu_ram_munmap(int fd, void *ptr, size_t size)
+void *qemu_ram_mmap_resize(void *ptr, int fd, size_t old_size, size_t new_size,
+   bool shared, bool is_pmem)
 {
 const size_t pagesize = mmap_pagesize(fd);
 
 /* we can only map whole pages */
-size = QEMU_ALIGN_UP(size, pagesize);
+old_size = QEMU_ALIGN_UP(old_size, pagesize);
+new_size = QEMU_ALIGN_UP(new_size, pagesize);
+
+/* we support actually resizable memory regions only on Linux */
+if (old_size < new_size) {
+/* populate the missing piece into the reserved area */
+ptr = mmap_populate(ptr + old_size, new_size - old_size, fd, old_size,
+shared, is_pmem);
+} else if (old_size > new_size) {
+/* discard this piece, keeping the area reserved (should never fail) */
+ptr = mmap_reserve(ptr + new_size, old_size - new_size, fd);
+}
+return ptr;
+}
+
+void qemu_ram_munmap(int fd, void *ptr, size_t max_size)
+{
+const size_t pagesize = mmap_pagesize(fd);
+
+

[PATCH v1 08/13] util/mmap-alloc: Prepare for resizable mmaps

2020-02-03 Thread David Hildenbrand

When shrinking a mmap we want to re-reserve the already populated area.
When growing a memory region, we want to populate starting with a given
fd_offset. Prepare by allowing these parameters to be passed.

Also, let's make sure we always process full pages, to avoid
unmapping/remapping pages that are already in use when
growing/shrinking. (existing callers seem to guarantee this, but that's
not obvious)
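
A rough sketch of the shrink case for anonymous memory only, assuming a
region that was mapped R/W beforehand. Sizes are rounded up to whole
pages first, then the tail is re-reserved in place with PROT_NONE and
MAP_FIXED. sketch_align_up() and sketch_shrink() are hypothetical names,
and the fd-backed ppc64 handling of mmap_reserve() is deliberately left
out.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS with strict -std= modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Round n up to a multiple of align; align must be a power of two. */
static size_t sketch_align_up(size_t n, size_t align)
{
    return (n + align - 1) & ~(align - 1);
}

/*
 * Shrink the accessible part of an anonymous mapping from old_size to
 * new_size. The discarded tail stays reserved but becomes inaccessible.
 * Returns 0 on success, -1 on failure.
 */
static int sketch_shrink(void *base, size_t old_size, size_t new_size,
                         size_t pagesize)
{
    void *p;

    /* Only whole pages can be remapped. */
    old_size = sketch_align_up(old_size, pagesize);
    new_size = sketch_align_up(new_size, pagesize);
    if (new_size >= old_size) {
        return 0;
    }
    p = mmap((char *)base + new_size, old_size - new_size, PROT_NONE,
             MAP_FIXED | MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? -1 : 0;
}
```

Because of the rounding, a request to shrink to "one page plus one byte"
keeps two pages accessible; pages that are still in use are never
unmapped or remapped.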

Cc: "Michael S. Tsirkin" 
Cc: Greg Kurz 
Cc: Murilo Opsfelder Araujo 
Cc: Eduardo Habkost 
Cc: "Dr. David Alan Gilbert" 
Signed-off-by: David Hildenbrand 
---
 util/mmap-alloc.c | 32 +---
 1 file changed, 21 insertions(+), 11 deletions(-)

diff --git a/util/mmap-alloc.c b/util/mmap-alloc.c
index f043ccb0ab..63ad6893b7 100644
--- a/util/mmap-alloc.c
+++ b/util/mmap-alloc.c
@@ -83,12 +83,12 @@ size_t qemu_mempath_getpagesize(const char *mem_path)
 }
 
 /*
- * Reserve a new memory region of the requested size to be used for mapping
- * from the given fd (if any).
+ * Reserve a new memory region of the requested size or re-reserve parts
+ * of an existing region to be used for mapping from the given fd (if any).
  */
-static void *mmap_reserve(size_t size, int fd)
+static void *mmap_reserve(void *ptr, size_t size, int fd)
 {
-int flags = MAP_PRIVATE;
+int flags = MAP_PRIVATE | (ptr ? MAP_FIXED : 0);
 
 #if defined(__powerpc64__) && defined(__linux__)
 /*
@@ -111,19 +111,23 @@ static void *mmap_reserve(size_t size, int fd)
 flags |= MAP_ANONYMOUS;
 #endif
 
-return mmap(0, size, PROT_NONE, flags, fd, 0);
+return mmap(ptr, size, PROT_NONE, flags, fd, 0);
 }
 
 /*
  * Populate memory in a reserved region from the given fd (if any).
  */
-static void *mmap_populate(void *ptr, size_t size, int fd, bool shared,
-   bool is_pmem)
+static void *mmap_populate(void *ptr, size_t size, int fd, size_t fd_offset,
+   bool shared, bool is_pmem)
 {
 int map_sync_flags = 0;
 int flags = MAP_FIXED;
 void *new_ptr;
 
+if (fd == -1) {
+fd_offset = 0;
+}
+
 flags |= fd == -1 ? MAP_ANONYMOUS : 0;
 flags |= shared ? MAP_SHARED : MAP_PRIVATE;
 if (shared && is_pmem) {
@@ -131,7 +135,7 @@ static void *mmap_populate(void *ptr, size_t size, int fd, 
bool shared,
 }
 
 new_ptr = mmap(ptr, size, PROT_READ | PROT_WRITE, flags | map_sync_flags,
-   fd, 0);
+   fd, fd_offset);
 if (new_ptr == MAP_FAILED && map_sync_flags) {
 if (errno == ENOTSUP) {
 char *proc_link = g_strdup_printf("/proc/self/fd/%d", fd);
@@ -153,7 +157,7 @@ static void *mmap_populate(void *ptr, size_t size, int fd, 
bool shared,
  * If mmap failed with MAP_SHARED_VALIDATE | MAP_SYNC, we will try
  * again without these flags to handle backwards compatibility.
  */
-new_ptr = mmap(ptr, size, PROT_READ | PROT_WRITE, flags, fd, 0);
+new_ptr = mmap(ptr, size, PROT_READ | PROT_WRITE, flags, fd, 
fd_offset);
 }
 return new_ptr;
 }
@@ -178,13 +182,16 @@ void *qemu_ram_mmap(int fd,
 size_t offset, total;
 void *ptr, *guardptr;
 
+/* we can only map whole pages */
+size = QEMU_ALIGN_UP(size, pagesize);
+
 /*
  * Note: this always allocates at least one extra page of virtual address
  * space, even if size is already aligned.
  */
 total = size + align;
 
-guardptr = mmap_reserve(total, fd);
+guardptr = mmap_reserve(0, total, fd);
 if (guardptr == MAP_FAILED) {
 return MAP_FAILED;
 }
@@ -195,7 +202,7 @@ void *qemu_ram_mmap(int fd,
 
 offset = QEMU_ALIGN_UP((uintptr_t)guardptr, align) - (uintptr_t)guardptr;
 
-ptr = mmap_populate(guardptr + offset, size, fd, shared, is_pmem);
+ptr = mmap_populate(guardptr + offset, size, fd, 0, shared, is_pmem);
 if (ptr == MAP_FAILED) {
 munmap(guardptr, total);
 return MAP_FAILED;
@@ -221,6 +228,9 @@ void qemu_ram_munmap(int fd, void *ptr, size_t size)
 {
 const size_t pagesize = mmap_pagesize(fd);
 
+/* we can only map whole pages */
+size = QEMU_ALIGN_UP(size, pagesize);
+
 if (ptr) {
 /* Unmap both the RAM block and the guard page */
 munmap(ptr, size + pagesize);
-- 
2.24.1

[PATCH v1 07/13] util/mmap-alloc: Factor out populating of memory to mmap_populate()

2020-02-03 Thread David Hildenbrand

We want to populate memory within a reserved memory region. Let's factor
that out.

Cc: "Michael S. Tsirkin" 
Cc: Greg Kurz 
Cc: Murilo Opsfelder Araujo 
Cc: Eduardo Habkost 
Cc: "Dr. David Alan Gilbert" 
Signed-off-by: David Hildenbrand 
---
 util/mmap-alloc.c | 89 +--
 1 file changed, 47 insertions(+), 42 deletions(-)

diff --git a/util/mmap-alloc.c b/util/mmap-alloc.c
index 43a26f38a8..f043ccb0ab 100644
--- a/util/mmap-alloc.c
+++ b/util/mmap-alloc.c
@@ -114,6 +114,50 @@ static void *mmap_reserve(size_t size, int fd)
 return mmap(0, size, PROT_NONE, flags, fd, 0);
 }
 
+/*
+ * Populate memory in a reserved region from the given fd (if any).
+ */
+static void *mmap_populate(void *ptr, size_t size, int fd, bool shared,
+   bool is_pmem)
+{
+int map_sync_flags = 0;
+int flags = MAP_FIXED;
+void *new_ptr;
+
+flags |= fd == -1 ? MAP_ANONYMOUS : 0;
+flags |= shared ? MAP_SHARED : MAP_PRIVATE;
+if (shared && is_pmem) {
+map_sync_flags = MAP_SYNC | MAP_SHARED_VALIDATE;
+}
+
+new_ptr = mmap(ptr, size, PROT_READ | PROT_WRITE, flags | map_sync_flags,
+   fd, 0);
+if (new_ptr == MAP_FAILED && map_sync_flags) {
+if (errno == ENOTSUP) {
+char *proc_link = g_strdup_printf("/proc/self/fd/%d", fd);
+char *file_name = g_malloc0(PATH_MAX);
+int len = readlink(proc_link, file_name, PATH_MAX - 1);
+
+if (len < 0) {
+len = 0;
+}
+file_name[len] = '\0';
+fprintf(stderr, "Warning: requesting persistence across crashes "
+"for backend file %s failed. Proceeding without "
+"persistence, data might become corrupted in case of host "
+"crash.\n", file_name);
+g_free(proc_link);
+g_free(file_name);
+}
+/*
+ * If mmap failed with MAP_SHARED_VALIDATE | MAP_SYNC, we will try
+ * again without these flags to handle backwards compatibility.
+ */
+new_ptr = mmap(ptr, size, PROT_READ | PROT_WRITE, flags, fd, 0);
+}
+return new_ptr;
+}
+
 static inline size_t mmap_pagesize(int fd)
 {
 #if defined(__powerpc64__) && defined(__linux__)
@@ -131,12 +175,8 @@ void *qemu_ram_mmap(int fd,
 bool is_pmem)
 {
 const size_t pagesize = mmap_pagesize(fd);
-int flags;
-int map_sync_flags = 0;
-size_t offset;
-size_t total;
-void *guardptr;
-void *ptr;
+size_t offset, total;
+void *ptr, *guardptr;
 
 /*
  * Note: this always allocates at least one extra page of virtual address
@@ -153,44 +193,9 @@ void *qemu_ram_mmap(int fd,
 /* Always align to host page size */
 assert(align >= pagesize);
 
-flags = MAP_FIXED;
-flags |= fd == -1 ? MAP_ANONYMOUS : 0;
-flags |= shared ? MAP_SHARED : MAP_PRIVATE;
-if (shared && is_pmem) {
-map_sync_flags = MAP_SYNC | MAP_SHARED_VALIDATE;
-}
-
 offset = QEMU_ALIGN_UP((uintptr_t)guardptr, align) - (uintptr_t)guardptr;
 
-ptr = mmap(guardptr + offset, size, PROT_READ | PROT_WRITE,
-   flags | map_sync_flags, fd, 0);
-
-if (ptr == MAP_FAILED && map_sync_flags) {
-if (errno == ENOTSUP) {
-char *proc_link, *file_name;
-int len;
-proc_link = g_strdup_printf("/proc/self/fd/%d", fd);
-file_name = g_malloc0(PATH_MAX);
-len = readlink(proc_link, file_name, PATH_MAX - 1);
-if (len < 0) {
-len = 0;
-}
-file_name[len] = '\0';
-fprintf(stderr, "Warning: requesting persistence across crashes "
-"for backend file %s failed. Proceeding without "
-"persistence, data might become corrupted in case of host "
-"crash.\n", file_name);
-g_free(proc_link);
-g_free(file_name);
-}
-/*
- * if map failed with MAP_SHARED_VALIDATE | MAP_SYNC,
- * we will remove these flags to handle compatibility.
- */
-ptr = mmap(guardptr + offset, size, PROT_READ | PROT_WRITE,
-   flags, fd, 0);
-}
-
+ptr = mmap_populate(guardptr + offset, size, fd, shared, is_pmem);
 if (ptr == MAP_FAILED) {
 munmap(guardptr, total);
 return MAP_FAILED;
-- 
2.24.1

[PATCH v1 01/13] util: vfio-helpers: Factor out and fix processing of existing ram blocks

2020-02-03 Thread David Hildenbrand

Factor it out into common code when a new notifier is registered, just
as done with the memory region notifier. This allows us to have the
logic about how to process existing ram blocks at a central place (which
will be extended soon).

Just like when adding a new ram block, we have to register the max_length.
We don't have a way to get notified about resizes yet, and some memory
would not be mapped when growing the ram block.

Note: Currently, ram blocks are only "fake resized". All memory
(max_length) is accessible.

We can get rid of a bunch of functions in stubs/ram-block.c. Print the
warning from inside qemu_vfio_ram_block_added().
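
To illustrate the idea (not the actual QEMU code): registering a
notifier replays all existing blocks through its "added" callback, using
max_length rather than used_length, and skips blocks without a host
address. SketchBlock, SketchNotifier and the sketch_* functions below
are hypothetical, and a plain array stands in for the ram block list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified ram block. */
typedef struct {
    void *host;         /* NULL if the block has no host mapping */
    size_t used_length;
    size_t max_length;
} SketchBlock;

typedef struct SketchNotifier SketchNotifier;
struct SketchNotifier {
    void (*added)(SketchNotifier *n, void *host, size_t size);
    size_t total_mapped;
};

/*
 * Register a notifier: tell it about every existing block. As all memory
 * up to max_length is accessible, max_length is what gets reported.
 */
static void sketch_notifier_register(SketchNotifier *n,
                                     const SketchBlock *blocks,
                                     size_t nblocks)
{
    size_t i;

    assert(n && n->added);
    for (i = 0; i < nblocks; i++) {
        if (blocks[i].host) {
            n->added(n, blocks[i].host, blocks[i].max_length);
        }
    }
}

/* Example callback that sums up the sizes it was told about. */
static void sketch_count_added(SketchNotifier *n, void *host, size_t size)
{
    (void)host;
    n->total_mapped += size;
}
```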

Cc: Richard Henderson 
Cc: Paolo Bonzini 
Cc: Eduardo Habkost 
Cc: Marcel Apfelbaum 
Cc: Alex Williamson 
Signed-off-by: David Hildenbrand 
---
 exec.c|  5 +
 hw/core/numa.c| 14 ++
 include/exec/cpu-common.h |  1 +
 stubs/ram-block.c | 20 
 util/vfio-helpers.c   | 28 +++-
 5 files changed, 27 insertions(+), 41 deletions(-)

diff --git a/exec.c b/exec.c
index 67e520d18e..05cfe868ab 100644
--- a/exec.c
+++ b/exec.c
@@ -2017,6 +2017,11 @@ ram_addr_t qemu_ram_get_used_length(RAMBlock *rb)
 return rb->used_length;
 }
 
+ram_addr_t qemu_ram_get_max_length(RAMBlock *rb)
+{
+return rb->max_length;
+}
+
 bool qemu_ram_is_shared(RAMBlock *rb)
 {
 return rb->flags & RAM_SHARED;
diff --git a/hw/core/numa.c b/hw/core/numa.c
index 0d1b4be76a..fed4046680 100644
--- a/hw/core/numa.c
+++ b/hw/core/numa.c
@@ -899,9 +899,23 @@ void query_numa_node_mem(NumaNodeMem node_mem[], 
MachineState *ms)
 }
 }
 
+static int ram_block_notify_add_single(RAMBlock *rb, void *opaque)
+{
+ram_addr_t size = qemu_ram_get_max_length(rb);
+void *host = qemu_ram_get_host_addr(rb);
+RAMBlockNotifier *notifier = opaque;
+
+if (host) {
+notifier->ram_block_added(notifier, host, size);
+}
+return 0;
+}
+
 void ram_block_notifier_add(RAMBlockNotifier *n)
 {
 QLIST_INSERT_HEAD(&ram_list.ramblock_notifiers, n, next);
+/* Notify about all existing ram blocks. */
+qemu_ram_foreach_block(ram_block_notify_add_single, n);
 }
 
 void ram_block_notifier_remove(RAMBlockNotifier *n)
diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
index 81753bbb34..9760ac9068 100644
--- a/include/exec/cpu-common.h
+++ b/include/exec/cpu-common.h
@@ -59,6 +59,7 @@ const char *qemu_ram_get_idstr(RAMBlock *rb);
 void *qemu_ram_get_host_addr(RAMBlock *rb);
 ram_addr_t qemu_ram_get_offset(RAMBlock *rb);
 ram_addr_t qemu_ram_get_used_length(RAMBlock *rb);
+ram_addr_t qemu_ram_get_max_length(RAMBlock *rb);
 bool qemu_ram_is_shared(RAMBlock *rb);
 bool qemu_ram_is_uf_zeroable(RAMBlock *rb);
 void qemu_ram_set_uf_zeroable(RAMBlock *rb);
diff --git a/stubs/ram-block.c b/stubs/ram-block.c
index 73c0a3ee08..10855b52dd 100644
--- a/stubs/ram-block.c
+++ b/stubs/ram-block.c
@@ -2,21 +2,6 @@
 #include "exec/ramlist.h"
 #include "exec/cpu-common.h"
 
-void *qemu_ram_get_host_addr(RAMBlock *rb)
-{
-return 0;
-}
-
-ram_addr_t qemu_ram_get_offset(RAMBlock *rb)
-{
-return 0;
-}
-
-ram_addr_t qemu_ram_get_used_length(RAMBlock *rb)
-{
-return 0;
-}
-
 void ram_block_notifier_add(RAMBlockNotifier *n)
 {
 }
@@ -24,8 +9,3 @@ void ram_block_notifier_add(RAMBlockNotifier *n)
 void ram_block_notifier_remove(RAMBlockNotifier *n)
 {
 }
-
-int qemu_ram_foreach_block(RAMBlockIterFunc func, void *opaque)
-{
-return 0;
-}
diff --git a/util/vfio-helpers.c b/util/vfio-helpers.c
index 813f7ec564..71e02e7f35 100644
--- a/util/vfio-helpers.c
+++ b/util/vfio-helpers.c
@@ -376,8 +376,13 @@ static void qemu_vfio_ram_block_added(RAMBlockNotifier *n,
   void *host, size_t size)
 {
 QEMUVFIOState *s = container_of(n, QEMUVFIOState, ram_notifier);
+int ret;
+
 trace_qemu_vfio_ram_block_added(s, host, size);
-qemu_vfio_dma_map(s, host, size, false, NULL);
+ret = qemu_vfio_dma_map(s, host, size, false, NULL);
+if (ret) {
+error_report("qemu_vfio_dma_map(%p, %zu) failed: %d", host, size, ret);
+}
 }
 
 static void qemu_vfio_ram_block_removed(RAMBlockNotifier *n,
@@ -390,33 +395,14 @@ static void qemu_vfio_ram_block_removed(RAMBlockNotifier 
*n,
 }
 }
 
-static int qemu_vfio_init_ramblock(RAMBlock *rb, void *opaque)
-{
-void *host_addr = qemu_ram_get_host_addr(rb);
-ram_addr_t length = qemu_ram_get_used_length(rb);
-int ret;
-QEMUVFIOState *s = opaque;
-
-if (!host_addr) {
-return 0;
-}
-ret = qemu_vfio_dma_map(s, host_addr, length, false, NULL);
-if (ret) {
-fprintf(stderr, "qemu_vfio_init_ramblock: failed %p %" PRId64 "\n",
-host_addr, (uint64_t)length);
-}
-return 0;
-}
-
 static void qemu_vfio_open_common(QEMUVFIOState *s)
 {
 qemu_mutex_init(&s->lock);
 s->ram_notifier.ram_block_added = qemu_vfio_ram_block_added;
 s->ram_notifier.ram_blo

[PATCH v1 00/13] Ram blocks with resizable anonymous allocations under POSIX

2020-02-03 Thread David Hildenbrand

We already allow resizable ram blocks for anonymous memory, however, they
are not actually resized. All memory is mmaped() R/W, including the memory
exceeding the used_length, up to the max_length.

When resizing, effectively only the boundary is moved. Implement actually
resizable anonymous allocations and make use of them in resizable ram
blocks when possible. Memory exceeding the used_length will be
inaccessible. Especially ram block notifiers require care.

Having actually resizable anonymous allocations (via mmap-hackery) allows
reserving a big region in virtual address space and growing the
accessible/usable part on demand. Even if "/proc/sys/vm/overcommit_memory"
is set to "never" under Linux, huge reservations will succeed. If there is
not enough memory when resizing (to populate parts of the reserved region),
trying to resize will fail. Only the actually used size is reserved in the
OS.

E.g., virtio-mem [1] wants to reserve big resizable memory regions and
grow the usable part on demand. I think this change is worth sending out
individually. Accompanied by a bunch of minor fixes and cleanups.

[1] https://lore.kernel.org/kvm/20191212171137.13872-1-da...@redhat.com/
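
For readers unfamiliar with the trick, here is a minimal sketch of
"reserve big, populate on demand" for anonymous memory under POSIX. It
is not the code from this series: sketch_page_size(), sketch_reserve()
and sketch_grow() are hypothetical names, only the anonymous (fd == -1)
case is covered, and all sizes are assumed to be multiples of the host
page size.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS with strict -std= modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Host page size; all sizes passed below must be multiples of it. */
static size_t sketch_page_size(void)
{
    return (size_t)sysconf(_SC_PAGESIZE);
}

/* Reserve max_size bytes of address space; nothing is accessible yet. */
static void *sketch_reserve(size_t max_size)
{
    return mmap(NULL, max_size, PROT_NONE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
}

/*
 * Make [old_size, new_size) inside the reservation readable and writable.
 * Memory below old_size is left untouched. Returns 0 on success, -1 if
 * the new piece could not be populated.
 */
static int sketch_grow(void *base, size_t old_size, size_t new_size)
{
    void *p;

    assert(new_size >= old_size);
    if (new_size == old_size) {
        return 0;
    }
    p = mmap((char *)base + old_size, new_size - old_size,
             PROT_READ | PROT_WRITE,
             MAP_FIXED | MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? -1 : 0;
}
```

A caller would reserve once with the maximum size, grow in steps as the
used length increases, and finally munmap() the whole reservation.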

David Hildenbrand (13):
  util: vfio-helpers: Factor out and fix processing of existing ram
blocks
  exec: Factor out setting ram settings (madvise ...) into
qemu_ram_apply_settings()
  exec: Reuse qemu_ram_apply_settings() in qemu_ram_remap()
  exec: Drop "shared" parameter from ram_block_add()
  util/mmap-alloc: Factor out calculation of pagesize to mmap_pagesize()
  util/mmap-alloc: Factor out reserving of a memory region to
mmap_reserve()
  util/mmap-alloc: Factor out populating of memory to mmap_populate()
  util/mmap-alloc: Prepare for resizable mmaps
  util/mmap-alloc: Implement resizable mmaps
  numa: Introduce ram_block_notify_resized() and
ram_block_notifiers_support_resize()
  util: vfio-helpers: Implement ram_block_resized()
  util: oslib: Resizable anonymous allocations under POSIX
  exec: Ram blocks with resizable anonymous allocations under POSIX

 exec.c|  99 ++
 hw/core/numa.c|  39 +
 include/exec/cpu-common.h |   3 +
 include/exec/memory.h |   8 ++
 include/exec/ramlist.h|   4 +
 include/qemu/mmap-alloc.h |  21 +++--
 include/qemu/osdep.h  |   6 +-
 stubs/ram-block.c |  20 -
 util/mmap-alloc.c | 168 --
 util/oslib-posix.c|  37 -
 util/oslib-win32.c|  14 
 util/trace-events |   5 +-
 util/vfio-helpers.c   |  33 
 13 files changed, 331 insertions(+), 126 deletions(-)

-- 
2.24.1

[PATCH v1 03/13] exec: Reuse qemu_ram_apply_settings() in qemu_ram_remap()

2020-02-03 Thread David Hildenbrand

I don't see why we shouldn't apply all settings to make it look like the
surrounding RAM (and enable proper VMA merging).

Note: memory backend settings might have overridden these settings. We
would need a callback to let the memory backend fix that up.

Cc: Richard Henderson 
Cc: Paolo Bonzini 
Signed-off-by: David Hildenbrand 
---
 exec.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/exec.c b/exec.c
index 31a462a7d3..f7525867ec 100644
--- a/exec.c
+++ b/exec.c
@@ -2552,8 +2552,7 @@ void qemu_ram_remap(ram_addr_t addr, ram_addr_t length)
  length, addr);
 exit(1);
 }
-memory_try_enable_merging(vaddr, length);
-qemu_ram_setup_dump(vaddr, length);
+qemu_ram_apply_settings(vaddr, length);
 }
 }
 }
-- 
2.24.1

Re: [PATCH v3 01/18] hw/i386: Rename X86CPUTopoInfo structure to X86CPUTopoIDs

2020-02-03 Thread Babu Moger




On 2/3/20 9:08 AM, Igor Mammedov wrote:
> On Tue, 03 Dec 2019 18:37:01 -0600
> Babu Moger  wrote:
> 
>> Rename a few data structures related to X86 topology.  X86CPUTopoIDs will
>> have individual arch ids. Next patch introduces X86CPUTopoInfo which will
>> have all topology information (like cores, threads etc.).
> 
> Which commit is this series based on?
> (it doesn't apply to master anymore)

I used git://github.com/ehabkost/qemu.git (x86-next) to generate the
patches. It may be a bit off right now.

> 
> 
>> Signed-off-by: Babu Moger 
>> Reviewed-by: Eduardo Habkost 
>> ---
>>  hw/i386/pc.c   |   60 
>> ++--
>>  include/hw/i386/topology.h |   40 +++--
>>  2 files changed, 50 insertions(+), 50 deletions(-)
>>
>> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
>> index 51b72439b4..5bd2ffccb7 100644
>> --- a/hw/i386/pc.c
>> +++ b/hw/i386/pc.c
>> @@ -2212,7 +2212,7 @@ static void pc_cpu_pre_plug(HotplugHandler 
>> *hotplug_dev,
>>  int idx;
>>  CPUState *cs;
>>  CPUArchId *cpu_slot;
>> -X86CPUTopoInfo topo;
>> +X86CPUTopoIDs topo_ids;
>>  X86CPU *cpu = X86_CPU(dev);
>>  CPUX86State *env = &cpu->env;
>>  MachineState *ms = MACHINE(hotplug_dev);
>> @@ -2277,12 +2277,12 @@ static void pc_cpu_pre_plug(HotplugHandler 
>> *hotplug_dev,
>>  return;
>>  }
>>  
>> -topo.pkg_id = cpu->socket_id;
>> -topo.die_id = cpu->die_id;
>> -topo.core_id = cpu->core_id;
>> -topo.smt_id = cpu->thread_id;
>> +topo_ids.pkg_id = cpu->socket_id;
>> +topo_ids.die_id = cpu->die_id;
>> +topo_ids.core_id = cpu->core_id;
>> +topo_ids.smt_id = cpu->thread_id;
>>  cpu->apic_id = apicid_from_topo_ids(pcms->smp_dies, smp_cores,
>> -smp_threads, &topo);
>> +smp_threads, &topo_ids);
>>  }
>>  
>>  cpu_slot = pc_find_cpu_slot(MACHINE(pcms), cpu->apic_id, &idx);
>> @@ -2290,11 +2290,11 @@ static void pc_cpu_pre_plug(HotplugHandler 
>> *hotplug_dev,
>>  MachineState *ms = MACHINE(pcms);
>>  
>>  x86_topo_ids_from_apicid(cpu->apic_id, pcms->smp_dies,
>> - smp_cores, smp_threads, &topo);
>> + smp_cores, smp_threads, &topo_ids);
>>  error_setg(errp,
>>  "Invalid CPU [socket: %u, die: %u, core: %u, thread: %u] with"
>>  " APIC ID %" PRIu32 ", valid index range 0:%d",
>> -topo.pkg_id, topo.die_id, topo.core_id, topo.smt_id,
>> +topo_ids.pkg_id, topo_ids.die_id, topo_ids.core_id, 
>> topo_ids.smt_id,
>>  cpu->apic_id, ms->possible_cpus->len - 1);
>>  return;
>>  }
>> @@ -2312,34 +2312,34 @@ static void pc_cpu_pre_plug(HotplugHandler 
>> *hotplug_dev,
>>   * once -smp refactoring is complete and there will be CPU private
>>   * CPUState::nr_cores and CPUState::nr_threads fields instead of 
>> globals */
>>  x86_topo_ids_from_apicid(cpu->apic_id, pcms->smp_dies,
>> - smp_cores, smp_threads, &topo);
>> -if (cpu->socket_id != -1 && cpu->socket_id != topo.pkg_id) {
>> + smp_cores, smp_threads, &topo_ids);
>> +if (cpu->socket_id != -1 && cpu->socket_id != topo_ids.pkg_id) {
>>  error_setg(errp, "property socket-id: %u doesn't match set apic-id:"
>> -" 0x%x (socket-id: %u)", cpu->socket_id, cpu->apic_id, 
>> topo.pkg_id);
>> +" 0x%x (socket-id: %u)", cpu->socket_id, cpu->apic_id, 
>> topo_ids.pkg_id);
>>  return;
>>  }
>> -cpu->socket_id = topo.pkg_id;
>> +cpu->socket_id = topo_ids.pkg_id;
>>  
>> -if (cpu->die_id != -1 && cpu->die_id != topo.die_id) {
>> +if (cpu->die_id != -1 && cpu->die_id != topo_ids.die_id) {
>>  error_setg(errp, "property die-id: %u doesn't match set apic-id:"
>> -" 0x%x (die-id: %u)", cpu->die_id, cpu->apic_id, topo.die_id);
>> +" 0x%x (die-id: %u)", cpu->die_id, cpu->apic_id, 
>> topo_ids.die_id);
>>  return;
>>  }
>> -cpu->die_id = topo.die_id;
>> +cpu->die_id = topo_ids.die_id;
>>  
>> -if (cpu->core_id != -1 && cpu->core_id != topo.core_id) {
>> +if (cpu->core_id != -1 && cpu->core_id != topo_ids.core_id) {
>>  error_setg(errp, "property core-id: %u doesn't match set apic-id:"
>> -" 0x%x (core-id: %u)", cpu->core_id, cpu->apic_id, 
>> topo.core_id);
>> +" 0x%x (core-id: %u)", cpu->core_id, cpu->apic_id, 
>> topo_ids.core_id);
>>  return;
>>  }
>> -cpu->core_id = topo.core_id;
>> +cpu->core_id = topo_ids.core_id;
>>  
>> -if (cpu->thread_id != -1 && cpu->thread_id != topo.smt_id) {
>> +if (cpu->thread_id != -1 && cpu->thread_id != topo_ids.smt_id) {
>>  error_setg(errp, "property thread-id: %u doesn't match set apic-id:"
>> -

Re: [PATCH v2] block/backup-top: fix flags handling

2020-02-03 Thread Vladimir Sementsov-Ogievskiy


03.02.2020 18:42, Eric Blake wrote:

On 2/3/20 7:42 AM, Vladimir Sementsov-Ogievskiy wrote:

backup-top "supports" write-unchanged, by skipping CBW operation in
backup_top_co_pwritev. But it forgets to do the same in
backup_top_co_pwrite_zeroes, as well as declare support for
BDRV_REQ_WRITE_UNCHANGED.

Fix this, and, while here, also declare support for the flags
supported by the source child.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---


Reviewed-by: Eric Blake 



Thanks!

--
Best regards,
Vladimir

Re: [PATCH v2 1/2] qemu-img: Add --target-is-zero to convert

2020-02-03 Thread Vladimir Sementsov-Ogievskiy


24.01.2020 13:34, David Edmondson wrote:

In many cases the target of a convert operation is a newly provisioned
target that the user knows is blank (filled with zeroes). In this
situation there is no requirement for qemu-img to wastefully zero out
the entire device.

Add a new option, --target-is-zero, allowing the user to indicate that
an existing target device is already zero filled.


Hi! The qemu-img.c part looks OK to me, but the rest doesn't apply on master now.

I like this thing, and I'd like to make a similar option for the backup job.



Signed-off-by: David Edmondson 
---
  qemu-img-cmds.hx |  4 ++--
  qemu-img.c   | 25 ++---
  qemu-img.texi|  4 
  3 files changed, 28 insertions(+), 5 deletions(-)

diff --git a/qemu-img-cmds.hx b/qemu-img-cmds.hx
index 1c93e6d185..6f958a0a48 100644
--- a/qemu-img-cmds.hx
+++ b/qemu-img-cmds.hx
@@ -44,9 +44,9 @@ STEXI
  ETEXI
  
  DEF("convert", img_convert,

-"convert [--object objectdef] [--image-opts] [--target-image-opts] [-U] [-C] 
[-c] [-p] [-q] [-n] [-f fmt] [-t cache] [-T src_cache] [-O output_fmt] [-B backing_file] 
[-o options] [-l snapshot_param] [-S sparse_size] [-m num_coroutines] [-W] [--salvage] 
filename [filename2 [...]] output_filename")
+"convert [--object objectdef] [--image-opts] [--target-image-opts] 
[--target-is-zero] [-U] [-C] [-c] [-p] [-q] [-n] [-f fmt] [-t cache] [-T src_cache] [-O 
output_fmt] [-B backing_file] [-o options] [-l snapshot_param] [-S sparse_size] [-m 
num_coroutines] [-W] [--salvage] filename [filename2 [...]] output_filename")
  STEXI
-@item convert [--object @var{objectdef}] [--image-opts] [--target-image-opts] 
[-U] [-C] [-c] [-p] [-q] [-n] [-f @var{fmt}] [-t @var{cache}] [-T 
@var{src_cache}] [-O @var{output_fmt}] [-B @var{backing_file}] [-o 
@var{options}] [-l @var{snapshot_param}] [-S @var{sparse_size}] [-m 
@var{num_coroutines}] [-W] [--salvage] @var{filename} [@var{filename2} [...]] 
@var{output_filename}
+@item convert [--object @var{objectdef}] [--image-opts] [--target-image-opts] 
[--target-is-zero] [-U] [-C] [-c] [-p] [-q] [-n] [-f @var{fmt}] [-t 
@var{cache}] [-T @var{src_cache}] [-O @var{output_fmt}] [-B @var{backing_file}] 
[-o @var{options}] [-l @var{snapshot_param}] [-S @var{sparse_size}] [-m 
@var{num_coroutines}] [-W] [--salvage] @var{filename} [@var{filename2} [...]] 
@var{output_filename}
  ETEXI
  
  DEF("create", img_create,

diff --git a/qemu-img.c b/qemu-img.c
index 6233b8ca56..46db72dbe0 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -70,6 +70,7 @@ enum {
  OPTION_PREALLOCATION = 265,
  OPTION_SHRINK = 266,
  OPTION_SALVAGE = 267,
+OPTION_TARGET_IS_ZERO = 268,
  };
  
  typedef enum OutputFormat {

@@ -1984,10 +1985,9 @@ static int convert_do_copy(ImgConvertState *s)
  int64_t sector_num = 0;
  
  /* Check whether we have zero initialisation or can get it efficiently */

-if (s->target_is_new && s->min_sparse && !s->target_has_backing) {
+if (!s->has_zero_init && s->target_is_new && s->min_sparse &&
+!s->target_has_backing) {
  s->has_zero_init = bdrv_has_zero_init(blk_bs(s->target));
-} else {
-s->has_zero_init = false;
  }
  
  if (!s->has_zero_init && !s->target_has_backing &&

@@ -2086,6 +2086,7 @@ static int img_convert(int argc, char **argv)
  {"force-share", no_argument, 0, 'U'},
  {"target-image-opts", no_argument, 0, OPTION_TARGET_IMAGE_OPTS},
  {"salvage", no_argument, 0, OPTION_SALVAGE},
+{"target-is-zero", no_argument, 0, OPTION_TARGET_IS_ZERO},
  {0, 0, 0, 0}
  };
  c = getopt_long(argc, argv, ":hf:O:B:Cco:l:S:pt:T:qnm:WU",
@@ -2209,6 +2210,14 @@ static int img_convert(int argc, char **argv)
  case OPTION_TARGET_IMAGE_OPTS:
  tgt_image_opts = true;
  break;
+case OPTION_TARGET_IS_ZERO:
+/*
+ * The user asserting that the target is blank has the
+ * same effect as the target driver supporting zero
+ * initialisation.
+ */
+s.has_zero_init = true;
+break;
  }
  }
  
@@ -2247,6 +2256,11 @@ static int img_convert(int argc, char **argv)

  warn_report("This will become an error in future QEMU versions.");
  }
  
+if (s.has_zero_init && !skip_create) {

+error_report("--target-is-zero requires use of -n flag");
+goto fail_getopt;
+}
+
  s.src_num = argc - optind - 1;
  out_filename = s.src_num >= 1 ? argv[argc - 1] : NULL;
  
@@ -2380,6 +2394,11 @@ static int img_convert(int argc, char **argv)

  }
  s.target_has_backing = (bool) out_baseimg;
  
+if (s.has_zero_init && s.target_has_backing) {

+error_report("Cannot use --target-is-zero with a backing file");
+goto out;
+}
+
  if (s.src_num > 1 && out_baseimg) {
  error_report("Having a backing file for the target makes no sense when 
"

Re: [PATCH v13 03/10] virtio-iommu: Implement attach/detach command

2020-02-03 Thread Peter Xu

On Mon, Feb 03, 2020 at 06:46:36PM +0100, Auger Eric wrote:
> Hi Peter,
> 
> On 2/3/20 4:19 PM, Peter Xu wrote:
> > On Mon, Feb 03, 2020 at 03:59:00PM +0100, Auger Eric wrote:
> > 
> > [...]
> > 
>  +static void 
>  virtio_iommu_detach_endpoint_from_domain(VirtIOIOMMUEndpoint *ep)
>  +{
>  +QLIST_REMOVE(ep, next);
>  +g_tree_unref(ep->domain->mappings);
> >>>
> >>> Here domain->mapping is unreferenced for each endpoint, while at [1]
> >>> below you only reference the domain->mappings if it's the first
> >>> endpoint.  Is that problematic?
> >> in [1] I take a ref to the domain->mappings if it is *not* the 1st
> >> endpoint. This aims at deleting the mappings gtree when the last EP is
> >> detached from the domain.
> >>
> >> This fixes the issue reported by Jean in:
> >> https://patchwork.kernel.org/patch/11258267/#23046313
> > 
> > Ah OK. :)
> > 
> > However this is tricky.  How about doing an explicit g_tree_destroy() in
> > virtio_iommu_detach() when it's the last endpoint?  I see that you
> > have:
> > 
> > /*
> >  * when the last EP is detached, simply remove the domain for
> >  * the domain list and destroy it. Note its mappings were already
> >  * freed by the ref count mechanism. Next operation involving
> >  * the same domain id will re-create one domain object.
> >  */
> > if (QLIST_EMPTY(&domain->endpoint_list)) {
> > g_tree_remove(s->domains, GUINT_TO_POINTER(domain->id));
> > }
> > 
> > Then it becomes:
> > 
> > if (QLIST_EMPTY(&domain->endpoint_list)) {
> > g_tree_destroy(domain->mappings);
> > g_tree_remove(s->domains, GUINT_TO_POINTER(domain->id));
> > }
> > 
> > And also remove the trick in attach() so you take the domain ref
> > unconditionally.  Would that work?
> Yes I think so. On the other hand this ref counting mechanism is also
> made for that purpose of destroying objects without being forced to
> explicitly call the destroy function.

IMHO that's two different things.  g_tree_destroy() should be the same
as g_tree_unref() here when the tree is empty.  It's really a matter
of easy reading of code:

void
g_tree_destroy (GTree *tree)
{
  g_return_if_fail (tree != NULL);

  g_tree_remove_all (tree);
  g_tree_unref (tree);
}

What we really changed here is to allow the ref/unref to be clearly
paired, i.e., for each EP it'll ref once and unref once.  The previous
solution has the trick that the 1st EP doesn't ref, the later EPs
ref, and when the domain quits it doesn't unref to match the first
ref.  It's error prone to me.  Then, if we can do it in the paired way
easily, I don't see why not...
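
To illustrate the paired scheme, here is a small self-contained sketch. It
does not use GLib; Mappings, Domain and the helper functions are
hypothetical stand-ins for domain->mappings and the attach/detach paths,
and the final unref plays the role that g_tree_destroy() would play for an
already-emptied tree.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the reference-counted mappings tree. */
typedef struct Mappings {
    int refcount;
} Mappings;

/* Hypothetical stand-in for a domain with attached endpoints. */
typedef struct Domain {
    Mappings *mappings;
    int nb_endpoints;
} Domain;

/* Counts how many Mappings objects were freed, for demonstration only. */
int mappings_freed;

Mappings *mappings_new(void)
{
    Mappings *m = calloc(1, sizeof(*m));

    assert(m != NULL);
    m->refcount = 1;            /* creation reference, owned by the domain */
    return m;
}

void mappings_ref(Mappings *m)
{
    m->refcount++;
}

void mappings_unref(Mappings *m)
{
    if (--m->refcount == 0) {
        free(m);
        mappings_freed++;
    }
}

/* Every endpoint takes a reference, including the first one. */
void domain_attach(Domain *d)
{
    mappings_ref(d->mappings);
    d->nb_endpoints++;
}

/* Returns 1 when the last endpoint left and the mappings were released. */
int domain_detach(Domain *d)
{
    mappings_unref(d->mappings);        /* pairs with the ref in attach */
    if (--d->nb_endpoints == 0) {
        mappings_unref(d->mappings);    /* explicit drop of creation ref */
        d->mappings = NULL;
        return 1;
    }
    return 0;
}
```

Each attach has exactly one matching unref in detach, and the creation
reference is dropped in one explicit place, so no endpoint needs special
treatment.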

Thanks,

-- 
Peter Xu

[PATCH 3/3] aspeed/smc: Fix number of dummy cycles for FAST_READ_4 command

2020-02-03 Thread Guenter Roeck

The Linux kernel recently started using FAST_READ_4 commands.
This results in flash read failures. At the same time, the m25p80
emulation is seen to read 8 more bytes than expected. Adjusting the
expected number of dummy cycles to match FAST_READ fixes the problem.

Signed-off-by: Guenter Roeck 
---
 hw/ssi/aspeed_smc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/ssi/aspeed_smc.c b/hw/ssi/aspeed_smc.c
index f0c7bbbad3..61e8fa57d3 100644
--- a/hw/ssi/aspeed_smc.c
+++ b/hw/ssi/aspeed_smc.c
@@ -762,11 +762,11 @@ static int aspeed_smc_num_dummies(uint8_t command)
 case FAST_READ:
 case DOR:
 case QOR:
+case FAST_READ_4:
 case DOR_4:
 case QOR_4:
 return 1;
 case DIOR:
-case FAST_READ_4:
 case DIOR_4:
 return 2;
 case QIOR:
-- 
2.17.1
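
The effect of the hunk above can be seen in a standalone sketch of the
lookup after the patch. The function name num_dummies_sketch() is made up;
the opcode values are the usual SPI NOR ones and are an assumption here,
since the diff only shows the symbolic names. Only the cases visible in
the hunk are modelled; returning 0 for everything else is a placeholder.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed SPI NOR opcode values for the names used in the diff. */
enum {
    FAST_READ   = 0x0b,
    FAST_READ_4 = 0x0c,
    DOR         = 0x3b,
    DOR_4       = 0x3c,
    QOR         = 0x6b,
    QOR_4       = 0x6c,
    DIOR        = 0xbb,
    DIOR_4      = 0xbc,
};

/* Sketch of the post-patch grouping: FAST_READ_4 now matches FAST_READ. */
int num_dummies_sketch(uint8_t command)
{
    switch (command) {
    case FAST_READ:
    case DOR:
    case QOR:
    case FAST_READ_4:
    case DOR_4:
    case QOR_4:
        return 1;
    case DIOR:
    case DIOR_4:
        return 2;
    default:
        /* Placeholder: the remaining cases are not shown in the hunk. */
        return 0;
    }
}
```

With this grouping the 4-byte-address variant of each read command returns
the same value as its 3-byte-address counterpart.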

Re: [PATCH v2 2/2] bcm2835_dma: Re-initialize xlen in TD mode

2020-02-03 Thread Rene Stange

Philippe, of course you are right. I understand what you mean. I'm a non-native
English speaker and I'm still learning. :)

Yes, I agree. Peter, please make the change if you agree with the patch.

Thanks,

Rene


On Monday, 3 February 2020, 17:27:08 CET, Philippe Mathieu-Daudé wrote:
> On 2/3/20 4:40 PM, Rene Stange wrote:
> > TD (two dimensions) DMA mode did not work, because the xlen variable
> > has not been re-initialized before each additional ylen run through
> > in bcm2835_dma_update(), which has been fixed.
> 
> "which has been fixed" confused me, because this current patch is fixing 
> it. Using present tense makes it easier to understand for non-native 
> English speakers IMHO:
> 
>TD (two dimensions) DMA mode does not work, because the xlen
>variable is not re-initialized before each additional ylen
>run through in bcm2835_dma_update(). Fix it.
> 
> If you agree, maybe Peter (the maintainer who will take your patch) can 
> make the change for you.
> 
> Reviewed-by: Philippe Mathieu-Daudé 
> 
> > 
> > Signed-off-by: Rene Stange 
> > ---
> >   hw/dma/bcm2835_dma.c | 4 +++-
> >   1 file changed, 3 insertions(+), 1 deletion(-)
> > 
> > diff --git a/hw/dma/bcm2835_dma.c b/hw/dma/bcm2835_dma.c
> > index 667d951a6f..ccff5ed55b 100644
> > --- a/hw/dma/bcm2835_dma.c
> > +++ b/hw/dma/bcm2835_dma.c
> > @@ -54,7 +54,7 @@
> >   static void bcm2835_dma_update(BCM2835DMAState *s, unsigned c)
> >   {
> >   BCM2835DMAChan *ch = &s->chan[c];
> > -uint32_t data, xlen, ylen;
> > +uint32_t data, xlen, xlen_td, ylen;
> >   int16_t dst_stride, src_stride;
> >   
> >   if (!(s->enable & (1 << c))) {
> > @@ -82,6 +82,7 @@ static void bcm2835_dma_update(BCM2835DMAState *s, 
> > unsigned c)
> >   dst_stride = 0;
> >   src_stride = 0;
> >   }
> > +xlen_td = xlen;
> >   
> >   while (ylen != 0) {
> >   /* Normal transfer mode */
> > @@ -117,6 +118,7 @@ static void bcm2835_dma_update(BCM2835DMAState *s, 
> > unsigned c)
> >   if (--ylen != 0) {
> >   ch->source_ad += src_stride;
> >   ch->dest_ad += dst_stride;
> > +xlen = xlen_td;
> >   }
> >   }
> >   ch->cs |= BCM2708_DMA_END;
> > 
> 
> 
>
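
The bug class fixed here is easy to reproduce outside QEMU. The sketch
below is a simplified model, not the device code: copy_2d() is a
hypothetical helper that copies byte-wise between plain buffers, but it
keeps the same loop shape, with the per-row counter restored before every
additional row.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Copy `rows` rows of `row_len` bytes each. After every row except the
 * last, the source and destination pointers skip their stride.
 */
void copy_2d(uint8_t *dst, const uint8_t *src, uint32_t row_len,
             uint32_t rows, int16_t dst_stride, int16_t src_stride)
{
    uint32_t xlen = row_len;
    uint32_t ylen = rows;

    while (ylen != 0) {
        while (xlen != 0) {
            *dst++ = *src++;
            xlen--;
        }
        if (--ylen != 0) {
            src += src_stride;
            dst += dst_stride;
            /* Without this reset only the first row would be copied. */
            xlen = row_len;
        }
    }
}
```

Dropping the reset leaves xlen at zero after the first row, so the outer
loop still runs for every remaining row but transfers nothing, which is
the behaviour the commit message describes.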

Re: [PATCH 2/2] configure: Check that sphinx-build is using Python 3

2020-02-03 Thread Alex Bennée



Peter Maydell  writes:

> Currently configure's has_sphinx_build() check simply runs a dummy
> sphinx-build and either passes or fails.  This means that "no
> sphinx-build at all" and "sphinx-build exists but is too old" are
> both reported the same way.
>
> Further, we want to assume that all the Python we write is running
> with at least Python 3.5; configure checks that for our scripts, but
> Sphinx extensions run with whatever Python version sphinx-build
> itself is using.
>
> Add a check to our conf.py which makes sphinx-build fail if it would
> be running our extensions with an old Python, and handle this
> in configure so we can report failure helpfully to the user.
> This will mean that configure --enable-docs will fail like this
> if the sphinx-build provided is not suitable:
>
> Warning: sphinx-build exists but it is either too old or uses too old a 
> Python version
>
> ERROR: User requested feature docs
>configure was not able to find it.
>Install texinfo, Perl/perl-podlators and a Python 3 version of 
> python-sphinx
>
> (As usual, the default is to simply not build the docs, as we would
> if sphinx-build wasn't present at all.)
>
> Signed-off-by: Peter Maydell 

Reviewed-by: Alex Bennée 

> ---
> At the moment our Sphinx extensions all work under Python 2;
> but the one for handling parsing QAPI docs out of the JSON is going
> to want to include some of the scripts/qapi Python which is more
> complicated and definitely now 3-only.  In any case it's nicer to
> fail cleanly rather than let users stumble into corner cases we don't
> test and don't want to support even if they happen to work today.
> ---
>  configure| 12 ++--
>  docs/conf.py | 10 ++
>  2 files changed, 20 insertions(+), 2 deletions(-)
>
> diff --git a/configure b/configure
> index 830f325822a..95055f2e9dd 100755
> --- a/configure
> +++ b/configure
> @@ -4808,11 +4808,19 @@ has_sphinx_build() {
>  
>  # Check if tools are available to build documentation.
>  if test "$docs" != "no" ; then
> -  if has makeinfo && has pod2man && has_sphinx_build; then
> +  if has_sphinx_build; then
> +sphinx_ok=yes
> +  else
> +sphinx_ok=no
> +  fi
> +  if has makeinfo && has pod2man && test "$sphinx_ok" = "yes"; then
>  docs=yes
>else
>  if test "$docs" = "yes" ; then
> -  feature_not_found "docs" "Install texinfo, Perl/perl-podlators and 
> python-sphinx"
> +  if has $sphinx_build && test "$sphinx_ok" != "yes"; then
> +echo "Warning: $sphinx_build exists but it is either too old or uses 
> too old a Python version" >&2
> +  fi
> +  feature_not_found "docs" "Install texinfo, Perl/perl-podlators and a 
> Python 3 version of python-sphinx"
>  fi
>  docs=no
>fi
> diff --git a/docs/conf.py b/docs/conf.py
> index ee7faa6b4e7..7588bf192ee 100644
> --- a/docs/conf.py
> +++ b/docs/conf.py
> @@ -28,6 +28,16 @@
>  
>  import os
>  import sys
> +import sphinx
> +from sphinx.errors import VersionRequirementError
> +
> +# Make Sphinx fail cleanly if using an old Python, rather than obscurely
> +# failing because some code in one of our extensions doesn't work there.
> +# Unfortunately this doesn't display very neatly (there's an unavoidable
> +# Python backtrace) but at least the information gets printed...
> +if sys.version_info < (3,5):
> +raise VersionRequirementError(
> +"QEMU requires a Sphinx that uses Python 3.5 or better\n")
>  
>  # The per-manual conf.py will set qemu_docdir for a single-manual build;
>  # otherwise set it here if this is an entire-manual-set build.


-- 
Alex Bennée

[PATCH 1/3] m25p80: Convert to support tracing

2020-02-03 Thread Guenter Roeck

While at it, add some trace messages to help debug problems
seen when running the latest Linux kernel.

Signed-off-by: Guenter Roeck 
---
 hw/block/m25p80.c | 48 ---
 hw/block/trace-events | 16 +++
 2 files changed, 38 insertions(+), 26 deletions(-)

diff --git a/hw/block/m25p80.c b/hw/block/m25p80.c
index 11ff5b9ad7..63e050d7d3 100644
--- a/hw/block/m25p80.c
+++ b/hw/block/m25p80.c
@@ -32,17 +32,7 @@
 #include "qemu/module.h"
 #include "qemu/error-report.h"
 #include "qapi/error.h"
-
-#ifndef M25P80_ERR_DEBUG
-#define M25P80_ERR_DEBUG 0
-#endif
-
-#define DB_PRINT_L(level, ...) do { \
-if (M25P80_ERR_DEBUG > (level)) { \
-fprintf(stderr,  ": %s: ", __func__); \
-fprintf(stderr, ## __VA_ARGS__); \
-} \
-} while (0)
+#include "trace.h"
 
 /* Fields for FlashPartInfo->flags */
 
@@ -574,7 +564,8 @@ static void flash_erase(Flash *s, int offset, FlashCMD cmd)
 abort();
 }
 
-DB_PRINT_L(0, "offset = %#x, len = %d\n", offset, len);
+trace_m25p80_flash_erase(offset, len);
+
 if ((s->pi->flags & capa_to_assert) != capa_to_assert) {
 qemu_log_mask(LOG_GUEST_ERROR, "M25P80: %d erase size not supported by"
   " device\n", len);
@@ -607,8 +598,7 @@ void flash_write8(Flash *s, uint32_t addr, uint8_t data)
 }
 
 if ((prev ^ data) & data) {
-DB_PRINT_L(1, "programming zero to one! addr=%" PRIx32 "  %" PRIx8
-   " -> %" PRIx8 "\n", addr, prev, data);
+trace_m25p80_programming_zero_to_one(addr, prev, data);
 }
 
 if (s->pi->flags & EEPROM) {
@@ -662,6 +652,9 @@ static void complete_collecting_data(Flash *s)
 
 s->state = STATE_IDLE;
 
+trace_m25p80_complete_collecting(s->cmd_in_progress, n, s->ear,
+ s->cur_addr);
+
 switch (s->cmd_in_progress) {
 case DPP:
 case QPP:
@@ -825,7 +818,7 @@ static void reset_memory(Flash *s)
 break;
 }
 
-DB_PRINT_L(0, "Reset done.\n");
+trace_m25p80_reset_done();
 }
 
 static void decode_fast_read_cmd(Flash *s)
@@ -941,9 +934,10 @@ static void decode_qio_read_cmd(Flash *s)
 
 static void decode_new_cmd(Flash *s, uint32_t value)
 {
-s->cmd_in_progress = value;
 int i;
-DB_PRINT_L(0, "decoded new command:%x\n", value);
+
+s->cmd_in_progress = value;
+trace_m25p80_command_decoded(value);
 
 if (value != RESET_MEMORY) {
 s->reset_enable = false;
@@ -1042,7 +1036,7 @@ static void decode_new_cmd(Flash *s, uint32_t value)
 break;
 
 case JEDEC_READ:
-DB_PRINT_L(0, "populated jedec code\n");
+trace_m25p80_populated_jedec();
 for (i = 0; i < s->pi->id_len; i++) {
 s->data[i] = s->pi->id[i];
 }
@@ -1063,7 +1057,7 @@ static void decode_new_cmd(Flash *s, uint32_t value)
 case BULK_ERASE_60:
 case BULK_ERASE:
 if (s->write_enable) {
-DB_PRINT_L(0, "chip erase\n");
+trace_m25p80_chip_erase();
 flash_erase(s, 0, BULK_ERASE);
 } else {
 qemu_log_mask(LOG_GUEST_ERROR, "M25P80: chip erase with write "
@@ -1184,7 +1178,7 @@ static int m25p80_cs(SSISlave *ss, bool select)
 s->data_read_loop = false;
 }
 
-DB_PRINT_L(0, "%sselect\n", select ? "de" : "");
+trace_m25p80_select(select ? "de" : "");
 
 return 0;
 }
@@ -1194,19 +1188,20 @@ static uint32_t m25p80_transfer8(SSISlave *ss, uint32_t 
tx)
 Flash *s = M25P80(ss);
 uint32_t r = 0;
 
+trace_m25p80_transfer(s->state, s->len, s->needed_bytes, s->pos,
+  s->cur_addr, (uint8_t)tx);
+
 switch (s->state) {
 
 case STATE_PAGE_PROGRAM:
-DB_PRINT_L(1, "page program cur_addr=%#" PRIx32 " data=%" PRIx8 "\n",
-   s->cur_addr, (uint8_t)tx);
+trace_m25p80_page_program(s->cur_addr, (uint8_t)tx);
 flash_write8(s, s->cur_addr, (uint8_t)tx);
 s->cur_addr = (s->cur_addr + 1) & (s->size - 1);
 break;
 
 case STATE_READ:
 r = s->storage[s->cur_addr];
-DB_PRINT_L(1, "READ 0x%" PRIx32 "=%" PRIx8 "\n", s->cur_addr,
-   (uint8_t)r);
+trace_m25p80_read_byte(s->cur_addr, (uint8_t)r);
 s->cur_addr = (s->cur_addr + 1) & (s->size - 1);
 break;
 
@@ -1244,6 +1239,7 @@ static uint32_t m25p80_transfer8(SSISlave *ss, uint32_t 
tx)
 }
 
 r = s->data[s->pos];
+trace_m25p80_read_data(s->pos, (uint8_t)r);
 s->pos++;
 if (s->pos == s->len) {
 s->pos = 0;
@@ -1281,7 +1277,7 @@ static void m25p80_realize(SSISlave *ss, Error **errp)
 return;
 }
 
-DB_PRINT_L(0, "Binding to IF_MTD drive\n");
+trace_m25p80_binding();
 s->storage = blk_blockalign(s->blk, s->size);
 
 if (blk_pread(s->blk, 0, s->storage, s->size) != s->size) {
@@ -1289,7 +1285,7 @@ static void m25p80_realize(SSISlave *ss, Error **errp)

[PATCH 2/3] m25p80: Improve command handling for Jedec and unsupported commands

2020-02-03 Thread Guenter Roeck

Always report 6 bytes of JEDEC data. Fill remaining data with 0.

For unsupported commands, keep sending a value of 0 until the chip
is deselected.

Both changes avoid attempts to decode random commands. Up to now this
happened if the reported Jedec data was shorter than 6 bytes but the
host read 6 bytes, and with all unsupported commands.

Signed-off-by: Guenter Roeck 
---
 hw/block/m25p80.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/hw/block/m25p80.c b/hw/block/m25p80.c
index 63e050d7d3..aca75edcc1 100644
--- a/hw/block/m25p80.c
+++ b/hw/block/m25p80.c
@@ -1040,8 +1040,11 @@ static void decode_new_cmd(Flash *s, uint32_t value)
 for (i = 0; i < s->pi->id_len; i++) {
 s->data[i] = s->pi->id[i];
 }
+for (; i < SPI_NOR_MAX_ID_LEN; i++) {
+s->data[i] = 0;
+}
 
-s->len = s->pi->id_len;
+s->len = SPI_NOR_MAX_ID_LEN;
 s->pos = 0;
 s->state = STATE_READING_DATA;
 break;
@@ -1158,6 +1161,11 @@ static void decode_new_cmd(Flash *s, uint32_t value)
 s->quad_enable = false;
 break;
 default:
+s->pos = 0;
+s->len = 1;
+s->state = STATE_READING_DATA;
+s->data_read_loop = true;
+s->data[0] = 0;
 qemu_log_mask(LOG_GUEST_ERROR, "M25P80: Unknown cmd %x\n", value);
 break;
 }
-- 
2.17.1

Re: [PATCH 1/2] configure: Allow user to specify sphinx-build binary

2020-02-03 Thread Alex Bennée



Peter Maydell  writes:

> Currently we insist on using 'sphinx-build' from the $PATH;
> allow the user to specify the binary to use. This will be
> more useful as we become pickier about the capabilities
> we require (eg needing a Python 3 sphinx-build).
>
> Signed-off-by: Peter Maydell 

Reviewed-by: Alex Bennée 

> ---
> I went with the most common convention for specifying "here's
> an executable", like --make=, --install=, --python=
>
> The only odd one out for our current configure options seems to be
> that we want --with-git=GIT, not --git=GIT. You could argue that
> that's a better convention, but it makes more sense to me to
> stick with the convention we currently mostly have. (Perhaps
> we should even change --with-git= to --git= ?)
>
>  configure | 10 +-
>  Makefile  |  2 +-
>  2 files changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/configure b/configure
> index 5095f017283..830f325822a 100755
> --- a/configure
> +++ b/configure
> @@ -584,6 +584,7 @@ query_pkg_config() {
>  }
>  pkg_config=query_pkg_config
>  sdl2_config="${SDL2_CONFIG-${cross_prefix}sdl2-config}"
> +sphinx_build=sphinx-build
>  
>  # If the user hasn't specified ARFLAGS, default to 'rv', just as make does.
>  ARFLAGS="${ARFLAGS-rv}"
> @@ -975,6 +976,8 @@ for opt do
>;;
>--python=*) python="$optarg"
>;;
> +  --sphinx-build=*) sphinx_build="$optarg"
> +  ;;
>--gcov=*) gcov_tool="$optarg"
>;;
>--smbd=*) smbd="$optarg"
> @@ -1677,6 +1680,7 @@ Advanced options (experts only):
>--make=MAKE  use specified make [$make]
>--install=INSTALLuse specified install [$install]
>--python=PYTHON  use specified python [$python]
> +  --sphinx-build=SPHINXuse specified sphinx-build [$sphinx_build]
>--smbd=SMBD  use specified smbd [$smbd]
>--with-git=GIT   use specified git [$git]
>--static enable static build [$static]
> @@ -4799,7 +4803,7 @@ has_sphinx_build() {
>  # sphinx-build doesn't exist at all or if it is too old.
>  mkdir -p "$TMPDIR1/sphinx"
>  touch "$TMPDIR1/sphinx/index.rst"
> -sphinx-build -c "$source_path/docs" -b html "$TMPDIR1/sphinx" 
> "$TMPDIR1/sphinx/out" >/dev/null 2>&1
> +$sphinx_build -c "$source_path/docs" -b html "$TMPDIR1/sphinx" 
> "$TMPDIR1/sphinx/out" >/dev/null 2>&1
>  }
>  
>  # Check if tools are available to build documentation.
> @@ -6474,6 +6478,9 @@ echo "QEMU_LDFLAGS  $QEMU_LDFLAGS"
>  echo "make  $make"
>  echo "install   $install"
>  echo "python$python ($python_version)"
> +if test "$docs" != "no"; then
> +echo "sphinx-build  $sphinx_build"
> +fi
>  echo "slirp support $slirp $(echo_version $slirp $slirp_version)"
>  if test "$slirp" != "no" ; then
>  echo "smbd  $smbd"
> @@ -7503,6 +7510,7 @@ echo "INSTALL_DATA=$install -c -m 0644" >> 
> $config_host_mak
>  echo "INSTALL_PROG=$install -c -m 0755" >> $config_host_mak
>  echo "INSTALL_LIB=$install -c -m 0644" >> $config_host_mak
>  echo "PYTHON=$python" >> $config_host_mak
> +echo "SPHINX_BUILD=$sphinx_build" >> $config_host_mak
>  echo "CC=$cc" >> $config_host_mak
>  if $iasl -h > /dev/null 2>&1; then
>echo "IASL=$iasl" >> $config_host_mak
> diff --git a/Makefile b/Makefile
> index a6f5d440828..1f37523b528 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -1024,7 +1024,7 @@ sphinxdocs: $(MANUAL_BUILDDIR)/devel/index.html \
>  # Note the use of different doctree for each (manual, builder) tuple;
>  # this works around Sphinx not handling parallel invocation on
>  # a single doctree: https://github.com/sphinx-doc/sphinx/issues/2946
> -build-manual = $(call quiet-command,CONFDIR="$(qemu_confdir)" sphinx-build 
> $(if $(V),,-q) -W -b $2 -D version=$(VERSION) -D release="$(FULL_VERSION)" -d 
> .doctrees/$1-$2 $(SRC_PATH)/docs/$1 $(MANUAL_BUILDDIR)/$1 
> ,"SPHINX","$(MANUAL_BUILDDIR)/$1")
> +build-manual = $(call quiet-command,CONFDIR="$(qemu_confdir)" 
> $(SPHINX_BUILD) $(if $(V),,-q) -W -b $2 -D version=$(VERSION) -D 
> release="$(FULL_VERSION)" -d .doctrees/$1-$2 $(SRC_PATH)/docs/$1 
> $(MANUAL_BUILDDIR)/$1 ,"SPHINX","$(MANUAL_BUILDDIR)/$1")
>  # We assume all RST files in the manual's directory are used in it
>  manual-deps = $(wildcard $(SRC_PATH)/docs/$1/*.rst) \
>$(wildcard $(SRC_PATH)/docs/$1/*.rst.inc) \


-- 
Alex Bennée

Re: [PATCH v6 00/41] target/arm: Implement ARMv8.1-VHE

2020-02-03 Thread Alex Bennée



Richard Henderson  writes:

> Version 6 moves vhe_reginfo[] to file scope, and one tweak
> to the vhe register access masking that Peter asked for.
>
> All patches now have reviews.

I was re-testing and I was able to boot my guest Image+buildroot.
However, busybox crashes after login, so I'm unable to do anything in the
guest's userspace. I seem to recall we saw this last time but I can't
remember what the problem was.

>
>
> r~
>
>
> Alex Bennée (1):
>   target/arm: check TGE and E2H flags for EL0 pauth traps
>
> Richard Henderson (40):
>   target/arm: Define isar_feature_aa64_vh
>   target/arm: Enable HCR_E2H for VHE
>   target/arm: Add CONTEXTIDR_EL2
>   target/arm: Add TTBR1_EL2
>   target/arm: Update CNTVCT_EL0 for VHE
>   target/arm: Split out vae1_tlbmask
>   target/arm: Split out alle1_tlbmask
>   target/arm: Simplify tlb_force_broadcast alternatives
>   target/arm: Rename ARMMMUIdx*_S12NSE* to ARMMMUIdx*_E10_*
>   target/arm: Rename ARMMMUIdx_S2NS to ARMMMUIdx_Stage2
>   target/arm: Rename ARMMMUIdx_S1NSE* to ARMMMUIdx_Stage1_E*
>   target/arm: Rename ARMMMUIdx_S1SE[01] to ARMMMUIdx_SE10_[01]
>   target/arm: Rename ARMMMUIdx*_S1E3 to ARMMMUIdx*_SE3
>   target/arm: Rename ARMMMUIdx_S1E2 to ARMMMUIdx_E2
>   target/arm: Recover 4 bits from TBFLAGs
>   target/arm: Expand TBFLAG_ANY.MMUIDX to 4 bits
>   target/arm: Rearrange ARMMMUIdxBit
>   target/arm: Tidy ARMMMUIdx m-profile definitions
>   target/arm: Reorganize ARMMMUIdx
>   target/arm: Add regime_has_2_ranges
>   target/arm: Update arm_mmu_idx for VHE
>   target/arm: Update arm_sctlr for VHE
>   target/arm: Update aa64_zva_access for EL2
>   target/arm: Update ctr_el0_access for EL2
>   target/arm: Add the hypervisor virtual counter
>   target/arm: Update timer access for VHE
>   target/arm: Update define_one_arm_cp_reg_with_opaque for VHE
>   target/arm: Add VHE system register redirection and aliasing
>   target/arm: Add VHE timer register redirection and aliasing
>   target/arm: Flush tlb for ASID changes in EL2&0 translation regime
>   target/arm: Flush tlbs for E2&0 translation regime
>   target/arm: Update arm_phys_excp_target_el for TGE
>   target/arm: Update {fp,sve}_exception_el for VHE
>   target/arm: Update get_a64_user_mem_index for VHE
>   target/arm: Update arm_cpu_do_interrupt_aarch64 for VHE
>   target/arm: Enable ARMv8.1-VHE in -cpu max
>   target/arm: Move arm_excp_unmasked to cpu.c
>   target/arm: Pass more cpu state to arm_excp_unmasked
>   target/arm: Use bool for unmasked in arm_excp_unmasked
>   target/arm: Raise only one interrupt in arm_cpu_exec_interrupt
>
>  target/arm/cpu-param.h |2 +-
>  target/arm/cpu-qom.h   |1 +
>  target/arm/cpu.h   |  423 +
>  target/arm/internals.h |   73 ++-
>  target/arm/translate.h |4 +-
>  target/arm/cpu.c   |  162 -
>  target/arm/cpu64.c |1 +
>  target/arm/debug_helper.c  |   50 +-
>  target/arm/helper-a64.c|2 +-
>  target/arm/helper.c| 1220 +++-
>  target/arm/pauth_helper.c  |   14 +-
>  target/arm/translate-a64.c |   47 +-
>  target/arm/translate.c |   74 ++-
>  13 files changed, 1392 insertions(+), 681 deletions(-)


--
Alex Bennée

Re: [PATCH v13 03/10] virtio-iommu: Implement attach/detach command

2020-02-03 Thread Auger Eric

Hi Peter,

On 2/3/20 4:19 PM, Peter Xu wrote:
> On Mon, Feb 03, 2020 at 03:59:00PM +0100, Auger Eric wrote:
> 
> [...]
> 
>>>> +static void virtio_iommu_detach_endpoint_from_domain(VirtIOIOMMUEndpoint *ep)
>>>> +{
>>>> +    QLIST_REMOVE(ep, next);
>>>> +    g_tree_unref(ep->domain->mappings);
>>>
>>> Here domain->mapping is unreferenced for each endpoint, while at [1]
>>> below you only reference the domain->mappings if it's the first
>>> endpoint.  Is that problematic?
>> in [1] I take a ref to the domain->mappings if it is *not* the 1st
>> endpoint. This aims at deleting the mappings gtree when the last EP is
>> detached from the domain.
>>
>> This fixes the issue reported by Jean in:
>> https://patchwork.kernel.org/patch/11258267/#23046313
> 
> Ah OK. :)
> 
> However this is tricky.  How about do explicit g_tree_destroy() in
> virtio_iommu_detach() when it's the last endpoint?  I see that you
> have:
> 
> /*
>  * when the last EP is detached, simply remove the domain for
>  * the domain list and destroy it. Note its mappings were already
>  * freed by the ref count mechanism. Next operation involving
>  * the same domain id will re-create one domain object.
>  */
> if (QLIST_EMPTY(&domain->endpoint_list)) {
> g_tree_remove(s->domains, GUINT_TO_POINTER(domain->id));
> }
> 
> Then it becomes:
> 
> if (QLIST_EMPTY(&domain->endpoint_list)) {
> g_tree_destroy(domain->mappings);
> g_tree_remove(s->domains, GUINT_TO_POINTER(domain->id));
> }
> 
> And also remove the trick in attach() so you take the domain ref
> unconditionally.  Would that work?
Yes I think so. On the other hand this ref counting mechanism is also
made for that purpose of destroying objects without being forced to
explicitly call the destroy function.
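
A minimal plain-C sketch of the scheme I have in mind, assuming the mappings
object starts with one reference when it is created (as g_tree_new() does).
The Mappings/Domain types and the helper names are made up for illustration
and are not the virtio-iommu code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the mappings GTree. */
typedef struct Mappings {
    unsigned refcount;
    bool *destroyed;            /* set on destruction so callers can observe it */
} Mappings;

static Mappings *mappings_new(bool *destroyed)
{
    Mappings *m = malloc(sizeof(*m));
    assert(m != NULL);
    m->refcount = 1;            /* creation hands out the first reference */
    m->destroyed = destroyed;
    *destroyed = false;
    return m;
}

static void mappings_ref(Mappings *m)
{
    m->refcount++;
}

static void mappings_unref(Mappings *m)
{
    assert(m->refcount > 0);
    if (--m->refcount == 0) {   /* last reference gone: destroy implicitly */
        *m->destroyed = true;
        free(m);
    }
}

/* Hypothetical stand-in for a domain with attached endpoints. */
typedef struct Domain {
    Mappings *mappings;
    unsigned nr_endpoints;
} Domain;

static void domain_attach(Domain *d)
{
    /* The first endpoint reuses the creation reference; later ones add one. */
    if (d->nr_endpoints++ != 0) {
        mappings_ref(d->mappings);
    }
}

static void domain_detach(Domain *d)
{
    d->nr_endpoints--;
    mappings_unref(d->mappings);    /* frees the mappings on the last detach */
}
```

With two endpoints attached, the first detach leaves the mappings alive and
the second one frees them, without any explicit destroy call.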

Thanks

Eric
> 
> Thanks,
>

Re: [PATCH 13/17] qcow2: Add new autoclear feature for all zero image

2020-02-03 Thread Vladimir Sementsov-Ogievskiy


31.01.2020 20:44, Eric Blake wrote:

With the recent introduction of BDRV_ZERO_OPEN, we can optimize
various qemu-img operations if we know the destination starts life
with all zero content.  For an image with no cluster allocations and
no backing file, this was already trivial with BDRV_ZERO_CREATE; but
for a fully preallocated image, it does not scale to crawl through the
entire L1/L2 tree to see if every cluster is currently marked as a
zero cluster.  But it is quite easy to add an autoclear bit to the
qcow2 file itself: the bit will be set after newly creating an image
or after qcow2_make_empty, and cleared on any other modification
(including by an older qemu that doesn't recognize the bit).

This patch documents the new bit, independently of implementing the
places in code that should set it (which means that for bisection
purposes, it is safer to still mask the bit out when opening an image
with the bit set).

A few iotests have updated output due to the larger number of named
header features.

Signed-off-by: Eric Blake 

---
RFC: As defined in this patch, I defined the bit to be clear if any
cluster defers to a backing file. But the block layer would handle
things just fine if we instead allowed the bit to be set if all
clusters allocated in this image are zero, even if there are other
clusters not allocated.  Or maybe we want TWO bits: one if all
clusters allocated here are known zero, and a second if we know that
there are any clusters that defer to a backing image.
---
  block/qcow2.c  |  9 +
  block/qcow2.h  |  3 +++
  docs/interop/qcow2.txt | 12 +++-
  qapi/block-core.json   |  4 
  tests/qemu-iotests/031.out |  8 
  tests/qemu-iotests/036.out |  4 ++--
  tests/qemu-iotests/061.out | 14 +++---
  7 files changed, 40 insertions(+), 14 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 9f2371925737..20cce9410c84 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2859,6 +2859,11 @@ int qcow2_update_header(BlockDriverState *bs)
  .bit  = QCOW2_AUTOCLEAR_DATA_FILE_RAW_BITNR,
  .name = "raw external data",
  },
+{
+.type = QCOW2_FEAT_TYPE_AUTOCLEAR,
+.bit  = QCOW2_AUTOCLEAR_ALL_ZERO_BITNR,
+.name = "all zero",
+},
  };

  ret = header_ext_add(buf, QCOW2_EXT_MAGIC_FEATURE_TABLE,
@@ -4874,6 +4879,10 @@ static ImageInfoSpecific *qcow2_get_specific_info(BlockDriverState *bs,
  .corrupt= s->incompatible_features &
QCOW2_INCOMPAT_CORRUPT,
  .has_corrupt= true,
+.all_zero   = s->autoclear_features &
+  QCOW2_AUTOCLEAR_ALL_ZERO,
+.has_all_zero   = s->autoclear_features &
+  QCOW2_AUTOCLEAR_ALL_ZERO,
  .refcount_bits  = s->refcount_bits,
  .has_bitmaps= !!bitmaps,
  .bitmaps= bitmaps,
diff --git a/block/qcow2.h b/block/qcow2.h
index 094212623257..6fc2d323d753 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -237,11 +237,14 @@ enum {
  enum {
  QCOW2_AUTOCLEAR_BITMAPS_BITNR   = 0,
  QCOW2_AUTOCLEAR_DATA_FILE_RAW_BITNR = 1,
+QCOW2_AUTOCLEAR_ALL_ZERO_BITNR  = 2,
  QCOW2_AUTOCLEAR_BITMAPS = 1 << QCOW2_AUTOCLEAR_BITMAPS_BITNR,
  QCOW2_AUTOCLEAR_DATA_FILE_RAW   = 1 << QCOW2_AUTOCLEAR_DATA_FILE_RAW_BITNR,
+QCOW2_AUTOCLEAR_ALL_ZERO= 1 << QCOW2_AUTOCLEAR_ALL_ZERO_BITNR,

  QCOW2_AUTOCLEAR_MASK= QCOW2_AUTOCLEAR_BITMAPS
  | QCOW2_AUTOCLEAR_DATA_FILE_RAW,
+/* TODO: Add _ALL_ZERO to _MASK once it is handled correctly */
  };

  enum qcow2_discard_type {
diff --git a/docs/interop/qcow2.txt b/docs/interop/qcow2.txt
index 8510d74c8079..d435363a413c 100644
--- a/docs/interop/qcow2.txt
+++ b/docs/interop/qcow2.txt
@@ -153,7 +153,17 @@ in the description of a field.
  File bit (incompatible feature bit 1) is also
  set.

-Bits 2-63:  Reserved (set to 0)
+Bit 2:  All zero image bit
+If this bit is set, the entire image reads
+as all zeroes. This can be useful for
+detecting just-created images even when
+clusters are preallocated, which in turn
+can be used to optimize image copying.
+
+This bit should not be set if any cluster
+in the image defers to a backing file.


Hmm. The term "defers to a backing file" is not defined in the spec. And, as I
understand, can't be defined by design. Backing file may be added/removed/changed
dynamical

[PATCH v2 2/2] bcm2835_dma: Re-initialize xlen in TD mode

2020-02-03 Thread Rene Stange

TD (two-dimensional) DMA mode did not work because the xlen variable
was not re-initialized before each additional ylen pass in
bcm2835_dma_update(). Save the initial xlen and restore it for each row.

Signed-off-by: Rene Stange 
---
 hw/dma/bcm2835_dma.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/hw/dma/bcm2835_dma.c b/hw/dma/bcm2835_dma.c
index 667d951a6f..ccff5ed55b 100644
--- a/hw/dma/bcm2835_dma.c
+++ b/hw/dma/bcm2835_dma.c
@@ -54,7 +54,7 @@
 static void bcm2835_dma_update(BCM2835DMAState *s, unsigned c)
 {
 BCM2835DMAChan *ch = &s->chan[c];
-uint32_t data, xlen, ylen;
+uint32_t data, xlen, xlen_td, ylen;
 int16_t dst_stride, src_stride;
 
 if (!(s->enable & (1 << c))) {
@@ -82,6 +82,7 @@ static void bcm2835_dma_update(BCM2835DMAState *s, unsigned c)
 dst_stride = 0;
 src_stride = 0;
 }
+xlen_td = xlen;
 
 while (ylen != 0) {
 /* Normal transfer mode */
@@ -117,6 +118,7 @@ static void bcm2835_dma_update(BCM2835DMAState *s, unsigned c)
 if (--ylen != 0) {
 ch->source_ad += src_stride;
 ch->dest_ad += dst_stride;
+xlen = xlen_td;
 }
 }
 ch->cs |= BCM2708_DMA_END;
-- 
2.16.4
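
To illustrate why the saved value is needed, here is a simplified byte-wise
model of the 2D loop. It is only a sketch: copy_2d and its parameters are
hypothetical, ylen is taken as a plain row count, and the real
bcm2835_dma_update() moves data through guest memory accesses rather than
host pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative model of a 2D (TD mode) transfer: copy ylen rows of xlen
 * bytes each, adding a stride to both addresses between rows.
 * Returns the number of bytes copied.
 */
static size_t copy_2d(uint8_t *dst, const uint8_t *src,
                      uint32_t xlen, uint32_t ylen,
                      int16_t dst_stride, int16_t src_stride)
{
    const uint32_t xlen_td = xlen;  /* remember the row length */
    size_t copied = 0;

    assert(xlen > 0);
    while (ylen != 0) {
        /* The inner loop counts xlen down to zero. */
        while (xlen != 0) {
            *dst++ = *src++;
            xlen--;
            copied++;
        }
        if (--ylen != 0) {
            src += src_stride;
            dst += dst_stride;
            xlen = xlen_td;         /* without this, later rows copy nothing */
        }
    }
    return copied;
}
```

In this model, dropping the `xlen = xlen_td` line leaves xlen at zero after
the first row, so only one row is ever transferred.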

[PATCH] linux-user: implement TARGET_SO_PEERSEC

2020-02-03 Thread Laurent Vivier

"The purpose of this option is to allow an application to obtain the
security credentials of a Unix stream socket peer.  It is analogous to
SO_PEERCRED (which provides authentication using standard Unix credentials
of pid, uid and gid), and extends this concept to other security
models." -- https://lwn.net/Articles/62370/

Until now it was passed to the kernel with an "int" argument, which fails
when the option is supported by the host, because the value is like a
filename: it is always a \0-terminated string with no embedded \0
characters, but is not guaranteed to be ASCII or UTF-8.

I've tested the option with the following program:

/*
 * cc -o getpeercon getpeercon.c
 */

#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

int main(void)
{
    int fd;
    struct sockaddr_in server, addr;
    int ret;
    socklen_t len;
    char buf[256];

    fd = socket(PF_INET, SOCK_STREAM, 0);
    if (fd == -1) {
        perror("socket");
        return 1;
    }

    server.sin_family = AF_INET;
    inet_aton("127.0.0.1", &server.sin_addr);
    server.sin_port = htons(40390);

    connect(fd, (struct sockaddr*)&server, sizeof(server));

    len = sizeof(buf);
    ret = getsockopt(fd, SOL_SOCKET, SO_PEERSEC, buf, &len);
    if (ret == -1) {
        perror("getsockopt");
        return 1;
    }
    printf("%d %s\n", len, buf);
    return 0;
}

On host:

  $ ./getpeercon
  33 system_u:object_r:unlabeled_t:s0

With qemu-aarch64/bionic without the patch:

  $ ./getpeercon
  getsockopt: Numerical result out of range

With the patch:

  $ ./getpeercon
  33 system_u:object_r:unlabeled_t:s0

Bug: https://bugs.launchpad.net/qemu/+bug/1823790
Reported-by: Matthias Lüscher 
Tested-by: Matthias Lüscher 
Signed-off-by: Laurent Vivier 
---
 linux-user/syscall.c | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index d60142f0691c..5f37e62772de 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -2344,6 +2344,28 @@ static abi_long do_getsockopt(int sockfd, int level, int optname,
 }
 break;
 }
+case TARGET_SO_PEERSEC: {
+char *name;
+
+if (get_user_u32(len, optlen)) {
+return -TARGET_EFAULT;
+}
+if (len < 0) {
+return -TARGET_EINVAL;
+}
+name = lock_user(VERIFY_WRITE, optval_addr, len, 0);
+if (!name) {
+return -TARGET_EFAULT;
+}
+lv = len;
+ret = get_errno(getsockopt(sockfd, level, SO_PEERSEC,
+   name, &lv));
+if (put_user_u32(lv, optlen)) {
+ret = -TARGET_EFAULT;
+}
+unlock_user(name, optval_addr, 0);
+break;
+}
 case TARGET_SO_LINGER:
 {
 struct linger lg;
-- 
2.24.1

Re: [RFC PATCH 2/2] GitLab CI: crude mapping of PMM's scripts to jobs

2020-02-03 Thread Wainer dos Santos Moschetta


Hi Cleber,

On 2/3/20 1:23 AM, Cleber Rosa wrote:

This is a crude and straightforward mapping of Peter's
"remake-merge-builds" and "pull-buildtest" scripts.

Some characteristics were removed for simplicity's sake (but will eventually
be added back), including:
  * number of simultaneous make jobs
  * make's synchronous output, not needed because of previous point
  * out-of-tree builds

This covers the "x86-64 Linux with a variety of different build
configs"[1].  I've personally tested all of them, and only had
issues with the "notcg" job[2], but it seems to be a test specific
issue with the nested KVM I was using.



Could you put a comment in the commit text or in-code explaining why it
builds QEMU with --disable-libssh on all the jobs?




[1] - https://wiki.qemu.org/Requirements/GatingCI#Current_Tests
[2] - https://paste.centos.org/view/1dd43a1c

Signed-off-by: Cleber Rosa 
---
  .gitlab-ci.yml | 116 +
  1 file changed, 116 insertions(+)

diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index d2c7d2198e..eb4077e2ab 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -2,6 +2,8 @@ include:
- local: '/.gitlab-ci-edk2.yml'
  
  build-system1:

+ rules:
+ - if: '$CI_COMMIT_REF_NAME != "staging"'
   before_script: &before_scr_apt
   - apt-get update -qq
   - apt-get install -y -qq flex bison libglib2.0-dev libpixman-1-dev 
genisoimage
@@ -17,6 +19,8 @@ build-system1:
   - make -j2 check
  
  build-system2:

+ rules:
+ - if: '$CI_COMMIT_REF_NAME != "staging"'
   before_script:
*before_scr_apt
   script:
@@ -31,6 +35,8 @@ build-system2:
   - make -j2 check
  
  build-disabled:

+ rules:
+ - if: '$CI_COMMIT_REF_NAME != "staging"'
   before_script:
*before_scr_apt
   script:
@@ -47,6 +53,8 @@ build-disabled:
   - make -j2 check-qtest SPEED=slow
  
  build-tcg-disabled:

+ rules:
+ - if: '$CI_COMMIT_REF_NAME != "staging"'
   before_script:
*before_scr_apt
   script:
@@ -67,6 +75,8 @@ build-tcg-disabled:
  248 250 254 255 256
  
  build-user:

+ rules:
+ - if: '$CI_COMMIT_REF_NAME != "staging"'
   before_script:
*before_scr_apt
   script:
@@ -78,6 +88,8 @@ build-user:
   - make run-tcg-tests-i386-linux-user run-tcg-tests-x86_64-linux-user
  
  build-clang:

+ rules:
+ - if: '$CI_COMMIT_REF_NAME != "staging"'
   before_script:
*before_scr_apt
   script:
@@ -92,6 +104,8 @@ build-clang:
   - make -j2 check
  
  build-tci:

+ rules:
+ - if: '$CI_COMMIT_REF_NAME != "staging"'
   before_script:
*before_scr_apt
   script:
@@ -111,3 +125,105 @@ build-tci:
   - QTEST_QEMU_BINARY="x86_64-softmmu/qemu-system-x86_64" 
./tests/qtest/pxe-test
   - QTEST_QEMU_BINARY="s390x-softmmu/qemu-system-s390x"
 ./tests/qtest/pxe-test -m slow
+
+ubuntu-18.04.3-x86_64-notools:
+ tags:
+ - ubuntu_18.04.3
+ - x86_64
+ rules:
+ - if: '$CI_COMMIT_REF_NAME == "staging"'
+ script:
+ # 
https://git.linaro.org/people/peter.maydell/misc-scripts.git/tree/remake-merge-builds#n22
+ - ./configure --target-list=arm-softmmu --disable-tools --disable-libssh
+ # There is no make / make check in the "pull-buildtest" script for this.
+ # Question: should it at least be built? Or dropped?
+ - make
+
+ubuntu-18.04.3-x86_64-all-linux-static:


Doesn't it need LD_PRELOAD on this runner too? ->

https://git.linaro.org/people/peter.maydell/misc-scripts.git/tree/pull-buildtest#n24



+ tags:
+ - ubuntu_18.04.3
+ - x86_64
+ rules:
+ - if: '$CI_COMMIT_REF_NAME == "staging"'
+ script:
+ # 
https://git.linaro.org/people/peter.maydell/misc-scripts.git/tree/remake-merge-builds#n25
+ - ./configure --enable-debug --static --disable-system --disable-glusterfs 
--disable-libssh


Shouldn't it be --disable-gnutls instead of --disable-glusterfs?


+ # 
https://git.linaro.org/people/peter.maydell/misc-scripts.git/tree/pull-buildtest#n36
+ - make
+ # 
https://git.linaro.org/people/peter.maydell/misc-scripts.git/tree/pull-buildtest#n45
+ - make check V=1
+ # 
https://git.linaro.org/people/peter.maydell/misc-scripts.git/tree/pull-buildtest#n48
+ - make check-tcg V=1



Any special reason to split it into 3 steps instead of a single `make
check check-tcg`?


That pattern continues on next jobs...



+
+ubuntu-18.04.3-x86_64-all:
+ tags:
+ - ubuntu_18.04.3
+ - x86_64
+ rules:
+ - if: '$CI_COMMIT_REF_NAME == "staging"'
+ script:
+ # 
https://git.linaro.org/people/peter.maydell/misc-scripts.git/tree/remake-merge-builds#n26
+ - ./configure --disable-libssh
+ # 
https://git.linaro.org/people/peter.maydell/misc-scripts.git/tree/pull-buildtest#n28
+ - make
+ # 
https://git.linaro.org/people/peter.maydell/misc-scripts.git/tree/pull-buildtest#n37
+ - make check V=1
+
+ubuntu-18.04.3-x86_64-alldbg:
+ tags:
+ - ubuntu_18.04.3
+ - x86_64
+ rules:
+ - if: '$CI_COMMIT_REF_NAME == "staging"'
+ script:
+ # 
https://git.linaro.org/people/peter.maydell/misc-scripts.git/tree/remake-merge-builds#n27
+ - ./configure --disable-libssh

Missing --enable-debug, right?

+ # 
https://git.linaro.org/people/peter.maydell/misc-scripts

Re: [PATCH 0/4] Improve default object property_add uint helpers

2020-02-03 Thread Felipe Franciosi



> On Feb 3, 2020, at 4:10 PM, Marc-André Lureau wrote:
> 
> Hi
> 
> On Mon, Feb 3, 2020 at 5:08 PM Felipe Franciosi  wrote:
>> 
>> Oops, I completely forgot to add a "v5" on the subject line.
>> 
>> (The changelog is there.)
>> 
>> Let me know if I should resend.
>> 
>> F.
>> 
>>> On Feb 3, 2020, at 3:54 PM, Felipe Franciosi  wrote:
>>> 
>>> This improves the family of object_property_add_uintXX_ptr helpers by
>>> enabling a default getter/setter only when desired. To prevent an API
>>> behavioural change (from clients that already used these helpers and did
>>> not want a setter), we add an OBJ_PROP_FLAG_READ flag that allows clients
>>> to only have a getter. Patch 1 enhances the API and modifies current users.
>>> 
>>> While modifying the clients of the API, a couple of improvement
>>> opportunities were observed in ich9. These were added in separate patches
>>> (2 and 3).
>>> 
>>> Patch 4 cleans up a lot of existing code by moving various objects to the
>>> enhanced API. Previously, those objects had their own getters/setters that
>>> only updated the values without further checks. Some of them actually
>>> lacked a check for setting overflows, which could have resulted in
>>> undesired values being set. The new default setters include a check for
>>> that, not updating the values in case of errors (and propagating them). If
>>> they did not provide an error pointer, then that behaviour was maintained.
>>> 
>>> Felipe Franciosi (4):
>>> qom/object: enable setter for uint types
>>> ich9: fix getter type for sci_int property
>>> ich9: Simplify ich9_lpc_initfn
>>> qom/object: Use common get/set uint helpers
>>> 
>>> hw/acpi/ich9.c   |  99 ++--
>>> hw/acpi/pcihp.c  |   7 +-
>>> hw/acpi/piix4.c  |  12 +--
>>> hw/isa/lpc_ich9.c|  27 ++
>>> hw/misc/edu.c|  13 +--
>>> hw/pci-host/q35.c|  14 +--
>>> hw/ppc/spapr.c   |  18 +---
>>> hw/ppc/spapr_drc.c   |   3 +-
>>> include/qom/object.h |  48 --
>>> memory.c |  15 +--
>>> qom/object.c | 214 ++-
>>> target/arm/cpu.c |  22 +
>>> target/i386/sev.c| 106 ++---
>>> ui/console.c |   4 +-
>>> 14 files changed, 283 insertions(+), 319 deletions(-)
>>> 
>>> --
>>> 2.20.1
>>> 
>>> Changelog:
>>> v1->v2:
>>> - Update sci_int directly instead of using stack variable
>>> - Defining an enhanced ObjectPropertyFlags instead of just 'readonly'
>>> - Erroring out directly (instead of using gotos) on default setters
>>> - Retaining lack of errp passing when it wasn't there
>>> v2->v3:
>>> - Rename flags _RD to _READ and _WR to _WRITE
>>> - Add a convenience _READWRITE flag
>>> - Drop the usage of UL in the bit flag definitions
>>> v3->v4:
>>> - Drop changes to hw/vfio/pci-quirks.c
>>> v4->v5:
>>> - Rebase on latest master
>>> - Available here: https://github.com/franciozzy/qemu/tree/autosetters
> 
> Thanks for the rebase, it looks good overall, but:
> 
> qom/object.c:2707:1: error: control reaches end of non-void function
> [-Werror=return-type]

That was an oversight. :)

Let me fix it and send a corrected (v6) subject line.

Felipe

> 
> 
> -- 
> Marc-André Lureau

Re: Need help understanding assertion fail.

2020-02-03 Thread Peter Maydell

On Mon, 3 Feb 2020 at 16:39, Wayne Li  wrote:
> Anyway that's the background.  The specific problem I'm having right now is I 
> get the following assertion error during some of the setup stuff our OS does 
> post boot-up (the OS is also custom-made):
>
> qemu_programs/qemu/tcg/ppc/tcg-target.inc.c:224: reloc_pc14_val: Assertion 
> `disp == (int16_t) disp' failed.
>
> Looking at the QEMU code, "disp" is the difference between two pointers named 
> "target" and "pc".  I'm not sure exactly what either of those names mean.  
> And it looks like since the assertion is checking if casting "disp" as a 
> short changes the value, it's checking if the "disp" value is too big?  I'm 
> just not very sure what this assertion means.

This assertion is checking that we're not trying to fit too
large a value into the host PPC branch instruction we just emitted.
That is, tcg_out_bc() emits a PPC conditional branch instruction,
which has a 14 bit field for the offset (it's a relative branch),
and we know the bottom 2 bits of the target will be 0 (PPC insns
being 4-aligned), so the distance between the current host PC
and the target of the branch must fit in a signed 16-bit field.
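
As a rough, standalone model of that check (not the actual tcg-target.inc.c
code; disp_fits_pc14 and encode_pc14 are names made up for this sketch), the
range test and the field encoding look something like this:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true if a byte displacement can be encoded in the
 * 14-bit offset field of a PPC conditional branch. The low 2 bits are
 * implied zero, so the usable range is that of a signed 16-bit value.
 */
static bool disp_fits_pc14(intptr_t disp)
{
    assert((disp & 3) == 0);        /* PPC insns are 4-byte aligned */
    return disp == (int16_t)disp;   /* same comparison as the failing assert */
}

/*
 * Hypothetical helper: the bits that would be OR'ed into the branch
 * instruction for a displacement that fits.
 */
static uint32_t encode_pc14(intptr_t disp)
{
    assert(disp_fits_pc14(disp));
    return (uint32_t)disp & 0xfffc;
}
```

In this model a displacement of 32764 bytes is accepted and 32768 is
rejected, which is the situation the assertion is reporting.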

"disp" here stands for "displacement".

The PPC TCG backend only uses this for the TCG 'brcond' and
'brcond2' TCG intermediate-representation ops. It seems likely
that the code for your target is generating TCG ops which have
too large a gap between a brcond/brcond2 and the destination label.
You could try using the various QEMU -d options to print out the
guest instructions and the generated TCG ops to pin down what
part of your target is trying to generate branches over too
much code like this.

> Anyway, the thing is this problem has to be somehow related to
> the transfer of the code from a little-endian platform to a
> big-endian platform as our project works without any problem on
> little-endian platforms.

In this case it isn't necessarily directly an endianness issue.
The x86 instruction set provides conditional branch instructions
which allow a 32-bit displacement value, so you're basically never
going to overflow a conditional-branch there. PPC, being RISC,
has more limited branch insns. You might also run into this
if you tried to use aarch64 (64-bit) arm hosts, which are
little-endian but have a 19-bit branch displacement limit,
depending on just how big you've managed to make your jumps.
On the other hand, a 16-bit displacement is a jump over
64K of generated code, which is huge for a single TCG
generated translation block, so it could well be that you
have an endianness bug in your TCG frontend which is causing
you to generate an enormous TB by accident.

thanks
-- PMM
