Re: [Qemu-devel] [RISU PATCH v3 10/18] x86.risu: add MMX instructions

2019-07-19 Thread Richard Henderson
On 7/11/19 3:32 PM, Jan Bobek wrote:
> Add an x86 configuration file with all MMX instructions.
> 
> Signed-off-by: Jan Bobek 
> ---
>  x86.risu | 321 +++
>  1 file changed, 321 insertions(+)
>  create mode 100644 x86.risu

Reviewed-by: Richard Henderson 


r~



Re: [Qemu-devel] [PATCH] migration: equation is more proper than and to check LOADVM_QUIT

2019-07-19 Thread Wei Yang
On Fri, Jul 19, 2019 at 07:41:28PM +0100, Dr. David Alan Gilbert wrote:
>* Wei Yang (richardw.y...@linux.intel.com) wrote:
>> LOADVM_QUIT allows a command to quit all layers of nested loadvm loops,
>> while current return value check is not that proper even it works now.
>> 
>> Current return value check "ret & LOADVM_QUIT" would return true if
>> bit[0] is 1. This would be true when ret is -1 which is used to indicate
>> an error of handling a command.
>> 
>> Since there is only one place return LOADVM_QUIT and no other
>> combination of return value, use "ret == LOADVM_QUIT" would be more
>> proper.
>
>Yes, when I first wrote this it was more complex with a few flags, and
>they all got removed.
>

Ah, life is much easier now :-)

>> Signed-off-by: Wei Yang 
>
>Reviewed-by: Dr. David Alan Gilbert 
>
>> ---
>>  migration/savevm.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/migration/savevm.c b/migration/savevm.c
>> index 1a9b1f411e..25fe7ea05a 100644
>> --- a/migration/savevm.c
>> +++ b/migration/savevm.c
>> @@ -2429,7 +2429,7 @@ retry:
>>  case QEMU_VM_COMMAND:
>>  ret = loadvm_process_command(f);
>>  trace_qemu_loadvm_state_section_command(ret);
>> -if ((ret < 0) || (ret & LOADVM_QUIT)) {
>> +if ((ret < 0) || (ret == LOADVM_QUIT)) {
>>  goto out;
>>  }
>>  break;
>> -- 
>> 2.17.1
>> 
>--
>Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

-- 
Wei Yang
Help you, Help me



Re: [Qemu-devel] [PATCH] migration/postcopy: use static PostcopyDiscardState instead of allocating it for each block

2019-07-19 Thread Wei Yang
On Fri, Jul 19, 2019 at 06:41:28PM +0100, Dr. David Alan Gilbert wrote:
>* Wei Yang (richardw.y...@linux.intel.com) wrote:
>> Even we need to do discard for each RAMBlock, we still can leverage the
>> same memory space to store the information.
>> 
>> By doing so, we avoid memory allocation and deallocation to the system
>> and also avoid potential failure of memory allocation which breaks the
>> migration.
>> 
>> Signed-off-by: Wei Yang 
>> ---
>>  migration/postcopy-ram.c | 16 +++-
>>  1 file changed, 7 insertions(+), 9 deletions(-)
>> 
>> diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
>> index 9faacacc9e..2e6b076bb7 100644
>> --- a/migration/postcopy-ram.c
>> +++ b/migration/postcopy-ram.c
>> @@ -1377,8 +1377,7 @@ void 
>> postcopy_fault_thread_notify(MigrationIncomingState *mis)
>>   *   asking to discard individual ranges.
>>   *
>>   * @ms: The current migration state.
>> - * @offset: the bitmap offset of the named RAMBlock in the migration
>> - *   bitmap.
>> + * @offset: the bitmap offset of the named RAMBlock in the migration bitmap.
>>   * @name: RAMBlock that discards will operate on.
>>   *
>>   * returns: a new PDS.
>> @@ -1386,13 +1385,14 @@ void 
>> postcopy_fault_thread_notify(MigrationIncomingState *mis)
>>  PostcopyDiscardState *postcopy_discard_send_init(MigrationState *ms,
>>   const char *name)
>>  {
>> -PostcopyDiscardState *res = g_malloc0(sizeof(PostcopyDiscardState));
>> +static PostcopyDiscardState res = {0};
>
>Do you think it would be better to make this a static at the top of
>migration/postcopy-ram.c and then we could remove the pds parameters
>from postcopy_discard_send_range and friends?
>If there's only one pds then we don't need to pass the pointer around.
>

It sounds better to me, let me prepare v2.

Thanks

>Dave
>
>> -if (res) {
>> -res->ramblock_name = name;
>> -}
>> +res.ramblock_name = name;
>> +res.cur_entry = 0;
>> +res.nsentwords = 0;
>> +res.nsentcmds = 0;
>>  
>> -return res;
>> +return 
>>  }
>>  
>>  /**
>> @@ -1449,8 +1449,6 @@ void postcopy_discard_send_finish(MigrationState *ms, 
>> PostcopyDiscardState *pds)
>>  
>>  trace_postcopy_discard_send_finish(pds->ramblock_name, pds->nsentwords,
>> pds->nsentcmds);
>> -
>> -g_free(pds);
>>  }
>>  
>>  /*
>> -- 
>> 2.17.1
>> 
>--
>Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

-- 
Wei Yang
Help you, Help me



Re: [Qemu-devel] [PATCH] riscv: sifive_test: Add reset functionality

2019-07-19 Thread Palmer Dabbelt

On Fri, 14 Jun 2019 08:15:51 PDT (-0700), bmeng...@gmail.com wrote:

This adds a reset opcode for sifive_test device to trigger a system
reset for testing purpose.

Signed-off-by: Bin Meng 
---

 hw/riscv/sifive_test.c | 4 
 include/hw/riscv/sifive_test.h | 3 ++-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/hw/riscv/sifive_test.c b/hw/riscv/sifive_test.c
index 24a04d7..cd86831 100644
--- a/hw/riscv/sifive_test.c
+++ b/hw/riscv/sifive_test.c
@@ -21,6 +21,7 @@
 #include "qemu/osdep.h"
 #include "hw/sysbus.h"
 #include "qemu/module.h"
+#include "sysemu/sysemu.h"
 #include "target/riscv/cpu.h"
 #include "hw/riscv/sifive_test.h"

@@ -40,6 +41,9 @@ static void sifive_test_write(void *opaque, hwaddr addr,
 exit(code);
 case FINISHER_PASS:
 exit(0);
+case FINISHER_RESET:
+qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET);
+return;
 default:
 break;
 }
diff --git a/include/hw/riscv/sifive_test.h b/include/hw/riscv/sifive_test.h
index 71d4c9f..c186a31 100644
--- a/include/hw/riscv/sifive_test.h
+++ b/include/hw/riscv/sifive_test.h
@@ -34,7 +34,8 @@ typedef struct SiFiveTestState {

 enum {
 FINISHER_FAIL = 0x,
-FINISHER_PASS = 0x
+FINISHER_PASS = 0x,
+FINISHER_RESET = 0x
 };

 DeviceState *sifive_test_create(hwaddr addr);


Reviewed-by: Palmer Dabbelt 

Sorry this took a while, but it's in the hardware now.  I'll merge this, but
I'm considering it a new feature so it'll be held off a bit.

Thanks!



Re: [Qemu-devel] [PATCH 3/3] migration/savevm: move non SaveStateEntry condition check out of iteration

2019-07-19 Thread Wei Yang
On Fri, Jul 19, 2019 at 05:59:50PM +0100, Dr. David Alan Gilbert wrote:
>* Wei Yang (richardw.y...@linux.intel.com) wrote:
>> in_postcopy and iterable_only are not SaveStateEntry specific, it would
>> be more proper to check them out of iteration.
>> 
>> Signed-off-by: Wei Yang 
>
>Worth it just to make that big if statement simpler!
>

Yep, the original one looks a little hard to understand.

Thanks for your comment.

>
>Reviewed-by: Dr. David Alan Gilbert 
>
>> ---
>>  migration/savevm.c | 13 ++---
>>  1 file changed, 6 insertions(+), 7 deletions(-)
>> 
>> diff --git a/migration/savevm.c b/migration/savevm.c
>> index c41e13e322..8a2ada529e 100644
>> --- a/migration/savevm.c
>> +++ b/migration/savevm.c
>> @@ -1247,8 +1247,7 @@ void qemu_savevm_state_complete_postcopy(QEMUFile *f)
>>  }
>>  
>>  static
>> -int qemu_savevm_state_complete_precopy_iterable(QEMUFile *f, bool 
>> in_postcopy,
>> -bool iterable_only)
>> +int qemu_savevm_state_complete_precopy_iterable(QEMUFile *f, bool 
>> in_postcopy)
>>  {
>>  SaveStateEntry *se;
>>  int ret;
>> @@ -1257,7 +1256,6 @@ int 
>> qemu_savevm_state_complete_precopy_iterable(QEMUFile *f, bool in_postcopy,
>>  if (!se->ops ||
>>  (in_postcopy && se->ops->has_postcopy &&
>>   se->ops->has_postcopy(se->opaque)) ||
>> -(in_postcopy && !iterable_only) ||
>>  !se->ops->save_live_complete_precopy) {
>>  continue;
>>  }
>> @@ -1369,10 +1367,11 @@ int qemu_savevm_state_complete_precopy(QEMUFile *f, 
>> bool iterable_only,
>>  
>>  cpu_synchronize_all_states();
>>  
>> -ret = qemu_savevm_state_complete_precopy_iterable(f, in_postcopy,
>> -  iterable_only);
>> -if (ret) {
>> -return ret;
>> +if (!in_postcopy || iterable_only) {
>> +ret = qemu_savevm_state_complete_precopy_iterable(f, in_postcopy);
>> +if (ret) {
>> +return ret;
>> +}
>>  }
>>  
>>  if (iterable_only) {
>> -- 
>> 2.17.1
>> 
>--
>Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

-- 
Wei Yang
Help you, Help me



Re: [Qemu-devel] [PATCH 1/3] migration/savevm: flush file for iterable_only case

2019-07-19 Thread Wei Yang
On Fri, Jul 19, 2019 at 05:47:59PM +0100, Dr. David Alan Gilbert wrote:
>* Wei Yang (richardw.y...@linux.intel.com) wrote:
>> It would be proper to flush file even for iterable_only case.
>> 
>> Signed-off-by: Wei Yang 
>
>OK, I don't think this is actually necessary; but it's safe.
>We only really need the flush at the end of the file, and in the
>non-iterable-only case it's not the end of the file.
>
>Since it makes your next change simpler,

Yep, you are so kind.

>
>Reviewed-by: Dr. David Alan Gilbert 
>
>> ---
>>  migration/savevm.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>> 
>> diff --git a/migration/savevm.c b/migration/savevm.c
>> index c0e557b4c2..becedcc1c6 100644
>> --- a/migration/savevm.c
>> +++ b/migration/savevm.c
>> @@ -1292,7 +1292,7 @@ int qemu_savevm_state_complete_precopy(QEMUFile *f, 
>> bool iterable_only,
>>  }
>>  
>>  if (iterable_only) {
>> -return 0;
>> +goto flush;
>>  }
>>  
>>  vmdesc = qjson_new();
>> @@ -1353,6 +1353,7 @@ int qemu_savevm_state_complete_precopy(QEMUFile *f, 
>> bool iterable_only,
>>  }
>>  qjson_destroy(vmdesc);
>>  
>> +flush:
>>  qemu_fflush(f);
>>  return 0;
>>  }
>> -- 
>> 2.17.1
>> 
>--
>Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

-- 
Wei Yang
Help you, Help me



Re: [Qemu-devel] [PATCH] migration: current_migration is never NULL

2019-07-19 Thread Wei Yang
On Fri, Jul 19, 2019 at 04:10:02PM +0100, Dr. David Alan Gilbert wrote:
>* Wei Yang (richardw.y...@linux.intel.com) wrote:
>> migration_object_init() create and assign current_migration, which means
>> it will never be null until migration_shutdown().
>> 
>> Signed-off-by: Wei Yang 
>> ---
>>  migration/migration.c | 4 
>>  1 file changed, 4 deletions(-)
>> 
>> diff --git a/migration/migration.c b/migration/migration.c
>> index 0fd2364961..43fd8297ef 100644
>> --- a/migration/migration.c
>> +++ b/migration/migration.c
>> @@ -1667,10 +1667,6 @@ bool migration_is_idle(void)
>>  {
>>  MigrationState *s = current_migration;
>>  
>> -if (!s) {
>> -return true;
>> -}
>> -
>
>I'd prefer to keep that because it's used by migrate_add_blocker
>and without this check it means we'd only be able to add a blocker
>after the migration object init - which is probably fine but we
>would have to check all the cases and make sure no one breaks it in
>the future;  where as this check makes it just work and we don't
>need to worry about the order.
>

Reasonable, Thanks :-)

>Dave
>
>>  switch (s->state) {
>>  case MIGRATION_STATUS_NONE:
>>  case MIGRATION_STATUS_CANCELLED:
>> -- 
>> 2.19.1
>> 
>--
>Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

-- 
Wei Yang
Help you, Help me



Re: [Qemu-devel] [PATCH] migration: remove unused field bytes_xfer

2019-07-19 Thread Wei Yang
On Fri, Jul 19, 2019 at 07:05:44PM +0100, Dr. David Alan Gilbert wrote:
>* Wei Yang (richardw.y...@linux.intel.com) wrote:
>> On Tue, Jun 11, 2019 at 10:33:29AM +0200, Juan Quintela wrote:
>> >Wei Yang  wrote:
>> >> On Tue, Apr 02, 2019 at 08:31:06AM +0800, Wei Yang wrote:
>> >>>MigrationState->bytes_xfer is only set to 0 in migrate_init().
>> >>>
>> >>>Remove this unnecessary field.
>> >>>
>> >>>Signed-off-by: Wei Yang 
>> >>
>> >> Hi, David
>> >
>> >Hi
>> >
>> >I am on duty this week, will get it.
>> >
>> 
>> Hi, Juan
>> 
>> Sounds we lost this one and the one above this :-)
>> 
>
>We're in freeze at the moment; we'll pick it up as soon
>as it opens up again; I've got quite a list of your clean-up patches!
>

Thanks for your reply.

Have a good weekend. :-)

>Dave
>
>> -- 
>> Wei Yang
>> Help you, Help me
>--
>Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

-- 
Wei Yang
Help you, Help me



[Qemu-devel] [PATCH qemu RFC 2/4] spapr: Allow bios-less configuration

2019-07-19 Thread Alexey Kardashevskiy
The pseries kernel can do either usual prom-init boot or kexec style boot.
We always did the prom-init which relies on the completeness of
the device tree (for example, PCI BARs have to be assigned beforehand) and
the client interface; the system firmware (SLOF) implements this.

However we can use the kexec style boot as well. To do that, we can skip
loading SLOF and jump straight to the kernel. GPR5==0 (the client
interface entry point, SLOF passes a valid pointer there) tells Linux to
do the kexec boot rather than prom_init so it can proceed to the initramfs.
With few PCI fixes in the guest kernel, it can boot from PCI (via
petitboot, for example).

This adds a "bios" machine option which controls whether QEMU loads SLOF
or jumps directly to the kernel. When bios==off, this does not copy SLOF
and RTAS into the guest RAM and sets RTAS properties to 0 to bypass
the kexec user space tool which checks for their presence (not for
the values though).

Signed-off-by: Alexey Kardashevskiy 
---
 include/hw/ppc/spapr.h |  1 +
 hw/ppc/spapr.c | 58 --
 2 files changed, 45 insertions(+), 14 deletions(-)

diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index ff82bb8554e1..7f5d7a70d27e 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -160,6 +160,7 @@ struct SpaprMachineState {
 long kernel_size;
 bool kernel_le;
 uint64_t kernel_addr;
+bool bios_enabled;
 uint32_t initrd_base;
 long initrd_size;
 uint64_t rtc_offset; /* Now used only during incoming migration */
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 6d13d65d8996..81ad6a6f28de 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1116,6 +1116,10 @@ static void spapr_dt_rtas(SpaprMachineState *spapr, void 
*fdt)
 _FDT(fdt_setprop(fdt, rtas, "ibm,lrdr-capacity",
  lrdr_capacity, sizeof(lrdr_capacity)));
 
+/* These are to make kexec-lite happy */
+_FDT(fdt_setprop_cell(fdt, rtas, "linux,rtas-base", 0));
+_FDT(fdt_setprop_cell(fdt, rtas, "rtas-size", 0));
+
 spapr_dt_rtas_tokens(fdt, rtas);
 }
 
@@ -1814,7 +1818,11 @@ static void spapr_machine_reset(MachineState *machine)
 spapr->fdt_blob = fdt;
 
 /* Set up the entry state */
-spapr_cpu_set_entry_state(first_ppc_cpu, SPAPR_ENTRY_POINT, fdt_addr);
+if (!spapr->bios_enabled) {
+spapr_cpu_set_entry_state(first_ppc_cpu, spapr->kernel_addr, fdt_addr);
+} else {
+spapr_cpu_set_entry_state(first_ppc_cpu, SPAPR_ENTRY_POINT, fdt_addr);
+}
 first_ppc_cpu->env.gpr[5] = 0;
 
 spapr->cas_reboot = false;
@@ -3031,20 +3039,22 @@ static void spapr_machine_init(MachineState *machine)
 }
 }
 
-if (bios_name == NULL) {
-bios_name = FW_FILE_NAME;
+if (spapr->bios_enabled) {
+if (bios_name == NULL) {
+bios_name = FW_FILE_NAME;
+}
+filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, bios_name);
+if (!filename) {
+error_report("Could not find LPAR firmware '%s'", bios_name);
+exit(1);
+}
+fw_size = load_image_targphys(filename, 0, FW_MAX_SIZE);
+if (fw_size <= 0) {
+error_report("Could not load LPAR firmware '%s'", filename);
+exit(1);
+}
+g_free(filename);
 }
-filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, bios_name);
-if (!filename) {
-error_report("Could not find LPAR firmware '%s'", bios_name);
-exit(1);
-}
-fw_size = load_image_targphys(filename, 0, FW_MAX_SIZE);
-if (fw_size <= 0) {
-error_report("Could not load LPAR firmware '%s'", filename);
-exit(1);
-}
-g_free(filename);
 
 /* FIXME: Should register things through the MachineState's qdev
  * interface, this is a legacy from the sPAPREnvironment structure
@@ -3266,6 +3276,20 @@ static void spapr_set_kernel_addr(Object *obj, Visitor 
*v, const char *name,
 visit_type_uint64(v, name, (uint64_t *)opaque, errp);
 }
 
+static bool spapr_get_bios_enabled(Object *obj, Error **errp)
+{
+SpaprMachineState *spapr = SPAPR_MACHINE(obj);
+
+return spapr->bios_enabled;
+}
+
+static void spapr_set_bios_enabled(Object *obj, bool value, Error **errp)
+{
+SpaprMachineState *spapr = SPAPR_MACHINE(obj);
+
+spapr->bios_enabled = value;
+}
+
 static char *spapr_get_ic_mode(Object *obj, Error **errp)
 {
 SpaprMachineState *spapr = SPAPR_MACHINE(obj);
@@ -3379,6 +3403,12 @@ static void spapr_instance_init(Object *obj)
 " for -kernel is the default",
 NULL);
 spapr->kernel_addr = KERNEL_LOAD_ADDR;
+object_property_add_bool(obj, "bios", spapr_get_bios_enabled,
+spapr_set_bios_enabled, NULL);
+object_property_set_description(obj, "bios", "Conrols whether to load 
bios",
+NULL);
+spapr->bios_enabled = true;
+
   

[Qemu-devel] [PATCH qemu RFC 4/4] spapr: Implement SLOF-less client_architecture_support

2019-07-19 Thread Alexey Kardashevskiy
QEMU already implements H_CAS called by SLOF. The existing handler
prepares a diff FDT and SLOF applies it on top of its current tree.
In SLOF-less setup when the user explicitly selected "bios=no",
this updates the FDT from the OS, updates it and writes back to the OS.
The new behavior is advertised to the OS via "/chosen/qemu,h_cas".

Signed-off-by: Alexey Kardashevskiy 
---
 include/hw/ppc/spapr.h |  5 +
 hw/ppc/spapr.c | 24 -
 hw/ppc/spapr_hcall.c   | 49 +++---
 3 files changed, 70 insertions(+), 8 deletions(-)

diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 7f5d7a70d27e..73cd9cf25b83 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -766,9 +766,14 @@ struct SpaprEventLogEntry {
 
 void spapr_events_init(SpaprMachineState *sm);
 void spapr_dt_events(SpaprMachineState *sm, void *fdt);
+int spapr_h_cas_do_compose_response(SpaprMachineState *spapr, void *fdt,
+SpaprOptionVector *ov5_updates);
 int spapr_h_cas_compose_response(SpaprMachineState *sm,
  target_ulong addr, target_ulong size,
  SpaprOptionVector *ov5_updates);
+#define FDT_MAX_SIZE 0x10
+void *spapr_build_fdt(SpaprMachineState *spapr);
+
 void close_htab_fd(SpaprMachineState *spapr);
 void spapr_setup_hpt_and_vrma(SpaprMachineState *spapr);
 void spapr_free_hpt(SpaprMachineState *spapr);
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index b097a99951f1..f84895f4a8b4 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -978,6 +978,19 @@ static bool spapr_hotplugged_dev_before_cas(void)
 return false;
 }
 
+int spapr_h_cas_do_compose_response(SpaprMachineState *spapr, void *fdt,
+SpaprOptionVector *ov5_updates)
+{
+/* Fixup cpu nodes */
+_FDT((spapr_fixup_cpu_dt(fdt, spapr)));
+
+if (spapr_dt_cas_updates(spapr, fdt, ov5_updates)) {
+return -1;
+}
+
+return 0;
+}
+
 int spapr_h_cas_compose_response(SpaprMachineState *spapr,
  target_ulong addr, target_ulong size,
  SpaprOptionVector *ov5_updates)
@@ -1009,10 +1022,7 @@ int spapr_h_cas_compose_response(SpaprMachineState 
*spapr,
 _FDT((fdt_open_into(fdt_skel, fdt, size)));
 g_free(fdt_skel);
 
-/* Fixup cpu nodes */
-_FDT((spapr_fixup_cpu_dt(fdt, spapr)));
-
-if (spapr_dt_cas_updates(spapr, fdt, ov5_updates)) {
+if (spapr_h_cas_do_compose_response(spapr, fdt, ov5_updates)) {
 return -1;
 }
 
@@ -1232,6 +1242,10 @@ static void spapr_dt_chosen(SpaprMachineState *spapr, 
void *fdt)
 
 /* We always implemented RTAS as hcall, tell guests to call it directly */
 _FDT(fdt_setprop_cell(fdt, chosen, "qemu,h_rtas", 1));
+/* Tell the guest that H_CAS will return the entire FDT now, not the diff 
*/
+if (!spapr->bios_enabled) {
+_FDT(fdt_setprop_cell(fdt, chosen, "qemu,h_cas", 1));
+}
 
 spapr_dt_ov5_platform_support(spapr, fdt, chosen);
 
@@ -1262,7 +1276,7 @@ static void spapr_dt_hypervisor(SpaprMachineState *spapr, 
void *fdt)
 }
 }
 
-static void *spapr_build_fdt(SpaprMachineState *spapr)
+void *spapr_build_fdt(SpaprMachineState *spapr)
 {
 MachineState *machine = MACHINE(spapr);
 MachineClass *mc = MACHINE_GET_CLASS(machine);
diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
index b964d94f330b..c5cb06c9d507 100644
--- a/hw/ppc/spapr_hcall.c
+++ b/hw/ppc/spapr_hcall.c
@@ -17,6 +17,7 @@
 #include "hw/ppc/spapr_ovec.h"
 #include "mmu-book3s-v3.h"
 #include "hw/mem/memory-device.h"
+#include "sysemu/device_tree.h"
 
 static bool has_spr(PowerPCCPU *cpu, int spr)
 {
@@ -1774,9 +1775,51 @@ static target_ulong 
h_client_architecture_support(PowerPCCPU *cpu,
 /* legacy hash or new hash: */
 spapr_setup_hpt_and_vrma(spapr);
 }
-spapr->cas_reboot =
-(spapr_h_cas_compose_response(spapr, args[1], args[2],
-  ov5_updates) != 0);
+
+if (spapr->bios_enabled) {
+spapr->cas_reboot =
+(spapr_h_cas_compose_response(spapr, args[1], args[2],
+  ov5_updates) != 0);
+} else {
+int size;
+void *fdt, *fdt_skel;
+struct fdt_header hdr = { 0 };
+
+cpu_physical_memory_read(args[1], , sizeof(hdr));
+size = fdt32_to_cpu(hdr.totalsize);
+if (size > FDT_MAX_SIZE) {
+return H_NOT_AVAILABLE;
+}
+
+fdt_skel = g_malloc0(size);
+cpu_physical_memory_read(args[1], fdt_skel, size);
+
+fdt = g_malloc0(FDT_MAX_SIZE);
+fdt_open_into(fdt_skel, fdt, FDT_MAX_SIZE);
+g_free(fdt_skel);
+
+if (spapr_h_cas_do_compose_response(spapr, fdt, ov5_updates)) {
+g_free(fdt);
+  

[Qemu-devel] [PATCH qemu RFC 1/4] spapr: Allow changing kernel loading address

2019-07-19 Thread Alexey Kardashevskiy
Useful for the debugging purposes.

Signed-off-by: Alexey Kardashevskiy 
---
 include/hw/ppc/spapr.h |  1 +
 hw/ppc/spapr.c | 33 +++--
 2 files changed, 28 insertions(+), 6 deletions(-)

diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 74e427b588fc..ff82bb8554e1 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -159,6 +159,7 @@ struct SpaprMachineState {
 void *fdt_blob;
 long kernel_size;
 bool kernel_le;
+uint64_t kernel_addr;
 uint32_t initrd_base;
 long initrd_size;
 uint64_t rtc_offset; /* Now used only during incoming migration */
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 7fad42350538..6d13d65d8996 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1179,7 +1179,7 @@ static void spapr_dt_chosen(SpaprMachineState *spapr, 
void *fdt)
   spapr->initrd_base + spapr->initrd_size));
 
 if (spapr->kernel_size) {
-uint64_t kprop[2] = { cpu_to_be64(KERNEL_LOAD_ADDR),
+uint64_t kprop[2] = { cpu_to_be64(spapr->kernel_addr),
   cpu_to_be64(spapr->kernel_size) };
 
 _FDT(fdt_setprop(fdt, chosen, "qemu,boot-kernel",
@@ -1365,7 +1365,7 @@ static void *spapr_build_fdt(SpaprMachineState *spapr)
 
 /* Build memory reserve map */
 if (spapr->kernel_size) {
-_FDT((fdt_add_mem_rsv(fdt, KERNEL_LOAD_ADDR, spapr->kernel_size)));
+_FDT((fdt_add_mem_rsv(fdt, spapr->kernel_addr, spapr->kernel_size)));
 }
 if (spapr->initrd_size) {
 _FDT((fdt_add_mem_rsv(fdt, spapr->initrd_base, spapr->initrd_size)));
@@ -1391,7 +1391,8 @@ static void *spapr_build_fdt(SpaprMachineState *spapr)
 
 static uint64_t translate_kernel_address(void *opaque, uint64_t addr)
 {
-return (addr & 0x0fff) + KERNEL_LOAD_ADDR;
+SpaprMachineState *spapr = opaque;
+return (addr & 0x0fff) + spapr->kernel_addr;
 }
 
 static void emulate_spapr_hypercall(PPCVirtualHypervisor *vhyp,
@@ -2995,12 +2996,12 @@ static void spapr_machine_init(MachineState *machine)
 uint64_t lowaddr = 0;
 
 spapr->kernel_size = load_elf(kernel_filename, NULL,
-  translate_kernel_address, NULL,
+  translate_kernel_address, spapr,
   NULL, , NULL, 1,
   PPC_ELF_MACHINE, 0, 0);
 if (spapr->kernel_size == ELF_LOAD_WRONG_ENDIAN) {
 spapr->kernel_size = load_elf(kernel_filename, NULL,
-  translate_kernel_address, NULL, NULL,
+  translate_kernel_address, spapr, 
NULL,
   , NULL, 0, PPC_ELF_MACHINE,
   0, 0);
 spapr->kernel_le = spapr->kernel_size > 0;
@@ -3016,7 +3017,7 @@ static void spapr_machine_init(MachineState *machine)
 /* Try to locate the initrd in the gap between the kernel
  * and the firmware. Add a bit of space just in case
  */
-spapr->initrd_base = (KERNEL_LOAD_ADDR + spapr->kernel_size
+spapr->initrd_base = (spapr->kernel_addr + spapr->kernel_size
   + 0x1) & ~0x;
 spapr->initrd_size = load_image_targphys(initrd_filename,
  spapr->initrd_base,
@@ -3253,6 +3254,18 @@ static void spapr_set_vsmt(Object *obj, Visitor *v, 
const char *name,
 visit_type_uint32(v, name, (uint32_t *)opaque, errp);
 }
 
+static void spapr_get_kernel_addr(Object *obj, Visitor *v, const char *name,
+  void *opaque, Error **errp)
+{
+visit_type_uint64(v, name, (uint64_t *)opaque, errp);
+}
+
+static void spapr_set_kernel_addr(Object *obj, Visitor *v, const char *name,
+  void *opaque, Error **errp)
+{
+visit_type_uint64(v, name, (uint64_t *)opaque, errp);
+}
+
 static char *spapr_get_ic_mode(Object *obj, Error **errp)
 {
 SpaprMachineState *spapr = SPAPR_MACHINE(obj);
@@ -3358,6 +3371,14 @@ static void spapr_instance_init(Object *obj)
 object_property_add_bool(obj, "vfio-no-msix-emulation",
  spapr_get_msix_emulation, NULL, NULL);
 
+object_property_add(obj, "kernel-addr", "uint64", spapr_get_kernel_addr,
+spapr_set_kernel_addr, NULL, >kernel_addr,
+_abort);
+object_property_set_description(obj, "kernel-addr",
+stringify(KERNEL_LOAD_ADDR)
+" for -kernel is the default",
+NULL);
+spapr->kernel_addr = KERNEL_LOAD_ADDR;
 /* The machine class defines the default interrupt controller mode */
 spapr->irq = smc->irq;
 object_property_add_str(obj, "ic-mode", spapr_get_ic_mode,

[Qemu-devel] [PATCH qemu RFC 0/4] spapr: Kexec style boot

2019-07-19 Thread Alexey Kardashevskiy
This is an attempts to boot a pseries guest without a firmware.

The idea is to boot a VM with petitboot as a more powerful boot loader
and eliminate scanning phase of the SLOF booting process.

This provides environment without SLOF but with the device tree
and few modifications to already existing H_CAS and H_RTAS hypeprcalls.

This is made on top of these 2 patches:
spapr: Stop providing RTAS blob
spapr_pci: Advertise BAR reallocation capability

This requires a modified pseries kernel, posted separately
as "powerpc/pseries: Kexec style boot".

This also requires patched kexec-lite from the openpower build as
currently it requires linux,rtas-base and rtas-size properties in
the DT which we won't have in such environment.

1/1 is not necessary but having vmlinux at 0 helps debugging via
the qemu gdb stub easier.

The example command line is:
/home/aik/pbuild/qemu-killslof-localhost-ppc64/ppc64-softmmu/qemu-system-ppc64 \
-nodefaults \
-chardev stdio,id=STDIO0,signal=off,mux=on \
-device spapr-vty,id=svty0,reg=0x71000110,chardev=STDIO0 \
-mon id=MON0,chardev=STDIO0,mode=readline -nographic -vga none \
-enable-kvm \
-device nec-usb-xhci,id=nec-usb-xhci0 -m 4G \
-kernel /home/aik/pbuild/kernel-guest-nv2-le/vmlinux \
-initrd op-build/output/images/rootfs.cpio \
-machine pseries,bios=no,cap-cfpc=broken,cap-sbbc=broken,cap-ibs=broken \
-snapshot \
-netdev "tap,id=TAP0,helper=/home/aik/qemu-bridge-helper --br=br0" \
-device \
"virtio-net-pci,id=vnet0,bus=pci.0,addr=1.0,mac=C0:41:49:4b:00:02,netdev=TAP0" \
-device virtio-scsi-pci,id=vscsi0 \
-drive \
id=DRIVE0,if=none,file=pbuild/__img/u1804-64le-killslof.qcow2,format=qcow2 \
-device scsi-disk,id=scsi-disk0,drive=DRIVE0 \
-smp 8,threads=8 \
-L /home/aik/t/qemu-ppc64-bios/ \
-trace events=qemu_trace_events -d guest_errors \
-chardev socket,id=SOCKET0,server,nowait,path=qemu.mon.mac02 \
-mon chardev=SOCKET0,mode=control



This is based on sha1
216965b20b04 Joel Stanley "ppc/pnv: update skiboot to v6.4".

Please comment. Thanks.



Alexey Kardashevskiy (4):
  spapr: Allow changing kernel loading address
  spapr: Allow bios-less configuration
  spapr: Advertise H_RTAS to the guest
  spapr: Implement SLOF-less client_architecture_support

 include/hw/ppc/spapr.h |   7 +++
 hw/ppc/spapr.c | 118 -
 hw/ppc/spapr_hcall.c   |  49 +++--
 3 files changed, 146 insertions(+), 28 deletions(-)

-- 
2.17.1




[Qemu-devel] [PATCH qemu RFC 3/4] spapr: Advertise H_RTAS to the guest

2019-07-19 Thread Alexey Kardashevskiy
Since day 1 QEMU implemented RTAS as a custom hypercall wrapped into
a small 20 bytes blob which guest would call to enter RTAS. Although
it works fine, it is still a separate binary image which requires signing
at no additional benefit.

This adds a flag into /chosen to tell a modified guest that if the flag
is there, it can call H_RTAS directly and avoid calling into the RTAS
blob.

Signed-off-by: Alexey Kardashevskiy 
---
 hw/ppc/spapr.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 81ad6a6f28de..b097a99951f1 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1230,6 +1230,9 @@ static void spapr_dt_chosen(SpaprMachineState *spapr, 
void *fdt)
 _FDT(fdt_setprop_cell(fdt, chosen, "linux,pci-probe-only", 0));
 }
 
+/* We always implemented RTAS as hcall, tell guests to call it directly */
+_FDT(fdt_setprop_cell(fdt, chosen, "qemu,h_rtas", 1));
+
 spapr_dt_ov5_platform_support(spapr, fdt, chosen);
 
 g_free(stdout_path);
-- 
2.17.1




[Qemu-devel] [PATCH v4 5/7] log: adding -d tb_stats to control tbstats

2019-07-19 Thread vandersonmr
Adding -d tb_stats:[limit:[all|jit|exec]] to control TBStatistics
collection. "limit" is used to limit the number of TBStats in the
linux-user dump. [all|jit|exec] control the profilling level used
by the TBStats: all, only jit stats or only execution count stats.

Signed-off-by: Vanderson M. do Rosario 
---
 include/qemu/log.h |  1 +
 util/log.c | 34 ++
 2 files changed, 35 insertions(+)

diff --git a/include/qemu/log.h b/include/qemu/log.h
index 8cdfc268ca..93642603e5 100644
--- a/include/qemu/log.h
+++ b/include/qemu/log.h
@@ -45,6 +45,7 @@ static inline bool qemu_log_separate(void)
 /* LOG_TRACE (1 << 15) is defined in log-for-trace.h */
 #define CPU_LOG_TB_OP_IND  (1 << 16)
 #define CPU_LOG_TB_FPU (1 << 17)
+#define CPU_LOG_TB_STATS   (1 << 18)
 
 /* Lock output for a series of related logs.  Since this is not needed
  * for a single qemu_log / qemu_log_mask / qemu_log_mask_and_addr, we
diff --git a/util/log.c b/util/log.c
index f81653d712..8109d19afb 100644
--- a/util/log.c
+++ b/util/log.c
@@ -30,6 +30,7 @@ FILE *qemu_logfile;
 int qemu_loglevel;
 static int log_append = 0;
 static GArray *debug_regions;
+int32_t max_num_hot_tbs_to_dump;
 
 /* Return the number of characters emitted.  */
 int qemu_log(const char *fmt, ...)
@@ -273,6 +274,8 @@ const QEMULogItem qemu_log_items[] = {
 { CPU_LOG_TB_NOCHAIN, "nochain",
   "do not chain compiled TBs so that \"exec\" and \"cpu\" show\n"
   "complete traces" },
+{ CPU_LOG_TB_STATS, "tb_stats[:limit:(all,jit,exec)]",
+  "show TBs statistics (until given a limit) ordered by their hotness.\n" 
},
 { 0, NULL, NULL },
 };
 
@@ -294,6 +297,37 @@ int qemu_str_to_log_mask(const char *str)
 trace_enable_events((*tmp) + 6);
 mask |= LOG_TRACE;
 #endif
+} else if (g_str_has_prefix(*tmp, "tb_stats")) {
+if (g_str_has_prefix(*tmp, "tb_stats:") && (*tmp)[9] != '\0') {
+
+if (!g_ascii_isdigit(*((*tmp) + 9))) {
+fprintf(stderr,
+"should be a number follow by [all|jit|exec], as 
tb_stats:10:all\n");
+exit(1);
+}
+/* get limit */
+max_num_hot_tbs_to_dump = atoi((*tmp) + 9);
+
+/* get profilling level */
+char *s = (*tmp) + 9;
+while (*s != '\0') {
+if (*s++ == ':') {
+break;
+}
+}
+if (g_str_equal(s, "jit") == 0) {
+set_default_tbstats_flag(TB_JIT_STATS);
+} else if (g_str_equal(s, "exec") == 0) {
+set_default_tbstats_flag(TB_EXEC_STATS);
+} else {
+set_default_tbstats_flag(TB_JIT_STATS | TB_EXEC_STATS);
+}
+} else {
+set_default_tbstats_flag(TB_JIT_STATS | TB_EXEC_STATS);
+}
+
+mask |= CPU_LOG_TB_STATS;
+enable_collect_tb_stats();
 } else {
 for (item = qemu_log_items; item->mask != 0; item++) {
 if (g_str_equal(*tmp, item->name)) {
-- 
2.22.0




[Qemu-devel] [PATCH v4 2/7] accel: collecting TB execution count

2019-07-19 Thread vandersonmr
If a TB has a TBS (TBStatistics) with the TB_EXEC_STATS
enabled, then we instrument the start code of the TB
to atomically count the number of times it is executed.
The execution count of the TB is stored in its respective
TBS.

Signed-off-by: Vanderson M. do Rosario 
---
 accel/tcg/tcg-runtime.c   |  7 +++
 accel/tcg/tcg-runtime.h   |  2 ++
 accel/tcg/translate-all.c |  8 
 accel/tcg/translator.c|  1 +
 include/exec/gen-icount.h |  9 +
 include/exec/tb-stats.h   | 11 +++
 include/qemu/log.h|  6 ++
 util/log.c| 11 +++
 8 files changed, 55 insertions(+)

diff --git a/accel/tcg/tcg-runtime.c b/accel/tcg/tcg-runtime.c
index 8a1e408e31..f332eae334 100644
--- a/accel/tcg/tcg-runtime.c
+++ b/accel/tcg/tcg-runtime.c
@@ -167,3 +167,10 @@ void HELPER(exit_atomic)(CPUArchState *env)
 {
 cpu_loop_exit_atomic(env_cpu(env), GETPC());
 }
+
+void HELPER(inc_exec_freq)(void *ptr)
+{
+TBStatistics *stats = (TBStatistics *) ptr;
+g_assert(stats);
+atomic_inc(>executions.total);
+}
diff --git a/accel/tcg/tcg-runtime.h b/accel/tcg/tcg-runtime.h
index 4fa61b49b4..bf0b75dbe8 100644
--- a/accel/tcg/tcg-runtime.h
+++ b/accel/tcg/tcg-runtime.h
@@ -28,6 +28,8 @@ DEF_HELPER_FLAGS_1(lookup_tb_ptr, TCG_CALL_NO_WG_SE, ptr, env)
 
 DEF_HELPER_FLAGS_1(exit_atomic, TCG_CALL_NO_WG, noreturn, env)
 
+DEF_HELPER_FLAGS_1(inc_exec_freq, TCG_CALL_NO_RWG, void, ptr)
+
 #ifdef CONFIG_SOFTMMU
 
 DEF_HELPER_FLAGS_5(atomic_cmpxchgb, TCG_CALL_NO_WG,
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index a574890a80..7497dae508 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -1785,6 +1785,14 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
  */
 if (tb_stats_collection_enabled()) {
 tb->tb_stats = tb_get_stats(phys_pc, pc, cs_base, flags);
+uint32_t flag = get_default_tbstats_flag();
+
+if (qemu_log_in_addr_range(tb->pc)) {
+if (flag & TB_EXEC_STATS) {
+tb->tb_stats->stats_enabled |= TB_EXEC_STATS;
+}
+}
+
 } else {
 tb->tb_stats = NULL;
 }
diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index 9226a348a3..396a11e828 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -46,6 +46,7 @@ void translator_loop(const TranslatorOps *ops, 
DisasContextBase *db,
 
 ops->init_disas_context(db, cpu);
 tcg_debug_assert(db->is_jmp == DISAS_NEXT);  /* no early exit */
+gen_tb_exec_count(tb);
 
 /* Reset the temp count so that we can identify leaks */
 tcg_clear_temp_count();
diff --git a/include/exec/gen-icount.h b/include/exec/gen-icount.h
index f7669b6841..b3efe41894 100644
--- a/include/exec/gen-icount.h
+++ b/include/exec/gen-icount.h
@@ -7,6 +7,15 @@
 
 static TCGOp *icount_start_insn;
 
+static inline void gen_tb_exec_count(TranslationBlock *tb)
+{
+if (tb_stats_enabled(tb, TB_EXEC_STATS)) {
+TCGv_ptr ptr = tcg_const_ptr(tb->tb_stats);
+gen_helper_inc_exec_freq(ptr);
+tcg_temp_free_ptr(ptr);
+}
+}
+
 static inline void gen_tb_start(TranslationBlock *tb)
 {
 TCGv_i32 count, imm;
diff --git a/include/exec/tb-stats.h b/include/exec/tb-stats.h
index 0913155ec3..ee1e8de0d3 100644
--- a/include/exec/tb-stats.h
+++ b/include/exec/tb-stats.h
@@ -6,6 +6,9 @@
 #include "exec/tb-context.h"
 #include "tcg.h"
 
+#define tb_stats_enabled(tb, JIT_STATS) \
+(tb && tb->tb_stats && (tb->tb_stats->stats_enabled & JIT_STATS))
+
 typedef struct TBStatistics TBStatistics;
 
 /*
@@ -22,6 +25,14 @@ struct TBStatistics {
 uint32_t flags;
 /* cs_base isn't included in the hash but we do check for matches */
 target_ulong cs_base;
+
+uint32_t stats_enabled;
+
+/* Execution stats */
+struct {
+unsigned long total;
+unsigned long atomic;
+} executions;
 };
 
 bool tb_stats_cmp(const void *ap, const void *bp);
diff --git a/include/qemu/log.h b/include/qemu/log.h
index e175d4d5d0..b213411836 100644
--- a/include/qemu/log.h
+++ b/include/qemu/log.h
@@ -129,10 +129,16 @@ void qemu_log_flush(void);
 /* Close the log file */
 void qemu_log_close(void);
 
+#define TB_NOTHING0
+#define TB_EXEC_STATS (1 << 1)
+
 void enable_collect_tb_stats(void);
 void disable_collect_tb_stats(void);
 void pause_collect_tb_stats(void);
 bool tb_stats_collection_enabled(void);
 bool tb_stats_collection_paused(void);
 
+void set_default_tbstats_flag(uint32_t flag);
+uint32_t get_default_tbstats_flag(void);
+
 #endif
diff --git a/util/log.c b/util/log.c
index ab73fdc100..f81653d712 100644
--- a/util/log.c
+++ b/util/log.c
@@ -354,3 +354,14 @@ bool tb_stats_collection_paused(void)
 return tcg_collect_tb_stats == 2;
 }
 
+uint32_t default_tbstats_flag;
+
+void set_default_tbstats_flag(uint32_t flag)
+{
+default_tbstats_flag = flag;
+}
+
+uint32_t get_default_tbstats_flag(void)
+{
+return default_tbstats_flag;
+}
-- 
2.22.0




[Qemu-devel] [PATCH v4 6/7] monitor: adding tb_stats hmp command

2019-07-19 Thread vandersonmr
Adding tb_stats [start|pause|stop|filter] command to hmp.
This allows controlling the collection of statistics.
It is also possible to set the level of collection:
all, jit, or exec.

The goal of this command is to allow the dynamic exploration
of the TCG behavior and quality. Therefore, for now, a
corresponding QMP command is not worthwhile.

Signed-off-by: Vanderson M. do Rosario 
---
 accel/tcg/tb-stats.c| 107 
 hmp-commands.hx |  17 +++
 include/exec/tb-stats.h |  13 +
 include/qemu/log.h  |   1 +
 monitor/misc.c  |  40 +++
 5 files changed, 178 insertions(+)

diff --git a/accel/tcg/tb-stats.c b/accel/tcg/tb-stats.c
index 44497d4f9b..6c330e1b02 100644
--- a/accel/tcg/tb-stats.c
+++ b/accel/tcg/tb-stats.c
@@ -6,6 +6,9 @@
 
 #include "qemu/qemu-print.h"
 
+/* only accessed in safe work */
+static GList *last_search;
+
 struct jit_profile_info {
 uint64_t translations;
 uint64_t aborted;
@@ -104,4 +107,108 @@ void dump_jit_profile_info(TCGProfile *s)
 }
 }
 
+static void dessaloc_tbstats(void *p, uint32_t hash, void *userp)
+{
+g_free(p);
+}
+
+void clean_tbstats(void)
+{
+/* remove all tb_stats */
+qht_iter(_ctx.tb_stats, dessaloc_tbstats, NULL);
+qht_destroy(_ctx.tb_stats);
+}
+
+static void reset_tbstats_flag(void *p, uint32_t hash, void *userp)
+{
+uint32_t flag = *((int *)userp);
+TBStatistics *tbs = p;
+tbs->stats_enabled = flag;
+}
+
+void set_tbstats_flags(uint32_t flag)
+{
+/* iterate over tbstats setting their flag as TB_NOTHING */
+qht_iter(_ctx.tb_stats, reset_tbstats_flag, );
+}
+
+void do_hmp_tbstats_safe(CPUState *cpu, run_on_cpu_data icmd)
+{
+struct TbstatsCommand *cmdinfo = icmd.host_ptr;
+int cmd = cmdinfo->cmd;
+uint32_t level = cmdinfo->level;
+
+/* for safe, always pause TB generation while running this commands */
+mmap_lock();
+
+switch (cmd) {
+case START:
+if (tb_stats_collection_paused()) {
+set_tbstats_flags(level);
+} else {
+if (tb_stats_collection_enabled()) {
+qemu_printf("TB information already being recorded");
+return;
+}
+qht_init(_ctx.tb_stats, tb_stats_cmp, CODE_GEN_HTABLE_SIZE,
+QHT_MODE_AUTO_RESIZE);
+}
+
+set_default_tbstats_flag(level);
+enable_collect_tb_stats();
+tb_flush(cpu);
+break;
+case PAUSE:
+if (!tb_stats_collection_enabled()) {
+qemu_printf("TB information not being recorded");
+return;
+}
+
+/* Continue to create TBStatistic structures but stop collecting 
statistics */
+pause_collect_tb_stats();
+tb_flush(cpu);
+set_default_tbstats_flag(TB_NOTHING);
+set_tbstats_flags(TB_PAUSED);
+break;
+case STOP:
+if (!tb_stats_collection_enabled()) {
+qemu_printf("TB information not being recorded");
+return;
+}
+
+/* Dissalloc all TBStatistics structures and stop creating new ones */
+disable_collect_tb_stats();
+tb_flush(cpu);
+clean_tbstats();
+break;
+case FILTER:
+if (!tb_stats_collection_enabled()) {
+qemu_printf("TB information not being recorded");
+return;
+}
+if (!last_search) {
+qemu_printf("no search on record! execute info tbs before 
filtering!");
+return;
+}
+
+tb_flush(cpu);
+set_default_tbstats_flag(TB_NOTHING);
+
+/* Set all tbstats as paused, then return only the ones from 
last_search */
+pause_collect_tb_stats();
+set_tbstats_flags(TB_PAUSED);
+
+for (GList *iter = last_search; iter; iter = g_list_next(iter)) {
+TBStatistics *tbs = iter->data;
+tbs->stats_enabled = level;
+}
+
+break;
+default: /* INVALID */
+break;
+}
+
+mmap_unlock();
+g_free(cmdinfo);
+}
 
diff --git a/hmp-commands.hx b/hmp-commands.hx
index bfa5681dd2..419898751e 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1885,6 +1885,23 @@ STEXI
 @findex qemu-io
 Executes a qemu-io command on the given block device.
 
+ETEXI
+#if defined(CONFIG_TCG)
+{
+.name   = "tb_stats",
+.args_type  = "command:s,level:s?",
+.params = "command [stats_level]",
+.help   = "Control tb statistics collection:"
+"tb_stats (start|pause|stop|filter) 
[all|jit_stats|exec_stats]",
+.cmd= hmp_tbstats,
+},
+#endif
+
+STEXI
+@item tb_stats
+@findex
+Control recording tb statistics
+
 ETEXI
 
 {
diff --git a/include/exec/tb-stats.h b/include/exec/tb-stats.h
index 7d775f2c0d..d1debd3262 100644
--- a/include/exec/tb-stats.h
+++ b/include/exec/tb-stats.h
@@ -6,6 +6,9 @@
 #include "exec/tb-context.h"
 #include "tcg.h"
 
+enum SortBy 

[Qemu-devel] [PATCH v4 4/7] accel: replacing part of CONFIG_PROFILER with TBStats

2019-07-19 Thread vandersonmr
We add some of the statistics collected in the TCGProfiler
into the TBStats, having the statistics not only for the whole
emulation but for each TB. Then, we removed these stats
from TCGProfiler and reconstruct the information for the
"info jit" using the sum of all TBStats statistics.

The goal is to have one unique and better way of collecting
emulation statistics. Moreover, checking dynamiclly if the
profiling is enabled showed to have an insignificant impact
on the performance:
https://wiki.qemu.org/Internships/ProjectIdeas/TCGCodeQuality#Overheads.

Signed-off-by: Vanderson M. do Rosario 
---
 accel/tcg/Makefile.objs   |   2 +-
 accel/tcg/tb-stats.c  | 107 ++
 accel/tcg/translate-all.c |  10 ++--
 include/exec/tb-stats.h   |   9 
 tcg/tcg.c |  93 -
 tcg/tcg.h |  10 
 6 files changed, 131 insertions(+), 100 deletions(-)
 create mode 100644 accel/tcg/tb-stats.c

diff --git a/accel/tcg/Makefile.objs b/accel/tcg/Makefile.objs
index d381a02f34..49ffe81b5d 100644
--- a/accel/tcg/Makefile.objs
+++ b/accel/tcg/Makefile.objs
@@ -2,7 +2,7 @@ obj-$(CONFIG_SOFTMMU) += tcg-all.o
 obj-$(CONFIG_SOFTMMU) += cputlb.o
 obj-y += tcg-runtime.o tcg-runtime-gvec.o
 obj-y += cpu-exec.o cpu-exec-common.o translate-all.o
-obj-y += translator.o
+obj-y += translator.o tb-stats.o
 
 obj-$(CONFIG_USER_ONLY) += user-exec.o
 obj-$(call lnot,$(CONFIG_SOFTMMU)) += user-exec-stub.o
diff --git a/accel/tcg/tb-stats.c b/accel/tcg/tb-stats.c
new file mode 100644
index 00..44497d4f9b
--- /dev/null
+++ b/accel/tcg/tb-stats.c
@@ -0,0 +1,107 @@
+#include "qemu/osdep.h"
+
+#include "disas/disas.h"
+#include "exec/exec-all.h"
+#include "tcg.h"
+
+#include "qemu/qemu-print.h"
+
+struct jit_profile_info {
+uint64_t translations;
+uint64_t aborted;
+uint64_t ops;
+unsigned ops_max;
+uint64_t del_ops;
+uint64_t temps;
+unsigned temps_max;
+uint64_t host;
+uint64_t guest;
+uint64_t host_ins;
+uint64_t search_data;
+};
+
+/* accumulate the statistics from all TBs */
+static void collect_jit_profile_info(void *p, uint32_t hash, void *userp)
+{
+struct jit_profile_info *jpi = userp;
+TBStatistics *tbs = p;
+
+jpi->translations += tbs->translations.total;
+jpi->ops += tbs->code.num_tcg_ops;
+if (tbs->translations.total && tbs->code.num_tcg_ops / 
tbs->translations.total
+> jpi->ops_max) {
+jpi->ops_max = tbs->code.num_tcg_ops / tbs->translations.total;
+}
+jpi->del_ops += tbs->code.deleted_ops;
+jpi->temps += tbs->code.temps;
+if (tbs->translations.total && tbs->code.temps / tbs->translations.total >
+jpi->temps_max) {
+jpi->temps_max = tbs->code.temps / tbs->translations.total;
+}
+jpi->host += tbs->code.out_len;
+jpi->guest += tbs->code.in_len;
+jpi->host_ins += tbs->code.num_host_inst;
+jpi->search_data += tbs->code.search_out_len;
+}
+
+/* dump JIT statisticis using TCGProfile and TBStats */
+void dump_jit_profile_info(TCGProfile *s)
+{
+if (!tb_stats_collection_enabled()) {
+return;
+}
+
+struct jit_profile_info *jpi = g_new0(struct jit_profile_info, 1);
+
+qht_iter(_ctx.tb_stats, collect_jit_profile_info, jpi);
+
+if (jpi->translations) {
+qemu_printf("translated TBs  %" PRId64 "\n", jpi->translations);
+qemu_printf("avg ops/TB  %0.1f max=%d\n",
+jpi->ops / (double) jpi->translations, jpi->ops_max);
+qemu_printf("deleted ops/TB  %0.2f\n",
+jpi->del_ops / (double) jpi->translations);
+qemu_printf("avg temps/TB%0.2f max=%d\n",
+jpi->temps / (double) jpi->translations, jpi->temps_max);
+qemu_printf("avg host code/TB%0.1f\n",
+jpi->host / (double) jpi->translations);
+qemu_printf("avg host ins/TB %0.1f\n",
+jpi->host_ins / (double) jpi->translations);
+qemu_printf("avg search data/TB  %0.1f\n",
+jpi->search_data / (double) jpi->translations);
+
+if (s) {
+int64_t tot = s->interm_time + s->code_time;
+qemu_printf("JIT cycles  %" PRId64 " (%0.3f s at 2.4 
GHz)\n",
+tot, tot / 2.4e9);
+qemu_printf("cycles/op   %0.1f\n",
+jpi->ops ? (double)tot / jpi->ops : 0);
+qemu_printf("cycles/in byte  %0.1f\n",
+jpi->guest ? (double)tot / jpi->guest : 0);
+qemu_printf("cycles/out byte %0.1f\n",
+jpi->host ? (double)tot / jpi->host : 0);
+qemu_printf("cycles/out inst %0.1f\n",
+jpi->host_ins ? (double)tot / jpi->host_ins : 0);
+qemu_printf("cycles/search byte %0.1f\n",
+jpi->search_data ? (double)tot / jpi->search_data : 0);
+   

[Qemu-devel] [PATCH v4 1/7] accel: introducing TBStatistics structure

2019-07-19 Thread vandersonmr
To store statistics for each TB we created a TBStatistics structure
which is linked with the TBs. The TBStatistics can stay alive after
tb_flush and be relinked to a regenerated TB. So the statistics can
be accumulated even through flushes.

TBStatistics will be also referred to as TBS or tbstats.

Signed-off-by: Vanderson M. do Rosario 
---
 accel/tcg/translate-all.c | 57 +++
 include/exec/exec-all.h   | 15 +++
 include/exec/tb-context.h | 12 +
 include/exec/tb-hash.h|  7 +
 include/exec/tb-stats.h   | 29 
 include/qemu/log.h|  6 +
 util/log.c| 28 +++
 7 files changed, 143 insertions(+), 11 deletions(-)
 create mode 100644 include/exec/tb-stats.h

diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 5d1e08b169..a574890a80 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -1118,6 +1118,23 @@ static inline void code_gen_alloc(size_t tb_size)
 }
 }
 
+/*
+ * This is the more or less the same compare, but the data persists
+ * over tb_flush. We also aggregate the various variations of cflags
+ * under one record and ignore the details of page overlap (although
+ * we can count it).
+ */
+bool tb_stats_cmp(const void *ap, const void *bp)
+{
+const TBStatistics *a = ap;
+const TBStatistics *b = bp;
+
+return a->phys_pc == b->phys_pc &&
+a->pc == b->pc &&
+a->cs_base == b->cs_base &&
+a->flags == b->flags;
+}
+
 static bool tb_cmp(const void *ap, const void *bp)
 {
 const TranslationBlock *a = ap;
@@ -1137,6 +1154,9 @@ static void tb_htable_init(void)
 unsigned int mode = QHT_MODE_AUTO_RESIZE;
 
 qht_init(_ctx.htable, tb_cmp, CODE_GEN_HTABLE_SIZE, mode);
+if (tb_stats_collection_enabled()) {
+qht_init(_ctx.tb_stats, tb_stats_cmp, CODE_GEN_HTABLE_SIZE, mode);
+}
 }
 
 /* Must be called before using the QEMU cpus. 'tb_size' is the size
@@ -1666,6 +1686,32 @@ tb_link_page(TranslationBlock *tb, tb_page_addr_t 
phys_pc,
 return tb;
 }
 
+static TBStatistics *tb_get_stats(tb_page_addr_t phys_pc, target_ulong pc,
+  target_ulong cs_base, uint32_t flags)
+{
+TBStatistics *new_stats = g_new0(TBStatistics, 1);
+uint32_t hash = tb_stats_hash_func(phys_pc, pc, flags);
+void *existing_stats = NULL;
+new_stats->phys_pc = phys_pc;
+new_stats->pc = pc;
+new_stats->cs_base = cs_base;
+new_stats->flags = flags;
+
+qht_insert(_ctx.tb_stats, new_stats, hash, _stats);
+
+if (unlikely(existing_stats)) {
+/*
+ * If there is already a TBStatistic for this TB from a previous flush
+ * then just make the new TB point to the older TBStatistic
+ */
+g_free(new_stats);
+return existing_stats;
+} else {
+return new_stats;
+}
+}
+
+
 /* Called with mmap_lock held for user mode emulation.  */
 TranslationBlock *tb_gen_code(CPUState *cpu,
   target_ulong pc, target_ulong cs_base,
@@ -1732,6 +1778,17 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
 ti = profile_getclock();
 #endif
 
+/*
+ * We want to fetch the stats structure before we start code
+ * generation so we can count interesting things about this
+ * generation.
+ */
+if (tb_stats_collection_enabled()) {
+tb->tb_stats = tb_get_stats(phys_pc, pc, cs_base, flags);
+} else {
+tb->tb_stats = NULL;
+}
+
 tcg_func_start(tcg_ctx);
 
 tcg_ctx->cpu = env_cpu(env);
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 16034ee651..24bd6a0a0c 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -22,21 +22,11 @@
 
 #include "exec/tb-context.h"
 #include "sysemu/cpus.h"
+#include "exec/tb-stats.h"
 
 /* allow to see translation results - the slowdown should be negligible, so we 
leave it */
 #define DEBUG_DISAS
 
-/* Page tracking code uses ram addresses in system mode, and virtual
-   addresses in userspace mode.  Define tb_page_addr_t to be an appropriate
-   type.  */
-#if defined(CONFIG_USER_ONLY)
-typedef abi_ulong tb_page_addr_t;
-#define TB_PAGE_ADDR_FMT TARGET_ABI_FMT_lx
-#else
-typedef ram_addr_t tb_page_addr_t;
-#define TB_PAGE_ADDR_FMT RAM_ADDR_FMT
-#endif
-
 #include "qemu/log.h"
 
 void gen_intermediate_code(CPUState *cpu, TranslationBlock *tb, int max_insns);
@@ -403,6 +393,9 @@ struct TranslationBlock {
 uintptr_t jmp_list_head;
 uintptr_t jmp_list_next[2];
 uintptr_t jmp_dest[2];
+
+/* Pointer to a struct where statistics from the TB is stored */
+TBStatistics *tb_stats;
 };
 
 extern bool parallel_cpus;
diff --git a/include/exec/tb-context.h b/include/exec/tb-context.h
index feb585e0a7..3cfb62a338 100644
--- a/include/exec/tb-context.h
+++ b/include/exec/tb-context.h
@@ -23,6 +23,17 @@
 #include "qemu/thread.h"
 #include "qemu/qht.h"
 
+/* Page tracking code uses ram addresses 

[Qemu-devel] [PATCH v4 0/7] Measure Tiny Code Generation Quality

2019-07-19 Thread vandersonmr
This patch is part of Google Summer of Code (GSoC) 2019.
More about the project can be found in:
https://wiki.qemu.org/Internships/ProjectIdeas/TCGCodeQuality

The goal of this patch is to add infrastructure to collect
execution and JIT statistics during the emulation with accel/TCG.
The statistics are stored in TBStatistic structures (TBStats)
with each TB having its respective TBStats.

We added -d tb_stats and HMP tb_stats commands to allow the control
of this statistics collection. And info tb, tbs, and coverset commands
were also added to allow dumping and exploring all this information 
while emulating.

Collecting these statistics and information is useful to understand
qemu performance and to help to add the support for traces to QEMU. 

vandersonmr (7):
  accel: introducing TBStatistics structure
  accel: collecting TB execution count
  accel: collecting JIT statistics
  accel: replacing part of CONFIG_PROFILER with TBStats
  log: adding -d tb_stats to control tbstats
  monitor: adding tb_stats hmp command
  monitor: adding info tbs, tb, and coverset

 accel/tcg/Makefile.objs  |   2 +-
 accel/tcg/tb-stats.c | 489 +++
 accel/tcg/tcg-runtime.c  |   7 +
 accel/tcg/tcg-runtime.h  |   2 +
 accel/tcg/translate-all.c|  93 ++-
 accel/tcg/translator.c   |   6 +
 disas.c  | 108 
 hmp-commands-info.hx |  23 ++
 hmp-commands.hx  |  17 ++
 include/disas/disas.h|   1 +
 include/exec/exec-all.h  |  15 +-
 include/exec/gen-icount.h|   9 +
 include/exec/tb-context.h|  12 +
 include/exec/tb-hash.h   |   7 +
 include/exec/tb-stats.h  | 113 
 include/qemu/log-for-trace.h |   2 +
 include/qemu/log.h   |  16 ++
 linux-user/exit.c|   4 +
 monitor/misc.c   | 111 
 tcg/tcg.c| 114 +++-
 tcg/tcg.h|  12 +-
 util/log.c   |  99 ++-
 22 files changed, 1144 insertions(+), 118 deletions(-)
 create mode 100644 accel/tcg/tb-stats.c
 create mode 100644 include/exec/tb-stats.h

-- 
2.22.0




[Qemu-devel] [PATCH v4 3/7] accel: collecting JIT statistics

2019-07-19 Thread vandersonmr
If a TB has a TBS (TBStatistics) with the TB_JIT_STATS
enabled then we collect statistics of its translation
processes and code translation. To collect the number
of host instructions we used a modified version of the
disas function to pass through the whole code without
printing anything (fake_fprintf) but counting the number
of instructions.

Signed-off-by: vandersonmr 
---
 accel/tcg/translate-all.c |  18 +++
 accel/tcg/translator.c|   5 ++
 disas.c   | 108 ++
 include/disas/disas.h |   1 +
 include/exec/tb-stats.h   |  14 +
 include/qemu/log.h|   1 +
 tcg/tcg.c |  23 
 tcg/tcg.h |   2 +
 8 files changed, 172 insertions(+)

diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 7497dae508..3a47ac6f2c 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -1793,6 +1793,10 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
 }
 }
 
+if (flag & TB_JIT_STATS) {
+tb->tb_stats->stats_enabled |= TB_JIT_STATS;
+atomic_inc(>tb_stats->translations.total);
+}
 } else {
 tb->tb_stats = NULL;
 }
@@ -1870,6 +1874,17 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
 atomic_set(>search_out_len, prof->search_out_len + search_size);
 #endif
 
+if (tb_stats_enabled(tb, TB_JIT_STATS)) {
+size_t code_size = gen_code_size;
+if (tcg_ctx->data_gen_ptr) {
+code_size = tcg_ctx->data_gen_ptr - tb->tc.ptr;
+}
+
+atomic_add(>tb_stats->code.num_host_inst,
+get_num_insts(tb->tc.ptr, code_size));
+}
+
+
 #ifdef DEBUG_DISAS
 if (qemu_loglevel_mask(CPU_LOG_TB_OUT_ASM) &&
 qemu_log_in_addr_range(tb->pc)) {
@@ -1927,6 +1942,9 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
 phys_page2 = -1;
 if ((pc & TARGET_PAGE_MASK) != virt_page2) {
 phys_page2 = get_page_addr_code(env, virt_page2);
+if (tb_stats_enabled(tb, TB_JIT_STATS)) {
+atomic_inc(>tb_stats->translations.spanning);
+}
 }
 /*
  * No explicit memory barrier is required -- tb_link_page() makes the
diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index 396a11e828..03c00bdb1b 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -117,6 +117,11 @@ void translator_loop(const TranslatorOps *ops, 
DisasContextBase *db,
 db->tb->size = db->pc_next - db->pc_first;
 db->tb->icount = db->num_insns;
 
+if (tb_stats_enabled(tb, TB_JIT_STATS)) {
+atomic_add(>tb->tb_stats->code.num_guest_inst, db->num_insns);
+}
+
+
 #ifdef DEBUG_DISAS
 if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)
 && qemu_log_in_addr_range(db->pc_first)) {
diff --git a/disas.c b/disas.c
index 3e2bfa572b..5dec754992 100644
--- a/disas.c
+++ b/disas.c
@@ -475,6 +475,114 @@ void target_disas(FILE *out, CPUState *cpu, target_ulong 
code,
 }
 }
 
+static int fprintf_fake(struct _IO_FILE *a, const char *b, ...)
+{
+return 1;
+}
+
+/*
+ * This is a work around to get the number of host instructions with
+ * a small effort. It reuses the disas function with a fake printf to
+ * print nothing but count the number of instructions.
+ *
+ */
+unsigned get_num_insts(void *code, unsigned long size)
+{
+uintptr_t pc;
+int count;
+CPUDebug s;
+int (*print_insn)(bfd_vma pc, disassemble_info *info) = NULL;
+
+INIT_DISASSEMBLE_INFO(s.info, NULL, fprintf_fake);
+s.info.print_address_func = generic_print_host_address;
+
+s.info.buffer = code;
+s.info.buffer_vma = (uintptr_t)code;
+s.info.buffer_length = size;
+s.info.cap_arch = -1;
+s.info.cap_mode = 0;
+s.info.cap_insn_unit = 4;
+s.info.cap_insn_split = 4;
+
+#ifdef HOST_WORDS_BIGENDIAN
+s.info.endian = BFD_ENDIAN_BIG;
+#else
+s.info.endian = BFD_ENDIAN_LITTLE;
+#endif
+#if defined(CONFIG_TCG_INTERPRETER)
+print_insn = print_insn_tci;
+#elif defined(__i386__)
+s.info.mach = bfd_mach_i386_i386;
+print_insn = print_insn_i386;
+s.info.cap_arch = CS_ARCH_X86;
+s.info.cap_mode = CS_MODE_32;
+s.info.cap_insn_unit = 1;
+s.info.cap_insn_split = 8;
+#elif defined(__x86_64__)
+s.info.mach = bfd_mach_x86_64;
+print_insn = print_insn_i386;
+s.info.cap_arch = CS_ARCH_X86;
+s.info.cap_mode = CS_MODE_64;
+s.info.cap_insn_unit = 1;
+s.info.cap_insn_split = 8;
+#elif defined(_ARCH_PPC)
+s.info.disassembler_options = (char *)"any";
+print_insn = print_insn_ppc;
+s.info.cap_arch = CS_ARCH_PPC;
+# ifdef _ARCH_PPC64
+s.info.cap_mode = CS_MODE_64;
+# endif
+#elif defined(__riscv) && defined(CONFIG_RISCV_DIS)
+#if defined(_ILP32) || (__riscv_xlen == 32)
+print_insn = print_insn_riscv32;
+#elif defined(_LP64)
+print_insn = print_insn_riscv64;
+#else
+#error unsupported RISC-V ABI
+#endif
+#elif defined(__aarch64__) && defined(CONFIG_ARM_A64_DIS)
+print_insn 

[Qemu-devel] [PATCH v4 7/7] monitor: adding info tbs, tb, and coverset

2019-07-19 Thread vandersonmr
Adding info [tbs|tb|coverset] commands to HMP.
These commands allow the exploration of TBs
generated by the TCG. Understand which one
hotter, with more guest/host instructions...
and examine their guest, host and IR code.

The goal of this command is to allow the dynamic exploration
of TCG behavior and code quality. Therefore, for now, a
corresponding QMP command is not worthwhile.

Signed-off-by: vandersonmr 
---
 accel/tcg/tb-stats.c | 275 +++
 hmp-commands-info.hx |  23 +++
 include/exec/tb-stats.h  |  37 +
 include/qemu/log-for-trace.h |   2 +
 include/qemu/log.h   |   1 +
 linux-user/exit.c|   4 +
 monitor/misc.c   |  71 +
 util/log.c   |  26 +++-
 8 files changed, 431 insertions(+), 8 deletions(-)

diff --git a/accel/tcg/tb-stats.c b/accel/tcg/tb-stats.c
index 6c330e1b02..85a16cd563 100644
--- a/accel/tcg/tb-stats.c
+++ b/accel/tcg/tb-stats.c
@@ -212,3 +212,278 @@ void do_hmp_tbstats_safe(CPUState *cpu, run_on_cpu_data 
icmd)
 g_free(cmdinfo);
 }
 
+static void collect_tb_stats(void *p, uint32_t hash, void *userp)
+{
+last_search = g_list_prepend(last_search, p);
+}
+
+static void dump_tb_header(TBStatistics *tbs)
+{
+unsigned g = tbs->translations.total ?
+tbs->code.num_guest_inst / tbs->translations.total : 0;
+unsigned ops = tbs->translations.total ?
+tbs->code.num_tcg_ops / tbs->translations.total : 0;
+unsigned ops_opt = tbs->translations.total ?
+tbs->code.num_tcg_ops_opt / tbs->translations.total : 0;
+unsigned h = tbs->translations.total ?
+tbs->code.num_host_inst / tbs->translations.total : 0;
+unsigned spills = tbs->translations.total ?
+tbs->code.spills / tbs->translations.total : 0;
+
+float guest_host_prop = g ? ((float) h / g) : 0;
+
+qemu_log("TB%d: phys:0x"TB_PAGE_ADDR_FMT" virt:0x"TARGET_FMT_lx
+ " flags:%#08x (trans:%lu uncached:%lu exec:%lu ints: g:%u op:%u 
op_opt:%u h:%u h/g:%.2f spills:%d)\n",
+ tbs->display_id,
+ tbs->phys_pc, tbs->pc, tbs->flags,
+ tbs->translations.total, tbs->translations.uncached,
+ tbs->executions.total, g, ops, ops_opt, h, guest_host_prop,
+ spills);
+}
+
+static gint
+inverse_sort_tbs(gconstpointer p1, gconstpointer p2, gpointer psort_by)
+{
+const TBStatistics *tbs1 = (TBStatistics *) p1;
+const TBStatistics *tbs2 = (TBStatistics *) p2;
+int sort_by = *((int *) psort_by);
+unsigned long c1 = 0;
+unsigned long c2 = 0;
+
+if (likely(sort_by == SORT_BY_SPILLS)) {
+c1 = tbs1->code.spills;
+c2 = tbs2->code.spills;
+} else if (likely(sort_by == SORT_BY_HOTNESS)) {
+c1 = tbs1->executions.total;
+c2 = tbs2->executions.total;
+} else if (likely(sort_by == SORT_BY_HG)) {
+if (tbs1->code.num_guest_inst == 0) {
+return -1;
+}
+if (tbs2->code.num_guest_inst == 0) {
+return 1;
+}
+
+float a = (float) tbs1->code.num_host_inst / tbs1->code.num_guest_inst;
+float b = (float) tbs2->code.num_host_inst / tbs2->code.num_guest_inst;
+c1 = a <= b ? 0 : 1;
+c2 = a <= b ? 1 : 0;
+}
+
+return c1 < c2 ? 1 : c1 == c2 ? 0 : -1;
+}
+
+static void do_dump_coverset_info(int percentage)
+{
+uint64_t total_exec_count = 0;
+uint64_t covered_exec_count = 0;
+unsigned coverset_size = 0;
+int id = 1;
+GList *i;
+
+g_list_free(last_search);
+last_search = NULL;
+
+qht_iter(_ctx.tb_stats, collect_tb_stats, NULL);
+
+int sort_by = SORT_BY_HOTNESS;
+last_search = g_list_sort_with_data(last_search, inverse_sort_tbs, 
_by);
+
+if (!last_search) {
+qemu_log("No data collected yet\n");
+return;
+}
+
+/* Compute total execution count for all tbs */
+for (i = last_search; i; i = i->next) {
+TBStatistics *tbs = (TBStatistics *) i->data;
+total_exec_count += tbs->executions.total * tbs->code.num_guest_inst;
+}
+
+for (i = last_search; i; i = i->next) {
+TBStatistics *tbs = (TBStatistics *) i->data;
+covered_exec_count += tbs->executions.total * tbs->code.num_guest_inst;
+tbs->display_id = id++;
+coverset_size++;
+dump_tb_header(tbs);
+
+/* Iterate and display tbs until reach the percentage count cover */
+if (((double) covered_exec_count / total_exec_count) >
+((double) percentage / 100)) {
+break;
+}
+}
+
+qemu_log("\n--\n");
+qemu_log("# of TBs to reach %d%% of the total of guest insts exec: %u\t",
+percentage, coverset_size);
+qemu_log("Total of guest insts exec: %lu\n", total_exec_count);
+qemu_log("\n--\n");
+
+/* free the unused bits */
+if (i) {
+if (i->next) {
+i->next->prev = 

Re: [Qemu-devel] [PATCH v6 00/14] Add support for io_uring

2019-07-19 Thread no-reply
Patchew URL: 
https://patchew.org/QEMU/20190719133530.28688-1-mehta.aar...@gmail.com/



Hi,

This series failed build test on s390x host. Please find the details below.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
# Testing script will be invoked under the git checkout with
# HEAD pointing to a commit that has the patches applied on top of "base"
# branch
set -e

echo
echo "=== ENV ==="
env

echo
echo "=== PACKAGES ==="
rpm -qa

echo
echo "=== UNAME ==="
uname -a

CC=$HOME/bin/cc
INSTALL=$PWD/install
BUILD=$PWD/build
mkdir -p $BUILD $INSTALL
SRC=$PWD
cd $BUILD
$SRC/configure --cc=$CC --prefix=$INSTALL
make -j4
# XXX: we need reliable clean up
# make check -j4 V=1
make install
=== TEST SCRIPT END ===

  CC  nbd/trace.o
In file included from block/trace.c:4:
block/trace.h: In function ‘_nocheck__trace_luring_resubmit_short_read’:
block/trace.h:1704:96: error: stray ‘\’ in program
 1704 | qemu_log("%d@%zu.%06zu:luring_resubmit_short_read " 
"LuringState %p luringcb %p nread "\n",
  | 
   ^
block/trace.h:1704:60: error: expected ‘)’ before ‘n’
 1704 | qemu_log("%d@%zu.%06zu:luring_resubmit_short_read " 
"LuringState %p luringcb %p nread "\n",
  |^
~
  |)
block/trace.h:1704:98: error: missing terminating " character [-Werror]
 1704 | qemu_log("%d@%zu.%06zu:luring_resubmit_short_read " 
"LuringState %p luringcb %p nread "\n",
  | 
 ^
block/trace.h:1704:98: error: missing terminating " character
 1704 | qemu_log("%d@%zu.%06zu:luring_resubmit_short_read " 
"LuringState %p luringcb %p nread "\n",
  | 
 ^~
block/trace.h:1704:20: error: format ‘%d’ expects a matching ‘int’ argument 
[-Werror=format=]
 1704 | qemu_log("%d@%zu.%06zu:luring_resubmit_short_read " 
"LuringState %p luringcb %p nread "\n",
  |   ~^
  ||
  |int
block/trace.h:1704:24: error: format ‘%zu’ expects a matching ‘size_t’ argument 
[-Werror=format=]
 1704 | qemu_log("%d@%zu.%06zu:luring_resubmit_short_read " 
"LuringState %p luringcb %p nread "\n",
  |  ~~^
  ||
  |long unsigned int
block/trace.h:1704:30: error: format ‘%zu’ expects a matching ‘size_t’ argument 
[-Werror=format=]
 1704 | qemu_log("%d@%zu.%06zu:luring_resubmit_short_read " 
"LuringState %p luringcb %p nread "\n",
  |  ^
  |  |
  |  long unsigned int
block/trace.h:1704:18: error: format ‘%p’ expects a matching ‘void *’ argument 
[-Werror=format=]
 1704 | qemu_log("%d@%zu.%06zu:luring_resubmit_short_read " 
"LuringState %p luringcb %p nread "\n",
  |  ^~
block/trace.h:1704:75: note: format string is defined here
---
  | 
 ~^
  | 
  |
  | 
  void *
block/trace.h:1704:18: error: format ‘%p’ expects a matching ‘void *’ argument 
[-Werror=format=]
 1704 | qemu_log("%d@%zu.%06zu:luring_resubmit_short_read " 
"LuringState %p luringcb %p nread "\n",
  |  ^~
block/trace.h:1704:87: note: format string is defined here


The full log is available at
http://patchew.org/logs/20190719133530.28688-1-mehta.aar...@gmail.com/testing.s390x/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

Re: [Qemu-devel] [PATCH v6 00/14] Add support for io_uring

2019-07-19 Thread no-reply
Patchew URL: 
https://patchew.org/QEMU/20190719133530.28688-1-mehta.aar...@gmail.com/



Hi,

This series failed the asan build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
make docker-image-fedora V=1 NETWORK=1
time make docker-test-debug@fedora TARGET_LIST=x86_64-softmmu J=14 NETWORK=1
=== TEST SCRIPT END ===

  CC  hw/arm/trace.o
  CC  hw/audio/trace.o
In file included from block/trace.c:4:
/tmp/qemu-test/build/block/trace.h:1704:96: error: expected ')'
qemu_log("%d@%zu.%06zu:luring_resubmit_short_read " "LuringState %p 
luringcb %p nread "\n",

   ^
/tmp/qemu-test/build/block/trace.h:1704:17: note: to match this '('
qemu_log("%d@%zu.%06zu:luring_resubmit_short_read " "LuringState %p 
luringcb %p nread "\n",
^
/tmp/qemu-test/build/block/trace.h:1704:98: error: missing terminating '"' 
character [-Werror,-Winvalid-pp-token]
qemu_log("%d@%zu.%06zu:luring_resubmit_short_read " "LuringState %p 
luringcb %p nread "\n",

 ^
  CC  hw/block/trace.o


The full log is available at
http://patchew.org/logs/20190719133530.28688-1-mehta.aar...@gmail.com/testing.asan/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

Re: [Qemu-devel] [PATCH 1/1] x86: add CPU flags supported inside libvirt

2019-07-19 Thread Paolo Bonzini
On 20/07/19 00:05, Eduardo Habkost wrote:
>> Actually KVM does not mark it as supported:
>>
>> const u32 kvm_cpuid_7_0_ebx_x86_features =
>> F(FSGSBASE) | F(BMI1) | F(HLE) | F(AVX2) | F(SMEP) |
>> F(BMI2) | F(ERMS) | f_invpcid | F(RTM) | f_mpx | F(RDSEED) |
>> F(ADX) | F(SMAP) | F(AVX512IFMA) | F(AVX512F) | F(AVX512PF) |
>> F(AVX512ER) | F(AVX512CD) | F(CLFLUSHOPT) | F(CLWB) | 
>> F(AVX512DQ) |
>> F(SHA_NI) | F(AVX512BW) | F(AVX512VL) | f_intel_pt;
> We can still add the feature name if we also set it on
> .unmigratable_features, but I don't see why it would be useful.
> Is anybody working to support exposing RDT-M to guests?

I don't think so.

Paolo



Re: [Qemu-devel] [PATCH 1/1] x86: add CPU flags supported inside libvirt

2019-07-19 Thread Eduardo Habkost
On Fri, Jul 19, 2019 at 11:44:57PM +0200, Paolo Bonzini wrote:
> On 19/07/19 22:53, Eduardo Habkost wrote:
> > This is one is named "cqm" on Linux (X86_FEATURE_CQM).  I prefer
> > to keep consistency with the name already in use by Linux than
> > the one in libvirt that was never used.
> > 
> > You can still add a "cmt" alias property if you think it would be
> > useful.
> 
> Actually KVM does not mark it as supported:
> 
> const u32 kvm_cpuid_7_0_ebx_x86_features =
> F(FSGSBASE) | F(BMI1) | F(HLE) | F(AVX2) | F(SMEP) |
> F(BMI2) | F(ERMS) | f_invpcid | F(RTM) | f_mpx | F(RDSEED) |
> F(ADX) | F(SMAP) | F(AVX512IFMA) | F(AVX512F) | F(AVX512PF) |
> F(AVX512ER) | F(AVX512CD) | F(CLFLUSHOPT) | F(CLWB) | 
> F(AVX512DQ) |
> F(SHA_NI) | F(AVX512BW) | F(AVX512VL) | f_intel_pt;

We can still add the feature name if we also set it on
.unmigratable_features, but I don't see why it would be useful.
Is anybody working to support exposing RDT-M to guests?

-- 
Eduardo



[Qemu-devel] [PATCH for-4.2 24/24] target/arm: Enable ARMv8.1-VHE in -cpu max

2019-07-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/cpu64.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/target/arm/cpu64.c b/target/arm/cpu64.c
index 1901997a06..b1bb394c6d 100644
--- a/target/arm/cpu64.c
+++ b/target/arm/cpu64.c
@@ -337,6 +337,7 @@ static void aarch64_max_initfn(Object *obj)
 t = cpu->isar.id_aa64mmfr1;
 t = FIELD_DP64(t, ID_AA64MMFR1, HPDS, 1); /* HPD */
 t = FIELD_DP64(t, ID_AA64MMFR1, LO, 1);
+t = FIELD_DP64(t, ID_AA64MMFR1, VH, 1);
 cpu->isar.id_aa64mmfr1 = t;
 
 /* Replicate the same data to the 32-bit id registers.  */
-- 
2.17.1




[Qemu-devel] [PATCH for-4.2 23/24] target/arm: Update {fp, sve}_exception_el for VHE

2019-07-19 Thread Richard Henderson
When TGE+E2H are both set, CPACR_EL1 is ignored.

Signed-off-by: Richard Henderson 
---
 target/arm/helper.c | 53 -
 1 file changed, 28 insertions(+), 25 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index ae3ec9ea67..bbe36eb3a9 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -5501,7 +5501,9 @@ static const ARMCPRegInfo debug_lpae_cp_reginfo[] = {
 int sve_exception_el(CPUARMState *env, int el)
 {
 #ifndef CONFIG_USER_ONLY
-if (el <= 1) {
+uint64_t hcr_el2 = arm_hcr_el2_eff(env);
+
+if (el <= 1 && (hcr_el2 & (HCR_E2H | HCR_TGE)) != (HCR_E2H | HCR_TGE)) {
 bool disabled = false;
 
 /* The CPACR.ZEN controls traps to EL1:
@@ -5516,8 +5518,7 @@ int sve_exception_el(CPUARMState *env, int el)
 }
 if (disabled) {
 /* route_to_el2 */
-return (arm_feature(env, ARM_FEATURE_EL2)
-&& (arm_hcr_el2_eff(env) & HCR_TGE) ? 2 : 1);
+return hcr_el2 & HCR_TGE ? 2 : 1;
 }
 
 /* Check CPACR.FPEN.  */
@@ -11212,8 +11213,6 @@ uint32_t HELPER(crc32c)(uint32_t acc, uint32_t val, 
uint32_t bytes)
 int fp_exception_el(CPUARMState *env, int cur_el)
 {
 #ifndef CONFIG_USER_ONLY
-int fpen;
-
 /* CPACR and the CPTR registers don't exist before v6, so FP is
  * always accessible
  */
@@ -11241,30 +11240,34 @@ int fp_exception_el(CPUARMState *env, int cur_el)
  * 0, 2 : trap EL0 and EL1/PL1 accesses
  * 1: trap only EL0 accesses
  * 3: trap no accesses
+ * This register is ignored if E2H+TGE are both set.
  */
-fpen = extract32(env->cp15.cpacr_el1, 20, 2);
-switch (fpen) {
-case 0:
-case 2:
-if (cur_el == 0 || cur_el == 1) {
-/* Trap to PL1, which might be EL1 or EL3 */
-if (arm_is_secure(env) && !arm_el_is_aa64(env, 3)) {
+if ((arm_hcr_el2_eff(env) & (HCR_E2H | HCR_TGE)) != (HCR_E2H | HCR_TGE)) {
+int fpen = extract32(env->cp15.cpacr_el1, 20, 2);
+
+switch (fpen) {
+case 0:
+case 2:
+if (cur_el == 0 || cur_el == 1) {
+/* Trap to PL1, which might be EL1 or EL3 */
+if (arm_is_secure(env) && !arm_el_is_aa64(env, 3)) {
+return 3;
+}
+return 1;
+}
+if (cur_el == 3 && !is_a64(env)) {
+/* Secure PL1 running at EL3 */
 return 3;
 }
-return 1;
+break;
+case 1:
+if (cur_el == 0) {
+return 1;
+}
+break;
+case 3:
+break;
 }
-if (cur_el == 3 && !is_a64(env)) {
-/* Secure PL1 running at EL3 */
-return 3;
-}
-break;
-case 1:
-if (cur_el == 0) {
-return 1;
-}
-break;
-case 3:
-break;
 }
 
 /*
-- 
2.17.1




Re: [Qemu-devel] [PATCH 1/1] x86: add CPU flags supported inside libvirt

2019-07-19 Thread Paolo Bonzini
On 19/07/19 22:53, Eduardo Habkost wrote:
> This is one is named "cqm" on Linux (X86_FEATURE_CQM).  I prefer
> to keep consistency with the name already in use by Linux than
> the one in libvirt that was never used.
> 
> You can still add a "cmt" alias property if you think it would be
> useful.

Actually KVM does not mark it as supported:

const u32 kvm_cpuid_7_0_ebx_x86_features =
F(FSGSBASE) | F(BMI1) | F(HLE) | F(AVX2) | F(SMEP) |
F(BMI2) | F(ERMS) | f_invpcid | F(RTM) | f_mpx | F(RDSEED) |
F(ADX) | F(SMAP) | F(AVX512IFMA) | F(AVX512F) | F(AVX512PF) |
F(AVX512ER) | F(AVX512CD) | F(CLFLUSHOPT) | F(CLWB) | 
F(AVX512DQ) |
F(SHA_NI) | F(AVX512BW) | F(AVX512VL) | f_intel_pt;

Paolo



[Qemu-devel] [PATCH for-4.2 13/24] target/arm: Split out vae1_tlbmask, vmalle1_tlbmask

2019-07-19 Thread Richard Henderson
No functional change, but unify code sequences.

Signed-off-by: Richard Henderson 
---
 target/arm/helper.c | 118 ++--
 1 file changed, 37 insertions(+), 81 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 9a9809ff4f..7adbf51479 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -3901,70 +3901,61 @@ static CPAccessResult aa64_cacheop_access(CPUARMState 
*env,
  * Page D4-1736 (DDI0487A.b)
  */
 
+static int vae1_tlbmask(CPUARMState *env)
+{
+if (arm_is_secure_below_el3(env)) {
+return ARMMMUIdxBit_S1SE1 | ARMMMUIdxBit_S1SE0;
+} else {
+return ARMMMUIdxBit_S12NSE1 | ARMMMUIdxBit_S12NSE0;
+}
+}
+
 static void tlbi_aa64_vmalle1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
   uint64_t value)
 {
 CPUState *cs = env_cpu(env);
-bool sec = arm_is_secure_below_el3(env);
+int mask = vae1_tlbmask(env);
 
-if (sec) {
-tlb_flush_by_mmuidx_all_cpus_synced(cs,
-ARMMMUIdxBit_S1SE1 |
-ARMMMUIdxBit_S1SE0);
-} else {
-tlb_flush_by_mmuidx_all_cpus_synced(cs,
-ARMMMUIdxBit_S12NSE1 |
-ARMMMUIdxBit_S12NSE0);
-}
+tlb_flush_by_mmuidx_all_cpus_synced(cs, mask);
 }
 
 static void tlbi_aa64_vmalle1_write(CPUARMState *env, const ARMCPRegInfo *ri,
 uint64_t value)
 {
 CPUState *cs = env_cpu(env);
+int mask = vae1_tlbmask(env);
 
 if (tlb_force_broadcast(env)) {
 tlbi_aa64_vmalle1is_write(env, NULL, value);
 return;
 }
 
+tlb_flush_by_mmuidx(cs, mask);
+}
+
+static int vmalle1_tlbmask(CPUARMState *env)
+{
+/*
+ * Note that the 'ALL' scope must invalidate both stage 1 and
+ * stage 2 translations, whereas most other scopes only invalidate
+ * stage 1 translations.
+ */
 if (arm_is_secure_below_el3(env)) {
-tlb_flush_by_mmuidx(cs,
-ARMMMUIdxBit_S1SE1 |
-ARMMMUIdxBit_S1SE0);
+return ARMMMUIdxBit_S1SE1 | ARMMMUIdxBit_S1SE0;
+} else if (arm_feature(env, ARM_FEATURE_EL2)) {
+return ARMMMUIdxBit_S12NSE1 | ARMMMUIdxBit_S12NSE0 | ARMMMUIdxBit_S2NS;
 } else {
-tlb_flush_by_mmuidx(cs,
-ARMMMUIdxBit_S12NSE1 |
-ARMMMUIdxBit_S12NSE0);
+return ARMMMUIdxBit_S12NSE1 | ARMMMUIdxBit_S12NSE0;
 }
 }
 
 static void tlbi_aa64_alle1_write(CPUARMState *env, const ARMCPRegInfo *ri,
   uint64_t value)
 {
-/* Note that the 'ALL' scope must invalidate both stage 1 and
- * stage 2 translations, whereas most other scopes only invalidate
- * stage 1 translations.
- */
-ARMCPU *cpu = env_archcpu(env);
-CPUState *cs = CPU(cpu);
+CPUState *cs = env_cpu(env);
+int mask = vmalle1_tlbmask(env);
 
-if (arm_is_secure_below_el3(env)) {
-tlb_flush_by_mmuidx(cs,
-ARMMMUIdxBit_S1SE1 |
-ARMMMUIdxBit_S1SE0);
-} else {
-if (arm_feature(env, ARM_FEATURE_EL2)) {
-tlb_flush_by_mmuidx(cs,
-ARMMMUIdxBit_S12NSE1 |
-ARMMMUIdxBit_S12NSE0 |
-ARMMMUIdxBit_S2NS);
-} else {
-tlb_flush_by_mmuidx(cs,
-ARMMMUIdxBit_S12NSE1 |
-ARMMMUIdxBit_S12NSE0);
-}
-}
+tlb_flush_by_mmuidx(cs, mask);
 }
 
 static void tlbi_aa64_alle2_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -3988,28 +3979,10 @@ static void tlbi_aa64_alle3_write(CPUARMState *env, 
const ARMCPRegInfo *ri,
 static void tlbi_aa64_alle1is_write(CPUARMState *env, const ARMCPRegInfo *ri,
 uint64_t value)
 {
-/* Note that the 'ALL' scope must invalidate both stage 1 and
- * stage 2 translations, whereas most other scopes only invalidate
- * stage 1 translations.
- */
 CPUState *cs = env_cpu(env);
-bool sec = arm_is_secure_below_el3(env);
-bool has_el2 = arm_feature(env, ARM_FEATURE_EL2);
+int mask = vmalle1_tlbmask(env);
 
-if (sec) {
-tlb_flush_by_mmuidx_all_cpus_synced(cs,
-ARMMMUIdxBit_S1SE1 |
-ARMMMUIdxBit_S1SE0);
-} else if (has_el2) {
-tlb_flush_by_mmuidx_all_cpus_synced(cs,
-ARMMMUIdxBit_S12NSE1 |
-ARMMMUIdxBit_S12NSE0 |
-ARMMMUIdxBit_S2NS);
-} else {
-  tlb_flush_by_mmuidx_all_cpus_synced(cs,
-  

[Qemu-devel] [PATCH for-4.2 20/24] target/arm: Flush tlbs for E2&0 translation regime

2019-07-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/helper.c | 34 ++
 1 file changed, 26 insertions(+), 8 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 22eb056b27..fe022f51d6 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -3915,8 +3915,11 @@ static CPAccessResult aa64_cacheop_access(CPUARMState 
*env,
 
 static int vae1_tlbmask(CPUARMState *env)
 {
+/* Since we exclude secure first, we may read HCR_EL2 directly. */
 if (arm_is_secure_below_el3(env)) {
 return ARMMMUIdxBit_SE1 | ARMMMUIdxBit_SE0;
+} else if (env->cp15.hcr_el2 & HCR_E2H) {
+return ARMMMUIdxBit_E2 | ARMMMUIdxBit_E0;
 } else {
 return ARMMMUIdxBit_E1 | ARMMMUIdxBit_E0;
 }
@@ -3954,7 +3957,12 @@ static int vmalle1_tlbmask(CPUARMState *env)
 if (arm_is_secure_below_el3(env)) {
 return ARMMMUIdxBit_SE1 | ARMMMUIdxBit_SE0;
 } else if (arm_feature(env, ARM_FEATURE_EL2)) {
-return ARMMMUIdxBit_E1 | ARMMMUIdxBit_E0 | ARMMMUIdxBit_Stage2;
+/* Since we exclude secure first, we may read HCR_EL2 directly. */
+if (env->cp15.hcr_el2 & HCR_E2H) {
+return ARMMMUIdxBit_E2 | ARMMMUIdxBit_E0;
+} else {
+return ARMMMUIdxBit_E1 | ARMMMUIdxBit_E0 | ARMMMUIdxBit_Stage2;
+}
 } else {
 return ARMMMUIdxBit_E1 | ARMMMUIdxBit_E0;
 }
@@ -3969,13 +3977,22 @@ static void tlbi_aa64_alle1_write(CPUARMState *env, 
const ARMCPRegInfo *ri,
 tlb_flush_by_mmuidx(cs, mask);
 }
 
+static int vae2_tlbmask(CPUARMState *env)
+{
+if (arm_hcr_el2_eff(env) & HCR_E2H) {
+return ARMMMUIdxBit_E0 | ARMMMUIdxBit_E2;
+} else {
+return ARMMMUIdxBit_E2;
+}
+}
+
 static void tlbi_aa64_alle2_write(CPUARMState *env, const ARMCPRegInfo *ri,
   uint64_t value)
 {
-ARMCPU *cpu = env_archcpu(env);
-CPUState *cs = CPU(cpu);
+CPUState *cs = env_cpu(env);
+int mask = vae2_tlbmask(env);
 
-tlb_flush_by_mmuidx(cs, ARMMMUIdxBit_E2);
+tlb_flush_by_mmuidx(cs, mask);
 }
 
 static void tlbi_aa64_alle3_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -4000,8 +4017,9 @@ static void tlbi_aa64_alle2is_write(CPUARMState *env, 
const ARMCPRegInfo *ri,
 uint64_t value)
 {
 CPUState *cs = env_cpu(env);
+int mask = vae2_tlbmask(env);
 
-tlb_flush_by_mmuidx_all_cpus_synced(cs, ARMMMUIdxBit_E2);
+tlb_flush_by_mmuidx_all_cpus_synced(cs, mask);
 }
 
 static void tlbi_aa64_alle3is_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -4019,11 +4037,11 @@ static void tlbi_aa64_vae2_write(CPUARMState *env, 
const ARMCPRegInfo *ri,
  * Currently handles both VAE2 and VALE2, since we don't support
  * flush-last-level-only.
  */
-ARMCPU *cpu = env_archcpu(env);
-CPUState *cs = CPU(cpu);
+CPUState *cs = env_cpu(env);
+int mask = vae2_tlbmask(env);
 uint64_t pageaddr = sextract64(value << 12, 0, 56);
 
-tlb_flush_page_by_mmuidx(cs, pageaddr, ARMMMUIdxBit_E2);
+tlb_flush_page_by_mmuidx(cs, pageaddr, mask);
 }
 
 static void tlbi_aa64_vae3_write(CPUARMState *env, const ARMCPRegInfo *ri,
-- 
2.17.1




Re: [Qemu-devel] [PATCH v4 20/22] target/arm: Create a TLB entry for tag physical address space

2019-07-19 Thread Richard Henderson
On 7/19/19 8:48 AM, Peter Maydell wrote:
> Playing around with this series, I have discovered that if
> the board model doesn't create the tag-memory then target/arm/cpu.c
> will not create the 'cpu-tag-memory' AddressSpace. But nothing
> disables the usage of the target_tlb_bit2, and then when
> arm_cpu_tlb_fill() does a tlb_set_page_with_attrs() using
> an attrs with target_tlb_bit2 set then we assert in
> cpu_asidx_from_attrs() because cpu->num_ases is 2 and
> cc->asidx_from_attrs() returned an out of range number (ie 2).

Oops.

> Is the tag-memory mandatory for MTE? If so we should either
> disable MTE if no tag-memory is provided, or else fail
> realize of the CPU; not sure which. If it's not mandatory
> then we need to avoid asserting :-)

I'm not sure.  I'll need to study the docs again.

There is an MTE support level at which some of the EL0 bits are recognized but
no tags are supported: ID_AA64PFR0_EL1.MTE == 1.  But that's not quite the same
as what you're asking.


r~



[Qemu-devel] [PATCH for-4.2 06/24] target/arm: Define isar_feature_aa64_vh

2019-07-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/cpu.h | 5 +
 1 file changed, 5 insertions(+)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 94c990cddb..e6a76d14c6 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -3573,6 +3573,11 @@ static inline bool isar_feature_aa64_sve(const 
ARMISARegisters *id)
 return FIELD_EX64(id->id_aa64pfr0, ID_AA64PFR0, SVE) != 0;
 }
 
+static inline bool isar_feature_aa64_vh(const ARMISARegisters *id)
+{
+return FIELD_EX64(id->id_aa64mmfr1, ID_AA64MMFR1, VH) != 0;
+}
+
 static inline bool isar_feature_aa64_lor(const ARMISARegisters *id)
 {
 return FIELD_EX64(id->id_aa64mmfr1, ID_AA64MMFR1, LO) != 0;
-- 
2.17.1




[Qemu-devel] [PATCH for-4.2 16/24] target/arm: Add regime_has_2_ranges

2019-07-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/internals.h | 16 
 target/arm/helper.c| 22 +-
 target/arm/translate-a64.c |  3 +--
 3 files changed, 22 insertions(+), 19 deletions(-)

diff --git a/target/arm/internals.h b/target/arm/internals.h
index 047cbfd1d7..f653e0e7f5 100644
--- a/target/arm/internals.h
+++ b/target/arm/internals.h
@@ -824,6 +824,22 @@ static inline void arm_call_el_change_hook(ARMCPU *cpu)
 }
 }
 
+/* Return true if this address translation regime has two ranges.  */
+static inline bool regime_has_2_ranges(ARMMMUIdx mmu_idx)
+{
+switch (mmu_idx) {
+case ARMMMUIdx_Stage1_E0:
+case ARMMMUIdx_Stage1_E1:
+case ARMMMUIdx_EL10_0:
+case ARMMMUIdx_EL10_1:
+case ARMMMUIdx_EL20_0:
+case ARMMMUIdx_EL20_2:
+return true;
+default:
+return false;
+}
+}
+
 /* Return true if this address translation regime is secure */
 static inline bool regime_is_secure(CPUARMState *env, ARMMMUIdx mmu_idx)
 {
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 240a6ed168..2d5658f9e3 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -8999,15 +8999,8 @@ static int get_S1prot(CPUARMState *env, ARMMMUIdx 
mmu_idx, bool is_aa64,
 }
 
 if (is_aa64) {
-switch (regime_el(env, mmu_idx)) {
-case 1:
-if (!is_user) {
-xn = pxn || (user_rw & PAGE_WRITE);
-}
-break;
-case 2:
-case 3:
-break;
+if (regime_has_2_ranges(mmu_idx) && !is_user) {
+xn = pxn || (user_rw & PAGE_WRITE);
 }
 } else if (arm_feature(env, ARM_FEATURE_V7)) {
 switch (regime_el(env, mmu_idx)) {
@@ -9541,7 +9534,6 @@ ARMVAParameters aa64_va_parameters_both(CPUARMState *env, 
uint64_t va,
 ARMMMUIdx mmu_idx)
 {
 uint64_t tcr = regime_tcr(env, mmu_idx)->raw_tcr;
-uint32_t el = regime_el(env, mmu_idx);
 bool tbi, tbid, epd, hpd, using16k, using64k;
 int select, tsz;
 
@@ -9551,7 +9543,7 @@ ARMVAParameters aa64_va_parameters_both(CPUARMState *env, 
uint64_t va,
  */
 select = extract64(va, 55, 1);
 
-if (el > 1) {
+if (!regime_has_2_ranges(mmu_idx)) {
 tsz = extract32(tcr, 0, 6);
 using64k = extract32(tcr, 14, 1);
 using16k = extract32(tcr, 15, 1);
@@ -9707,10 +9699,7 @@ static bool get_phys_addr_lpae(CPUARMState *env, 
target_ulong address,
 param = aa64_va_parameters(env, address, mmu_idx,
access_type != MMU_INST_FETCH);
 level = 0;
-/* If we are in 64-bit EL2 or EL3 then there is no TTBR1, so mark it
- * invalid.
- */
-ttbr1_valid = (el < 2);
+ttbr1_valid = regime_has_2_ranges(mmu_idx);
 addrsize = 64 - 8 * param.tbi;
 inputsize = 64 - param.tsz;
 } else {
@@ -11361,8 +11350,7 @@ void cpu_get_tb_cpu_state(CPUARMState *env, 
target_ulong *pc,
 ARMVAParameters p0 = aa64_va_parameters_both(env, 0, stage1);
 int tbii, tbid;
 
-/* FIXME: ARMv8.1-VHE S2 translation regime.  */
-if (regime_el(env, stage1) < 2) {
+if (regime_has_2_ranges(mmu_idx)) {
 ARMVAParameters p1 = aa64_va_parameters_both(env, -1, stage1);
 tbid = (p1.tbi << 1) | p0.tbi;
 tbii = tbid & ~((p1.tbid << 1) | p0.tbid);
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index a9e65fe347..5679349310 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -176,8 +176,7 @@ static void gen_top_byte_ignore(DisasContext *s, TCGv_i64 
dst,
 if (tbi == 0) {
 /* Load unmodified address */
 tcg_gen_mov_i64(dst, src);
-} else if (s->current_el >= 2) {
-/* FIXME: ARMv8.1-VHE S2 translation regime.  */
+} else if (!regime_has_2_ranges(s->mmu_idx)) {
 /* Force tag byte to all zero */
 tcg_gen_extract_i64(dst, src, 0, 56);
 } else {
-- 
2.17.1




[Qemu-devel] [PATCH for-4.2 08/24] target/arm: Add CONTEXTIDR_EL2

2019-07-19 Thread Richard Henderson
Not all of the breakpoint types are supported, but those that
only examine contextidr are extended to support the new register.

Signed-off-by: Richard Henderson 
---
 target/arm/debug_helper.c | 50 +--
 target/arm/helper.c   | 11 +
 2 files changed, 49 insertions(+), 12 deletions(-)

diff --git a/target/arm/debug_helper.c b/target/arm/debug_helper.c
index dde80273ff..2e3e90c6a5 100644
--- a/target/arm/debug_helper.c
+++ b/target/arm/debug_helper.c
@@ -20,6 +20,7 @@ static bool linked_bp_matches(ARMCPU *cpu, int lbn)
 int ctx_cmps = extract32(cpu->dbgdidr, 20, 4);
 int bt;
 uint32_t contextidr;
+uint64_t hcr_el2;
 
 /*
  * Links to unimplemented or non-context aware breakpoints are
@@ -40,24 +41,44 @@ static bool linked_bp_matches(ARMCPU *cpu, int lbn)
 }
 
 bt = extract64(bcr, 20, 4);
-
-/*
- * We match the whole register even if this is AArch32 using the
- * short descriptor format (in which case it holds both PROCID and ASID),
- * since we don't implement the optional v7 context ID masking.
- */
-contextidr = extract64(env->cp15.contextidr_el[1], 0, 32);
+hcr_el2 = arm_hcr_el2_eff(env);
 
 switch (bt) {
 case 3: /* linked context ID match */
-if (arm_current_el(env) > 1) {
-/* Context matches never fire in EL2 or (AArch64) EL3 */
+switch (arm_current_el(env)) {
+default:
+/* Context matches never fire in AArch64 EL3 */
 return false;
+case 2:
+if (!(hcr_el2 & HCR_E2H)) {
+/* Context matches never fire in EL2 without E2H enabled. */
+return false;
+}
+contextidr = env->cp15.contextidr_el[2];
+break;
+case 1:
+contextidr = env->cp15.contextidr_el[1];
+break;
+case 0:
+if ((hcr_el2 & (HCR_E2H | HCR_TGE)) == (HCR_E2H | HCR_TGE)) {
+contextidr = env->cp15.contextidr_el[2];
+} else {
+contextidr = env->cp15.contextidr_el[1];
+}
+break;
 }
-return (contextidr == extract64(env->cp15.dbgbvr[lbn], 0, 32));
-case 5: /* linked address mismatch (reserved in AArch64) */
+break;
+
+case 7:  /* linked contextidr_el1 match */
+contextidr = env->cp15.contextidr_el[1];
+break;
+case 13: /* linked contextidr_el2 match */
+contextidr = env->cp15.contextidr_el[2];
+break;
+
 case 9: /* linked VMID match (reserved if no EL2) */
 case 11: /* linked context ID and VMID match (reserved if no EL2) */
+case 15: /* linked full context ID match */
 default:
 /*
  * Links to Unlinked context breakpoints must generate no
@@ -66,7 +87,12 @@ static bool linked_bp_matches(ARMCPU *cpu, int lbn)
 return false;
 }
 
-return false;
+/*
+ * We match the whole register even if this is AArch32 using the
+ * short descriptor format (in which case it holds both PROCID and ASID),
+ * since we don't implement the optional v7 context ID masking.
+ */
+return contextidr == (uint32_t)env->cp15.dbgbvr[lbn];
 }
 
 static bool bp_wp_matches(ARMCPU *cpu, int n, bool is_wp)
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 0a55096770..d1bf31ab74 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -6801,6 +6801,17 @@ void register_cp_regs_for_features(ARMCPU *cpu)
 define_arm_cp_regs(cpu, lor_reginfo);
 }
 
+if (arm_feature(env, ARM_FEATURE_EL2) && cpu_isar_feature(aa64_vh, cpu)) {
+static const ARMCPRegInfo vhe_reginfo[] = {
+{ .name = "CONTEXTIDR_EL2", .state = ARM_CP_STATE_AA64,
+  .opc0 = 3, .opc1 = 4, .crn = 13, .crm = 0, .opc2 = 1,
+  .access = PL2_RW,
+  .fieldoffset = offsetof(CPUARMState, cp15.contextidr_el[2]) },
+REGINFO_SENTINEL
+};
+define_arm_cp_regs(cpu, vhe_reginfo);
+}
+
 if (cpu_isar_feature(aa64_sve, cpu)) {
 define_one_arm_cp_reg(cpu, _el1_reginfo);
 if (arm_feature(env, ARM_FEATURE_EL2)) {
-- 
2.17.1




[Qemu-devel] [PATCH for-4.2 12/24] target/arm: Add VHE system register redirection and aliasing

2019-07-19 Thread Richard Henderson
Several of the EL1/0 registers are redirected to the EL2 version when in
EL2 and HCR_EL2.E2H is set.  Many of these registers have side effects.
Link together the two ARMCPRegInfo structures after they have been
properly instantiated.  Install common dispatch routines to all of the
relevant registers.

The same set of registers that are redirected also have additional
EL12/EL02 aliases created to access the original register that was
redirected.

Signed-off-by: Richard Henderson 
---
 target/arm/cpu.h|  44 
 target/arm/helper.c | 171 
 2 files changed, 202 insertions(+), 13 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index bba4e1f984..a0f10b60eb 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -2455,19 +2455,6 @@ struct ARMCPRegInfo {
  */
 ptrdiff_t fieldoffset; /* offsetof(CPUARMState, field) */
 
-/* Offsets of the secure and non-secure fields in CPUARMState for the
- * register if it is banked.  These fields are only used during the static
- * registration of a register.  During hashing the bank associated
- * with a given security state is copied to fieldoffset which is used from
- * there on out.
- *
- * It is expected that register definitions use either fieldoffset or
- * bank_fieldoffsets in the definition but not both.  It is also expected
- * that both bank offsets are set when defining a banked register.  This
- * use indicates that a register is banked.
- */
-ptrdiff_t bank_fieldoffsets[2];
-
 /* Function for making any access checks for this register in addition to
  * those specified by the 'access' permissions bits. If NULL, no extra
  * checks required. The access check is performed at runtime, not at
@@ -2502,6 +2489,37 @@ struct ARMCPRegInfo {
  * fieldoffset is 0 then no reset will be done.
  */
 CPResetFn *resetfn;
+
+union {
+/*
+ * Offsets of the secure and non-secure fields in CPUARMState for
+ * the register if it is banked.  These fields are only used during
+ * the static registration of a register.  During hashing the bank
+ * associated with a given security state is copied to fieldoffset
+ * which is used from there on out.
+ *
+ * It is expected that register definitions use either fieldoffset
+ * or bank_fieldoffsets in the definition but not both.  It is also
+ * expected that both bank offsets are set when defining a banked
+ * register.  This use indicates that a register is banked.
+ */
+ptrdiff_t bank_fieldoffsets[2];
+
+/*
+ * "Original" writefn and readfn.
+ * For ARMv8.1-VHE register aliases, we overwrite the read/write
+ * accessor functions of various EL1/EL0 to perform the runtime
+ * check for which sysreg should actually be modified, and then
+ * forwards the operation.  Before overwriting the accessors,
+ * the original function is copied here, so that accesses that
+ * really do go to the EL1/EL0 version proceed normally.
+ * (The corresponding EL2 register is linked via opaque.)
+ */
+struct {
+CPReadFn *orig_readfn;
+CPWriteFn *orig_writefn;
+};
+};
 };
 
 /* Macros which are lvalues for the field in CPUARMState for the
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 329548e45d..9a9809ff4f 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -5228,6 +5228,169 @@ static const ARMCPRegInfo el3_cp_reginfo[] = {
 REGINFO_SENTINEL
 };
 
+/* Test if system register redirection is to occur in the current state.  */
+static bool redirect_for_e2h(CPUARMState *env)
+{
+return arm_current_el(env) == 2 && (arm_hcr_el2_eff(env) & HCR_E2H);
+}
+
+static uint64_t el2_e2h_read(CPUARMState *env, const ARMCPRegInfo *ri)
+{
+CPReadFn *readfn;
+
+if (redirect_for_e2h(env)) {
+/* Switch to the saved EL2 version of the register.  */
+ri = ri->opaque;
+readfn = ri->readfn;
+} else {
+readfn = ri->orig_readfn;
+}
+if (readfn == NULL) {
+readfn = raw_read;
+}
+return readfn(env, ri);
+}
+
+static void el2_e2h_write(CPUARMState *env, const ARMCPRegInfo *ri,
+  uint64_t value)
+{
+CPWriteFn *writefn;
+
+if (redirect_for_e2h(env)) {
+/* Switch to the saved EL2 version of the register.  */
+ri = ri->opaque;
+writefn = ri->writefn;
+} else {
+writefn = ri->orig_writefn;
+}
+if (writefn == NULL) {
+writefn = raw_write;
+}
+writefn(env, ri, value);
+}
+
+static void define_arm_vh_e2h_redirects_aliases(ARMCPU *cpu)
+{
+struct E2HAlias {
+uint32_t src_key, dst_key, new_key;
+const char *src_name, *dst_name, *new_name;
+bool (*feature)(const ARMISARegisters *id);
+};
+
+#define 

[Qemu-devel] [PATCH for-4.2 09/24] target/arm: Add TTBR1_EL2

2019-07-19 Thread Richard Henderson
At the same time, add writefn to TTBR0_EL2 and TCR_EL2.

Signed-off-by: Richard Henderson 
---
 target/arm/helper.c | 21 -
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index d1bf31ab74..da2e0627b2 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -3449,6 +3449,15 @@ static void vmsa_ttbr_el1_write(CPUARMState *env, const 
ARMCPRegInfo *ri,
 }
 }
 
+static void vmsa_tcr_ttbr_el2_write(CPUARMState *env, const ARMCPRegInfo *ri,
+uint64_t value)
+{
+raw_write(env, ri, value);
+if (arm_hcr_el2_eff(env) & HCR_E2H) {
+/* The ASID field is active.  */
+}
+}
+
 static void vttbr_write(CPUARMState *env, const ARMCPRegInfo *ri,
 uint64_t value)
 {
@@ -4844,10 +4853,8 @@ static const ARMCPRegInfo el2_cp_reginfo[] = {
   .resetvalue = 0 },
 { .name = "TCR_EL2", .state = ARM_CP_STATE_BOTH,
   .opc0 = 3, .opc1 = 4, .crn = 2, .crm = 0, .opc2 = 2,
-  .access = PL2_RW,
-  /* no .writefn needed as this can't cause an ASID change;
-   * no .raw_writefn or .resetfn needed as we never use mask/base_mask
-   */
+  .access = PL2_RW, .writefn = vmsa_tcr_ttbr_el2_write,
+  /* no .raw_writefn or .resetfn needed as we never use mask/base_mask */
   .fieldoffset = offsetof(CPUARMState, cp15.tcr_el[2]) },
 { .name = "VTCR", .state = ARM_CP_STATE_AA32,
   .cp = 15, .opc1 = 4, .crn = 2, .crm = 1, .opc2 = 2,
@@ -4881,7 +4888,7 @@ static const ARMCPRegInfo el2_cp_reginfo[] = {
   .fieldoffset = offsetof(CPUARMState, cp15.tpidr_el[2]) },
 { .name = "TTBR0_EL2", .state = ARM_CP_STATE_AA64,
   .opc0 = 3, .opc1 = 4, .crn = 2, .crm = 0, .opc2 = 0,
-  .access = PL2_RW, .resetvalue = 0,
+  .access = PL2_RW, .resetvalue = 0, .writefn = vmsa_tcr_ttbr_el2_write,
   .fieldoffset = offsetof(CPUARMState, cp15.ttbr0_el[2]) },
 { .name = "HTTBR", .cp = 15, .opc1 = 4, .crm = 2,
   .access = PL2_RW, .type = ARM_CP_64BIT | ARM_CP_ALIAS,
@@ -6807,6 +6814,10 @@ void register_cp_regs_for_features(ARMCPU *cpu)
   .opc0 = 3, .opc1 = 4, .crn = 13, .crm = 0, .opc2 = 1,
   .access = PL2_RW,
   .fieldoffset = offsetof(CPUARMState, cp15.contextidr_el[2]) },
+{ .name = "TTBR1_EL2", .state = ARM_CP_STATE_AA64,
+  .opc0 = 3, .opc1 = 4, .crn = 2, .crm = 0, .opc2 = 1,
+  .access = PL2_RW, .writefn = vmsa_tcr_ttbr_el2_write,
+  .fieldoffset = offsetof(CPUARMState, cp15.ttbr1_el[2]) },
 REGINFO_SENTINEL
 };
 define_arm_cp_regs(cpu, vhe_reginfo);
-- 
2.17.1




[Qemu-devel] [PATCH for-4.2 15/24] target/arm: Reorganize ARMMMUIdx

2019-07-19 Thread Richard Henderson
Prepare for, but do not yet implement, the EL2&0 regime and the
Secure EL2 regime.  Rename all of the a-profile symbols to make
the distictions clearer.

Signed-off-by: Richard Henderson 
---
 target/arm/cpu.h   | 180 ++--
 target/arm/internals.h |  46 ---
 target/arm/translate.h |   2 +-
 target/arm/helper.c| 237 ++---
 target/arm/m_helper.c  |   6 +-
 target/arm/translate-a64.c |  11 +-
 target/arm/translate.c |  17 ++-
 7 files changed, 273 insertions(+), 226 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index a0f10b60eb..4b537c4613 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -2764,7 +2764,10 @@ static inline bool arm_excp_unmasked(CPUState *cs, 
unsigned int excp_idx,
  *  + NonSecure EL1 & 0 stage 1
  *  + NonSecure EL1 & 0 stage 2
  *  + NonSecure EL2
- *  + Secure EL1 & EL0
+ *  + NonSecure EL2 & 0   (ARMv8.1-VHE)
+ *  + Secure EL0
+ *  + Secure EL1
+ *  + Secure EL2  (ARMv8.4-SecEL2)
  *  + Secure EL3
  * If EL3 is 32-bit:
  *  + NonSecure PL1 & 0 stage 1
@@ -2774,8 +2777,9 @@ static inline bool arm_excp_unmasked(CPUState *cs, 
unsigned int excp_idx,
  * (reminder: for 32 bit EL3, Secure PL1 is *EL3*, not EL1.)
  *
  * For QEMU, an mmu_idx is not quite the same as a translation regime because:
- *  1. we need to split the "EL1 & 0" regimes into two mmu_idxes, because they
- * may differ in access permissions even if the VA->PA map is the same
+ *  1. we need to split the "EL1 & 0" and "EL2 & 0" regimes into two mmu_idxes,
+ * because they may differ in access permissions even if the VA->PA map is
+ * the same
  *  2. we want to cache in our TLB the full VA->IPA->PA lookup for a stage 1+2
  * translation, which means that we have one mmu_idx that deals with two
  * concatenated translation regimes [this sort of combined s1+2 TLB is
@@ -2787,19 +2791,26 @@ static inline bool arm_excp_unmasked(CPUState *cs, 
unsigned int excp_idx,
  *  4. we can also safely fold together the "32 bit EL3" and "64 bit EL3"
  * translation regimes, because they map reasonably well to each other
  * and they can't both be active at the same time.
- * This gives us the following list of mmu_idx values:
+ *  5. we want to be able to use the TLB for accesses done as part of a
+ * stage1 page table walk, rather than having to walk the stage2 page
+ * table over and over.
  *
- * NS EL0 (aka NS PL0) stage 1+2
- * NS EL1 (aka NS PL1) stage 1+2
+ * This gives us the following list of cases:
+ *
+ * NS EL0 (aka NS PL0) EL1&0 stage 1+2
+ * NS EL1 (aka NS PL1) EL1&0 stage 1+2
+ * NS EL0 (aka NS PL0) EL2&0
+ * NS EL2 (aka NS PL2) EL2&0
  * NS EL2 (aka NS PL2)
- * S EL3 (aka S PL1)
  * S EL0 (aka S PL0)
  * S EL1 (not used if EL3 is 32 bit)
- * NS EL0+1 stage 2
+ * S EL2 (not used if EL3 is 32 bit)
+ * S EL3 (aka S PL1)
+ * NS EL0&1 stage 2
  *
- * (The last of these is an mmu_idx because we want to be able to use the TLB
- * for the accesses done as part of a stage 1 page table walk, rather than
- * having to walk the stage 2 page table over and over.)
+ * We then merge the two NS EL0 cases, and two NS EL2 cases to produce
+ * 8 different mmu_idx.  We retain separate symbols for these four cases
+ * in order to simplify distinguishing them in the code.
  *
  * R profile CPUs have an MPU, but can use the same set of MMU indexes
  * as A profile. They only need to distinguish NS EL0 and NS EL1 (and
@@ -2837,62 +2848,93 @@ static inline bool arm_excp_unmasked(CPUState *cs, 
unsigned int excp_idx,
  * For M profile we arrange them to have a bit for priv, a bit for negpri
  * and a bit for secure.
  */
-#define ARM_MMU_IDX_A 0x10 /* A profile */
-#define ARM_MMU_IDX_NOTLB 0x20 /* does not have a TLB */
-#define ARM_MMU_IDX_M 0x40 /* M profile */
+#define ARM_MMU_IDX_S 0x04  /* Secure */
+#define ARM_MMU_IDX_A 0x10  /* A profile */
+#define ARM_MMU_IDX_M 0x20  /* M profile */
+#define ARM_MMU_IDX_NOTLB 0x100 /* does not have a TLB */
 
-/* meanings of the bits for M profile mmu idx values */
-#define ARM_MMU_IDX_M_PRIV 0x1
+/* Meanings of the bits for A profile mmu idx values */
+#define ARM_MMU_IDX_A_PRIV   0x3
+#define ARM_MMU_IDX_A_EL10   0x40
+#define ARM_MMU_IDX_A_EL20   0x80
+
+/* Meanings of the bits for M profile mmu idx values */
+#define ARM_MMU_IDX_M_PRIV   0x1
 #define ARM_MMU_IDX_M_NEGPRI 0x2
-#define ARM_MMU_IDX_M_S 0x4
 
-#define ARM_MMU_IDX_TYPE_MASK (~0x7)
+#define ARM_MMU_IDX_TYPE_MASK(ARM_MMU_IDX_A | ARM_MMU_IDX_M)
 #define ARM_MMU_IDX_COREIDX_MASK 0x7
 
 typedef enum ARMMMUIdx {
-ARMMMUIdx_S12NSE0 = 0 | ARM_MMU_IDX_A,
-ARMMMUIdx_S12NSE1 = 1 | ARM_MMU_IDX_A,
-ARMMMUIdx_S1E2 = 2 | ARM_MMU_IDX_A,
-ARMMMUIdx_S1E3 = 3 | ARM_MMU_IDX_A,
-ARMMMUIdx_S1SE0 = 4 | ARM_MMU_IDX_A,
-ARMMMUIdx_S1SE1 = 5 | ARM_MMU_IDX_A,
-ARMMMUIdx_S2NS = 6 | ARM_MMU_IDX_A,
+ARMMMUIdx_E0 = 0 | ARM_MMU_IDX_A,
+ARMMMUIdx_E1 = 1 | 

[Qemu-devel] [PATCH for-4.2 07/24] target/arm: Enable HCR_E2H for VHE

2019-07-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/cpu.h| 7 ---
 target/arm/helper.c | 6 +-
 2 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index e6a76d14c6..e37008a4f7 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -1366,13 +1366,6 @@ static inline void xpsr_write(CPUARMState *env, uint32_t 
val, uint32_t mask)
 #define HCR_ATA   (1ULL << 56)
 #define HCR_DCT   (1ULL << 57)
 
-/*
- * When we actually implement ARMv8.1-VHE we should add HCR_E2H to
- * HCR_MASK and then clear it again if the feature bit is not set in
- * hcr_write().
- */
-#define HCR_MASK  ((1ULL << 34) - 1)
-
 #define SCR_NS(1U << 0)
 #define SCR_IRQ   (1U << 1)
 #define SCR_FIQ   (1U << 2)
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 3a9f35bf4b..0a55096770 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -4623,7 +4623,8 @@ static const ARMCPRegInfo el3_no_el2_v8_cp_reginfo[] = {
 static void hcr_write(CPUARMState *env, const ARMCPRegInfo *ri, uint64_t value)
 {
 ARMCPU *cpu = env_archcpu(env);
-uint64_t valid_mask = HCR_MASK;
+/* Begin with bits defined in base ARMv8.0.  */
+uint64_t valid_mask = MAKE_64BIT_MASK(0, 34);
 
 if (arm_feature(env, ARM_FEATURE_EL3)) {
 valid_mask &= ~HCR_HCD;
@@ -4637,6 +4638,9 @@ static void hcr_write(CPUARMState *env, const 
ARMCPRegInfo *ri, uint64_t value)
  */
 valid_mask &= ~HCR_TSC;
 }
+if (cpu_isar_feature(aa64_vh, cpu)) {
+valid_mask |= HCR_E2H;
+}
 if (cpu_isar_feature(aa64_lor, cpu)) {
 valid_mask |= HCR_TLOR;
 }
-- 
2.17.1




[Qemu-devel] [PATCH for-4.2 21/24] target/arm: Update arm_phys_excp_target_el for TGE

2019-07-19 Thread Richard Henderson
The TGE bit routes all asynchronous exceptions to EL2.

Signed-off-by: Richard Henderson 
---
 target/arm/helper.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index fe022f51d6..f06e7bcd77 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -7928,6 +7928,12 @@ uint32_t arm_phys_excp_target_el(CPUState *cs, uint32_t 
excp_idx,
 break;
 };
 
+/*
+ * For these purposes, TGE and AMO/IMO/FMO both force the
+ * interrupt to EL2.  Fold TGE into the bit extracted above.
+ */
+hcr |= (hcr_el2 & HCR_TGE) != 0;
+
 /* Perform a table-lookup for the target EL given the current state */
 target_el = target_el_table[is64][scr][rw][hcr][secure][cur_el];
 
-- 
2.17.1




[Qemu-devel] [PATCH for-4.2 19/24] target/arm: Install asids for E2&0 translation regime

2019-07-19 Thread Richard Henderson
When clearing HCR_E2H, this involves re-installing the E1&0 asid.

Signed-off-by: Richard Henderson 
---
 target/arm/helper.c | 37 -
 1 file changed, 32 insertions(+), 5 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index db13a8f9c0..22eb056b27 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -3516,12 +3516,28 @@ static void vmsa_ttbr_el1_write(CPUARMState *env, const 
ARMCPRegInfo *ri,
 }
 }
 
+static void update_el2_asid(CPUARMState *env)
+{
+CPUState *cs = env_cpu(env);
+uint64_t ttbr0, ttbr1, ttcr;
+int asid, idxmask;
+
+ttbr0 = env->cp15.ttbr0_el[2];
+ttbr1 = env->cp15.ttbr1_el[2];
+ttcr = env->cp15.tcr_el[2].raw_tcr;
+idxmask = ARMMMUIdxBit_E2 | ARMMMUIdxBit_E0;
+asid = extract64(ttcr & TTBCR_A1 ? ttbr1 : ttbr0, 48, 16);
+
+tlb_set_asid_for_mmuidx(cs, asid, idxmask, 0);
+}
+
 static void vmsa_tcr_ttbr_el2_write(CPUARMState *env, const ARMCPRegInfo *ri,
 uint64_t value)
 {
 raw_write(env, ri, value);
 if (arm_hcr_el2_eff(env) & HCR_E2H) {
-/* The ASID field is active.  */
+/* We are running with EL2&0 regime and the ASID is active.  */
+update_el2_asid(env);
 }
 }
 
@@ -4652,6 +4668,7 @@ static void hcr_write(CPUARMState *env, const 
ARMCPRegInfo *ri, uint64_t value)
 ARMCPU *cpu = env_archcpu(env);
 /* Begin with bits defined in base ARMv8.0.  */
 uint64_t valid_mask = MAKE_64BIT_MASK(0, 34);
+uint64_t old_value;
 
 if (arm_feature(env, ARM_FEATURE_EL3)) {
 valid_mask &= ~HCR_HCD;
@@ -4678,15 +4695,25 @@ static void hcr_write(CPUARMState *env, const 
ARMCPRegInfo *ri, uint64_t value)
 /* Clear RES0 bits.  */
 value &= valid_mask;
 
-/* These bits change the MMU setup:
+old_value = env->cp15.hcr_el2;
+env->cp15.hcr_el2 = value;
+
+/*
+ * These bits change the MMU setup:
  * HCR_VM enables stage 2 translation
  * HCR_PTW forbids certain page-table setups
- * HCR_DC Disables stage1 and enables stage2 translation
+ * HCR_DC disables stage1 and enables stage2 translation
+ * HCR_E2H enables E2&0 translation regime.
  */
-if ((env->cp15.hcr_el2 ^ value) & (HCR_VM | HCR_PTW | HCR_DC)) {
+if ((old_value ^ value) & (HCR_VM | HCR_PTW | HCR_DC | HCR_E2H)) {
 tlb_flush(CPU(cpu));
+/* Also install the correct ASID for the regime.  */
+if (value & HCR_E2H) {
+update_el2_asid(env);
+} else {
+update_lpae_el1_asid(env, false);
+}
 }
-env->cp15.hcr_el2 = value;
 
 /*
  * Updates to VI and VF require us to update the status of
-- 
2.17.1




[Qemu-devel] [PATCH for-4.2 17/24] target/arm: Update arm_mmu_idx for VHE

2019-07-19 Thread Richard Henderson
This covers initial generation in arm_mmu_idx, and reconstruction
in core_to_arm_mmu_idx.  As a conseqeuence, we also need a bit in
TBFLAGS in order to make the latter reliable.

Signed-off-by: Richard Henderson 
---
 target/arm/cpu.h|  2 ++
 target/arm/helper.c | 42 +++---
 2 files changed, 33 insertions(+), 11 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 4b537c4613..7310adfd9b 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -3158,6 +3158,8 @@ FIELD(TBFLAG_ANY, PSTATE_SS, 26, 1)
 /* Target EL if we take a floating-point-disabled exception */
 FIELD(TBFLAG_ANY, FPEXC_EL, 24, 2)
 FIELD(TBFLAG_ANY, BE_DATA, 23, 1)
+/* For A profile only, if EL2 is AA64 and HCR_EL2.E2H is set.  */
+FIELD(TBFLAG_ANY, E2H, 22, 1)
 
 /* Bit usage when in AArch32 state: */
 FIELD(TBFLAG_A32, THUMB, 0, 1)
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 2d5658f9e3..54c328b844 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -11250,21 +11250,29 @@ int fp_exception_el(CPUARMState *env, int cur_el)
 
 ARMMMUIdx core_to_arm_mmu_idx(CPUARMState *env, int mmu_idx)
 {
+bool e2h;
+
 if (arm_feature(env, ARM_FEATURE_M)) {
 return mmu_idx | ARM_MMU_IDX_M;
 }
 
 mmu_idx |= ARM_MMU_IDX_A;
+if (mmu_idx & ARM_MMU_IDX_S) {
+return mmu_idx;
+}
+
+e2h = (env->cp15.hcr_el2 & HCR_E2H) != 0;
+if (!arm_el_is_aa64(env, 2)) {
+e2h = false;
+}
+
 switch (mmu_idx) {
 case ARMMMUIdx_E0:
-return ARMMMUIdx_EL10_0;
+return e2h ? ARMMMUIdx_EL20_0 : ARMMMUIdx_EL10_0;
 case ARMMMUIdx_E1:
 return ARMMMUIdx_EL10_1;
 case ARMMMUIdx_E2:
-case ARMMMUIdx_SE0:
-case ARMMMUIdx_SE1:
-case ARMMMUIdx_SE3:
-return mmu_idx;
+return e2h ? ARMMMUIdx_EL20_2 : ARMMMUIdx_E2;
 default:
 g_assert_not_reached();
 }
@@ -11292,24 +11300,28 @@ ARMMMUIdx arm_v7m_mmu_idx_for_secstate(CPUARMState 
*env, bool secstate)
 
 ARMMMUIdx arm_mmu_idx(CPUARMState *env)
 {
+bool e2h, sec;
 int el;
 
 if (arm_feature(env, ARM_FEATURE_M)) {
 return arm_v7m_mmu_idx_for_secstate(env, env->v7m.secure);
 }
 
+sec = arm_is_secure_below_el3(env);
+e2h = (env->cp15.hcr_el2 & HCR_E2H) != 0;
+if (!arm_el_is_aa64(env, 2)) {
+e2h = false;
+}
+
 el = arm_current_el(env);
 switch (el) {
 case 0:
-/* TODO: ARMv8.1-VHE */
+return sec ? ARMMMUIdx_SE0 : e2h ? ARMMMUIdx_EL20_0 : ARMMMUIdx_EL10_0;
 case 1:
-return (arm_is_secure_below_el3(env)
-? ARMMMUIdx_SE0 + el
-: ARMMMUIdx_EL10_0 + el);
+return sec ? ARMMMUIdx_SE1 : ARMMMUIdx_EL10_1;
 case 2:
-/* TODO: ARMv8.1-VHE */
 /* TODO: ARMv8.4-SecEL2 */
-return ARMMMUIdx_E2;
+return e2h ? ARMMMUIdx_EL20_2 : ARMMMUIdx_E2;
 case 3:
 return ARMMMUIdx_SE3;
 default:
@@ -11421,6 +11433,14 @@ void cpu_get_tb_cpu_state(CPUARMState *env, 
target_ulong *pc,
 
 flags = FIELD_DP32(flags, TBFLAG_ANY, MMUIDX, 
arm_to_core_mmu_idx(mmu_idx));
 
+/*
+ * Include E2H in TBFLAGS so that core_to_arm_mmu_idx can
+ * reliably determine E1&0 vs E2&0 regimes.
+ */
+if (arm_el_is_aa64(env, 2) && (env->cp15.hcr_el2 & HCR_E2H)) {
+flags = FIELD_DP32(flags, TBFLAG_ANY, E2H, 1);
+}
+
 /* The SS_ACTIVE and PSTATE_SS bits correspond to the state machine
  * states defined in the ARM ARM for software singlestep:
  *  SS_ACTIVE   PSTATE.SS   State
-- 
2.17.1




[Qemu-devel] [PATCH for-4.2 10/24] target/arm: Update CNTVCT_EL0 for VHE

2019-07-19 Thread Richard Henderson
The virtual offset may be 0 depending on EL, E2H and TGE.

Signed-off-by: Richard Henderson 
---
 target/arm/helper.c | 40 +---
 1 file changed, 37 insertions(+), 3 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index da2e0627b2..3124d682a2 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -2484,9 +2484,31 @@ static uint64_t gt_cnt_read(CPUARMState *env, const 
ARMCPRegInfo *ri)
 return gt_get_countervalue(env);
 }
 
+static uint64_t gt_virt_cnt_offset(CPUARMState *env)
+{
+uint64_t hcr;
+
+switch (arm_current_el(env)) {
+case 2:
+hcr = arm_hcr_el2_eff(env);
+if (hcr & HCR_E2H) {
+return 0;
+}
+break;
+case 0:
+hcr = arm_hcr_el2_eff(env);
+if ((hcr & (HCR_E2H | HCR_TGE)) == (HCR_E2H | HCR_TGE)) {
+return 0;
+}
+break;
+}
+
+return env->cp15.cntvoff_el2;
+}
+
 static uint64_t gt_virt_cnt_read(CPUARMState *env, const ARMCPRegInfo *ri)
 {
-return gt_get_countervalue(env) - env->cp15.cntvoff_el2;
+return gt_get_countervalue(env) - gt_virt_cnt_offset(env);
 }
 
 static void gt_cval_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -2501,7 +2523,13 @@ static void gt_cval_write(CPUARMState *env, const 
ARMCPRegInfo *ri,
 static uint64_t gt_tval_read(CPUARMState *env, const ARMCPRegInfo *ri,
  int timeridx)
 {
-uint64_t offset = timeridx == GTIMER_VIRT ? env->cp15.cntvoff_el2 : 0;
+uint64_t offset = 0;
+
+switch (timeridx) {
+case GTIMER_VIRT:
+offset = gt_virt_cnt_offset(env);
+break;
+}
 
 return (uint32_t)(env->cp15.c14_timer[timeridx].cval -
   (gt_get_countervalue(env) - offset));
@@ -2511,7 +2539,13 @@ static void gt_tval_write(CPUARMState *env, const 
ARMCPRegInfo *ri,
   int timeridx,
   uint64_t value)
 {
-uint64_t offset = timeridx == GTIMER_VIRT ? env->cp15.cntvoff_el2 : 0;
+uint64_t offset = 0;
+
+switch (timeridx) {
+case GTIMER_VIRT:
+offset = gt_virt_cnt_offset(env);
+break;
+}
 
 trace_arm_gt_tval_write(timeridx, value);
 env->cp15.c14_timer[timeridx].cval = gt_get_countervalue(env) - offset +
-- 
2.17.1




[Qemu-devel] [PATCH for-4.2 14/24] target/arm: Simplify tlb_force_broadcast alternatives

2019-07-19 Thread Richard Henderson
Rather than call to a separate function and re-compute any
parameters for the flush, simply use the correct flush
function directly.

Signed-off-by: Richard Henderson 
---
 target/arm/helper.c | 52 +
 1 file changed, 24 insertions(+), 28 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 7adbf51479..2b95fc763f 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -626,56 +626,54 @@ static void tlbiall_write(CPUARMState *env, const 
ARMCPRegInfo *ri,
   uint64_t value)
 {
 /* Invalidate all (TLBIALL) */
-ARMCPU *cpu = env_archcpu(env);
+CPUState *cs = env_cpu(env);
 
 if (tlb_force_broadcast(env)) {
-tlbiall_is_write(env, NULL, value);
-return;
+tlb_flush_all_cpus_synced(cs);
+} else {
+tlb_flush(cs);
 }
-
-tlb_flush(CPU(cpu));
 }
 
 static void tlbimva_write(CPUARMState *env, const ARMCPRegInfo *ri,
   uint64_t value)
 {
 /* Invalidate single TLB entry by MVA and ASID (TLBIMVA) */
-ARMCPU *cpu = env_archcpu(env);
+CPUState *cs = env_cpu(env);
 
+value &= TARGET_PAGE_MASK;
 if (tlb_force_broadcast(env)) {
-tlbimva_is_write(env, NULL, value);
-return;
+tlb_flush_page_all_cpus_synced(cs, value);
+} else {
+tlb_flush_page(cs, value);
 }
-
-tlb_flush_page(CPU(cpu), value & TARGET_PAGE_MASK);
 }
 
 static void tlbiasid_write(CPUARMState *env, const ARMCPRegInfo *ri,
uint64_t value)
 {
 /* Invalidate by ASID (TLBIASID) */
-ARMCPU *cpu = env_archcpu(env);
+CPUState *cs = env_cpu(env);
 
 if (tlb_force_broadcast(env)) {
-tlbiasid_is_write(env, NULL, value);
-return;
+tlb_flush_all_cpus_synced(cs);
+} else {
+tlb_flush(cs);
 }
-
-tlb_flush(CPU(cpu));
 }
 
 static void tlbimvaa_write(CPUARMState *env, const ARMCPRegInfo *ri,
uint64_t value)
 {
 /* Invalidate single entry by MVA, all ASIDs (TLBIMVAA) */
-ARMCPU *cpu = env_archcpu(env);
+CPUState *cs = env_cpu(env);
 
+value &= TARGET_PAGE_MASK;
 if (tlb_force_broadcast(env)) {
-tlbimvaa_is_write(env, NULL, value);
-return;
+tlb_flush_page_all_cpus_synced(cs, value);
+} else {
+tlb_flush_page(cs, value);
 }
-
-tlb_flush_page(CPU(cpu), value & TARGET_PAGE_MASK);
 }
 
 static void tlbiall_nsnh_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -3926,11 +3924,10 @@ static void tlbi_aa64_vmalle1_write(CPUARMState *env, 
const ARMCPRegInfo *ri,
 int mask = vae1_tlbmask(env);
 
 if (tlb_force_broadcast(env)) {
-tlbi_aa64_vmalle1is_write(env, NULL, value);
-return;
+tlb_flush_by_mmuidx_all_cpus_synced(cs, mask);
+} else {
+tlb_flush_by_mmuidx(cs, mask);
 }
-
-tlb_flush_by_mmuidx(cs, mask);
 }
 
 static int vmalle1_tlbmask(CPUARMState *env)
@@ -4052,11 +4049,10 @@ static void tlbi_aa64_vae1_write(CPUARMState *env, 
const ARMCPRegInfo *ri,
 uint64_t pageaddr = sextract64(value << 12, 0, 56);
 
 if (tlb_force_broadcast(env)) {
-tlbi_aa64_vae1is_write(env, NULL, value);
-return;
+tlb_flush_page_by_mmuidx_all_cpus_synced(cs, pageaddr, mask);
+} else {
+tlb_flush_page_by_mmuidx(cs, pageaddr, mask);
 }
-
-tlb_flush_page_by_mmuidx(cs, pageaddr, mask);
 }
 
 static void tlbi_aa64_vae2is_write(CPUARMState *env, const ARMCPRegInfo *ri,
-- 
2.17.1




[Qemu-devel] [PATCH for-4.2 11/24] target/arm: Add the hypervisor virtual counter

2019-07-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/cpu-qom.h |  1 +
 target/arm/cpu.h | 11 +
 target/arm/cpu.c |  2 ++
 target/arm/helper.c  | 57 
 4 files changed, 66 insertions(+), 5 deletions(-)

diff --git a/target/arm/cpu-qom.h b/target/arm/cpu-qom.h
index 2049fa9612..43fc8296db 100644
--- a/target/arm/cpu-qom.h
+++ b/target/arm/cpu-qom.h
@@ -76,6 +76,7 @@ void arm_gt_ptimer_cb(void *opaque);
 void arm_gt_vtimer_cb(void *opaque);
 void arm_gt_htimer_cb(void *opaque);
 void arm_gt_stimer_cb(void *opaque);
+void arm_gt_hvtimer_cb(void *opaque);
 
 #define ARM_AFF0_SHIFT 0
 #define ARM_AFF0_MASK  (0xFFULL << ARM_AFF0_SHIFT)
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index e37008a4f7..bba4e1f984 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -144,11 +144,12 @@ typedef struct ARMGenericTimer {
 uint64_t ctl; /* Timer Control register */
 } ARMGenericTimer;
 
-#define GTIMER_PHYS 0
-#define GTIMER_VIRT 1
-#define GTIMER_HYP  2
-#define GTIMER_SEC  3
-#define NUM_GTIMERS 4
+#define GTIMER_PHYS 0
+#define GTIMER_VIRT 1
+#define GTIMER_HYP  2
+#define GTIMER_SEC  3
+#define GTIMER_HYPVIRT  4
+#define NUM_GTIMERS 5
 
 typedef struct {
 uint64_t raw_tcr;
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 1959467fdc..90352decc5 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -1218,6 +1218,8 @@ static void arm_cpu_realizefn(DeviceState *dev, Error 
**errp)
   arm_gt_htimer_cb, cpu);
 cpu->gt_timer[GTIMER_SEC] = timer_new(QEMU_CLOCK_VIRTUAL, GTIMER_SCALE,
   arm_gt_stimer_cb, cpu);
+cpu->gt_timer[GTIMER_HYPVIRT] = timer_new(QEMU_CLOCK_VIRTUAL, GTIMER_SCALE,
+  arm_gt_hvtimer_cb, cpu);
 #endif
 
 cpu_exec_realizefn(cs, _err);
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 3124d682a2..329548e45d 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -2527,6 +2527,7 @@ static uint64_t gt_tval_read(CPUARMState *env, const 
ARMCPRegInfo *ri,
 
 switch (timeridx) {
 case GTIMER_VIRT:
+case GTIMER_HYPVIRT:
 offset = gt_virt_cnt_offset(env);
 break;
 }
@@ -2543,6 +2544,7 @@ static void gt_tval_write(CPUARMState *env, const 
ARMCPRegInfo *ri,
 
 switch (timeridx) {
 case GTIMER_VIRT:
+case GTIMER_HYPVIRT:
 offset = gt_virt_cnt_offset(env);
 break;
 }
@@ -2698,6 +2700,34 @@ static void gt_sec_ctl_write(CPUARMState *env, const 
ARMCPRegInfo *ri,
 gt_ctl_write(env, ri, GTIMER_SEC, value);
 }
 
+static void gt_hv_timer_reset(CPUARMState *env, const ARMCPRegInfo *ri)
+{
+gt_timer_reset(env, ri, GTIMER_HYPVIRT);
+}
+
+static void gt_hv_cval_write(CPUARMState *env, const ARMCPRegInfo *ri,
+ uint64_t value)
+{
+gt_cval_write(env, ri, GTIMER_HYPVIRT, value);
+}
+
+static uint64_t gt_hv_tval_read(CPUARMState *env, const ARMCPRegInfo *ri)
+{
+return gt_tval_read(env, ri, GTIMER_HYPVIRT);
+}
+
+static void gt_hv_tval_write(CPUARMState *env, const ARMCPRegInfo *ri,
+ uint64_t value)
+{
+gt_tval_write(env, ri, GTIMER_HYPVIRT, value);
+}
+
+static void gt_hv_ctl_write(CPUARMState *env, const ARMCPRegInfo *ri,
+uint64_t value)
+{
+gt_ctl_write(env, ri, GTIMER_HYPVIRT, value);
+}
+
 void arm_gt_ptimer_cb(void *opaque)
 {
 ARMCPU *cpu = opaque;
@@ -2726,6 +2756,13 @@ void arm_gt_stimer_cb(void *opaque)
 gt_recalc_timer(cpu, GTIMER_SEC);
 }
 
+void arm_gt_hvtimer_cb(void *opaque)
+{
+ARMCPU *cpu = opaque;
+
+gt_recalc_timer(cpu, GTIMER_HYPVIRT);
+}
+
 static const ARMCPRegInfo generic_timer_cp_reginfo[] = {
 /* Note that CNTFRQ is purely reads-as-written for the benefit
  * of software; writing it doesn't actually change the timer frequency.
@@ -6852,6 +6889,26 @@ void register_cp_regs_for_features(ARMCPU *cpu)
   .opc0 = 3, .opc1 = 4, .crn = 2, .crm = 0, .opc2 = 1,
   .access = PL2_RW, .writefn = vmsa_tcr_ttbr_el2_write,
   .fieldoffset = offsetof(CPUARMState, cp15.ttbr1_el[2]) },
+#ifndef CONFIG_USER_ONLY
+{ .name = "CNTHV_CVAL_EL2", .state = ARM_CP_STATE_AA64,
+  .opc0 = 3, .opc1 = 4, .crn = 14, .crm = 3, .opc2 = 2,
+  .fieldoffset =
+offsetof(CPUARMState, cp15.c14_timer[GTIMER_HYPVIRT].cval),
+  .type = ARM_CP_IO, .access = PL2_RW,
+  .writefn = gt_hv_cval_write, .raw_writefn = raw_write },
+{ .name = "CNTHV_TVAL_EL2", .state = ARM_CP_STATE_BOTH,
+  .opc0 = 3, .opc1 = 4, .crn = 14, .crm = 3, .opc2 = 0,
+  .type = ARM_CP_NO_RAW | ARM_CP_IO, .access = PL2_RW,
+  .resetfn = gt_hv_timer_reset,
+  .readfn = gt_hv_tval_read, .writefn = gt_hv_tval_write },
+{ .name = "CNTHV_CTL_EL2", .state = 

[Qemu-devel] [PATCH for-4.2 18/24] target/arm: Update arm_sctlr for VHE

2019-07-19 Thread Richard Henderson
Use this function in many more places in order to select
the correct control.

Signed-off-by: Richard Henderson 
---
 target/arm/cpu.h  | 10 ++
 target/arm/arch_dump.c|  2 +-
 target/arm/helper-a64.c   |  2 +-
 target/arm/helper.c   | 10 +-
 target/arm/pauth_helper.c |  9 +
 5 files changed, 14 insertions(+), 19 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 7310adfd9b..7efbb488d9 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -3104,11 +3104,13 @@ static inline bool arm_sctlr_b(CPUARMState *env)
 static inline uint64_t arm_sctlr(CPUARMState *env, int el)
 {
 if (el == 0) {
-/* FIXME: ARMv8.1-VHE S2 translation regime.  */
-return env->cp15.sctlr_el[1];
-} else {
-return env->cp15.sctlr_el[el];
+if (arm_el_is_aa64(env, 2) && (arm_hcr_el2_eff(env) & HCR_E2H)) {
+el = 2;
+} else {
+el = 1;
+}
 }
+return env->cp15.sctlr_el[el];
 }
 
 
diff --git a/target/arm/arch_dump.c b/target/arm/arch_dump.c
index 26a2c09868..5fbd008d8c 100644
--- a/target/arm/arch_dump.c
+++ b/target/arm/arch_dump.c
@@ -320,7 +320,7 @@ int cpu_get_dump_info(ArchDumpInfo *info,
  * dump a hypervisor that happens to be running an opposite-endian
  * kernel.
  */
-info->d_endian = (env->cp15.sctlr_el[1] & SCTLR_EE) != 0
+info->d_endian = (arm_sctlr(env, 1) & SCTLR_EE) != 0
  ? ELFDATA2MSB : ELFDATA2LSB;
 
 return 0;
diff --git a/target/arm/helper-a64.c b/target/arm/helper-a64.c
index 060699b901..3bf1b731e7 100644
--- a/target/arm/helper-a64.c
+++ b/target/arm/helper-a64.c
@@ -70,7 +70,7 @@ static void daif_check(CPUARMState *env, uint32_t op,
uint32_t imm, uintptr_t ra)
 {
 /* DAIF update to PSTATE. This is OK from EL0 only if UMA is set.  */
-if (arm_current_el(env) == 0 && !(env->cp15.sctlr_el[1] & SCTLR_UMA)) {
+if (arm_current_el(env) == 0 && !(arm_sctlr(env, 0) & SCTLR_UMA)) {
 raise_exception_ra(env, EXCP_UDEF,
syn_aa64_sysregtrap(0, extract32(op, 0, 3),
extract32(op, 3, 3), 4,
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 54c328b844..db13a8f9c0 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -3868,7 +3868,7 @@ static void aa64_fpsr_write(CPUARMState *env, const 
ARMCPRegInfo *ri,
 static CPAccessResult aa64_daif_access(CPUARMState *env, const ARMCPRegInfo 
*ri,
bool isread)
 {
-if (arm_current_el(env) == 0 && !(env->cp15.sctlr_el[1] & SCTLR_UMA)) {
+if (arm_current_el(env) == 0 && !(arm_sctlr(env, 0) & SCTLR_UMA)) {
 return CP_ACCESS_TRAP;
 }
 return CP_ACCESS_OK;
@@ -3887,7 +3887,7 @@ static CPAccessResult aa64_cacheop_access(CPUARMState 
*env,
 /* Cache invalidate/clean: NOP, but EL0 must UNDEF unless
  * SCTLR_EL1.UCI is set.
  */
-if (arm_current_el(env) == 0 && !(env->cp15.sctlr_el[1] & SCTLR_UCI)) {
+if (arm_current_el(env) == 0 && !(arm_sctlr(env, 0) & SCTLR_UCI)) {
 return CP_ACCESS_TRAP;
 }
 return CP_ACCESS_OK;
@@ -4114,7 +4114,7 @@ static CPAccessResult aa64_zva_access(CPUARMState *env, 
const ARMCPRegInfo *ri,
 /* We don't implement EL2, so the only control on DC ZVA is the
  * bit in the SCTLR which can prohibit access for EL0.
  */
-if (arm_current_el(env) == 0 && !(env->cp15.sctlr_el[1] & SCTLR_DZE)) {
+if (arm_current_el(env) == 0 && !(arm_sctlr(env, 0) & SCTLR_DZE)) {
 return CP_ACCESS_TRAP;
 }
 return CP_ACCESS_OK;
@@ -5344,7 +5344,7 @@ static CPAccessResult ctr_el0_access(CPUARMState *env, 
const ARMCPRegInfo *ri,
 /* Only accessible in EL0 if SCTLR.UCT is set (and only in AArch64,
  * but the AArch32 CTR has its own reginfo struct)
  */
-if (arm_current_el(env) == 0 && !(env->cp15.sctlr_el[1] & SCTLR_UCT)) {
+if (arm_current_el(env) == 0 && !(arm_sctlr(env, 0) & SCTLR_UCT)) {
 return CP_ACCESS_TRAP;
 }
 return CP_ACCESS_OK;
@@ -8161,7 +8161,7 @@ static void take_aarch32_exception(CPUARMState *env, int 
new_mode,
 env->uncached_cpsr = (env->uncached_cpsr & ~CPSR_M) | new_mode;
 /* Set new mode endianness */
 env->uncached_cpsr &= ~CPSR_E;
-if (env->cp15.sctlr_el[arm_current_el(env)] & SCTLR_EE) {
+if (arm_sctlr(env, arm_current_el(env)) & SCTLR_EE) {
 env->uncached_cpsr |= CPSR_E;
 }
 /* J and IL must always be cleared for exception entry */
diff --git a/target/arm/pauth_helper.c b/target/arm/pauth_helper.c
index d3194f2043..42c9141bb7 100644
--- a/target/arm/pauth_helper.c
+++ b/target/arm/pauth_helper.c
@@ -386,14 +386,7 @@ static void pauth_check_trap(CPUARMState *env, int el, 
uintptr_t ra)
 
 static bool pauth_key_enabled(CPUARMState *env, int el, uint32_t bit)
 {
-uint32_t sctlr;
-if (el == 0) {
-/* FIXME: ARMv8.1-VHE S2 translation regime.  */

[Qemu-devel] [PATCH for-4.2 22/24] target/arm: Update regime_is_user for EL2&0

2019-07-19 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 target/arm/helper.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index f06e7bcd77..ae3ec9ea67 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -8885,6 +8885,7 @@ static inline bool regime_is_user(CPUARMState *env, 
ARMMMUIdx mmu_idx)
 {
 switch (mmu_idx) {
 case ARMMMUIdx_SE0:
+case ARMMMUIdx_EL20_0:
 case ARMMMUIdx_Stage1_E0:
 case ARMMMUIdx_MUser:
 case ARMMMUIdx_MSUser:
-- 
2.17.1




[Qemu-devel] [PATCH for-4.2 05/24] target/arm: Install ASIDs for EL2

2019-07-19 Thread Richard Henderson
The VMID is the ASID for the 2nd stage page lookup.

Signed-off-by: Richard Henderson 
---
 target/arm/helper.c | 26 --
 1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 1ed7c06313..3a9f35bf4b 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -3452,17 +3452,23 @@ static void vmsa_ttbr_el1_write(CPUARMState *env, const 
ARMCPRegInfo *ri,
 static void vttbr_write(CPUARMState *env, const ARMCPRegInfo *ri,
 uint64_t value)
 {
-ARMCPU *cpu = env_archcpu(env);
-CPUState *cs = CPU(cpu);
+CPUState *cs = env_cpu(env);
+int vmid;
 
-/* Accesses to VTTBR may change the VMID so we must flush the TLB.  */
-if (raw_read(env, ri) != value) {
-tlb_flush_by_mmuidx(cs,
-ARMMMUIdxBit_S12NSE1 |
-ARMMMUIdxBit_S12NSE0 |
-ARMMMUIdxBit_S2NS);
-raw_write(env, ri, value);
-}
+raw_write(env, ri, value);
+
+/*
+ * TODO: with ARMv8.1-VMID16, aarch64 must examine VTCR.VS
+ * (re-evaluating with changes to VTCR) then use bits [63:48].
+ */
+vmid = extract64(value, 48, 8);
+
+/*
+ * A change in VMID to the stage2 page table (S2NS) invalidates
+ * the combined stage 1&2 tlbs (S12NSE1 and S12NSE0).
+ */
+tlb_set_asid_for_mmuidx(cs, vmid, ARMMMUIdxBit_S2NS,
+ARMMMUIdxBit_S12NSE1 | ARMMMUIdxBit_S12NSE0);
 }
 
 static const ARMCPRegInfo vmsa_pmsa_cp_reginfo[] = {
-- 
2.17.1




[Qemu-devel] [PATCH for-4.2 02/24] cputlb: Add tlb_flush_asid_by_mmuidx and friends

2019-07-19 Thread Richard Henderson
Since we have remembered ASIDs, we can further minimize flushing
by comparing against the one we want to flush.

Signed-off-by: Richard Henderson 
---
 include/exec/exec-all.h | 16 +
 include/qom/cpu.h   |  1 +
 accel/tcg/cputlb.c  | 51 +
 3 files changed, 68 insertions(+)

diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 9c77aa5bf9..0d890e1e60 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -240,6 +240,22 @@ void tlb_flush_by_mmuidx_all_cpus_synced(CPUState *cpu, 
uint16_t idxmap);
  */
 void tlb_set_asid_for_mmuidx(CPUState *cpu, uint32_t asid,
  uint16_t idxmap, uint16_t dep_idxmap);
+/**
+ * tlb_flush_asid_by_mmuidx:
+ * @cpu: Originating CPU of the flush
+ * @asid: Address Space Identifier
+ * @idxmap: bitmap of MMU indexes to flush if asid matches
+ *
+ * For each mmu index, if @asid matches the value previously saved via
+ * tlb_set_asid_for_mmuidx, flush the index.
+ */
+void tlb_flush_asid_by_mmuidx(CPUState *cpu, uint32_t asid, uint16_t idxmap);
+/* Similarly, broadcasting to all cpus. */
+void tlb_flush_asid_by_mmuidx_all_cpus(CPUState *cpu, uint32_t asid,
+   uint16_t idxmap);
+/* Similarly, waiting for the broadcast to complete.  */
+void tlb_flush_asid_by_mmuidx_all_cpus_synced(CPUState *cpu, uint32_t asid,
+  uint16_t idxmap);
 /**
  * tlb_set_page_with_attrs:
  * @cpu: CPU to add this TLB entry for
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 5ee0046b62..4ae6ea3e1d 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -283,6 +283,7 @@ struct hax_vcpu_state;
 typedef union {
 int   host_int;
 unsigned long host_ulong;
+uint64_t  host_uint64;
 void *host_ptr;
 vaddr target_ptr;
 } run_on_cpu_data;
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index c68f57755b..3ef68a11bf 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -540,6 +540,57 @@ void tlb_flush_page_all_cpus_synced(CPUState *src, 
target_ulong addr)
 tlb_flush_page_by_mmuidx_all_cpus_synced(src, addr, ALL_MMUIDX_BITS);
 }
 
+static void tlb_flush_asid_by_mmuidx_async_work(CPUState *cpu,
+run_on_cpu_data data)
+{
+CPUTLB *tlb = cpu_tlb(cpu);
+uint32_t asid = data.host_uint64;
+uint16_t idxmap = data.host_uint64 >> 32;
+uint16_t to_flush = 0, work;
+
+assert_cpu_is_self(cpu);
+
+for (work = idxmap; work != 0; work &= work - 1) {
+int mmu_idx = ctz32(work);
+if (tlb->d[mmu_idx].asid == asid) {
+to_flush |= 1 << mmu_idx;
+}
+}
+
+if (to_flush) {
+tlb_flush_by_mmuidx_async_work(cpu, RUN_ON_CPU_HOST_INT(to_flush));
+}
+}
+
+void tlb_flush_asid_by_mmuidx(CPUState *cpu, uint32_t asid, uint16_t idxmap)
+{
+run_on_cpu_data data = { .host_uint64 = deposit64(asid, 32, 32, idxmap) };
+
+if (cpu->created && !qemu_cpu_is_self(cpu)) {
+async_run_on_cpu(cpu, tlb_flush_asid_by_mmuidx_async_work, data);
+} else {
+tlb_flush_asid_by_mmuidx_async_work(cpu, data);
+}
+}
+
+void tlb_flush_asid_by_mmuidx_all_cpus(CPUState *src_cpu,
+   uint32_t asid, uint16_t idxmap)
+{
+run_on_cpu_data data = { .host_uint64 = deposit64(asid, 32, 32, idxmap) };
+
+flush_all_helper(src_cpu, tlb_flush_asid_by_mmuidx_async_work, data);
+tlb_flush_asid_by_mmuidx_async_work(src_cpu, data);
+}
+
+void tlb_flush_asid_by_mmuidx_all_cpus_synced(CPUState *src_cpu,
+  uint32_t asid, uint16_t idxmap)
+{
+run_on_cpu_data data = { .host_uint64 = deposit64(asid, 32, 32, idxmap) };
+
+flush_all_helper(src_cpu, tlb_flush_asid_by_mmuidx_async_work, data);
+async_safe_run_on_cpu(src_cpu, tlb_flush_asid_by_mmuidx_async_work, data);
+}
+
 void tlb_set_asid_for_mmuidx(CPUState *cpu, uint32_t asid, uint16_t idxmap,
  uint16_t depmap)
 {
-- 
2.17.1




[Qemu-devel] [PATCH for-4.2 04/24] target/arm: Install ASIDs for short-form from EL1

2019-07-19 Thread Richard Henderson
This is less complex than the LPAE case, but still we now avoid the
flush in case it is only the PROCID field that is changing.

Signed-off-by: Richard Henderson 
---
 target/arm/helper.c | 34 --
 1 file changed, 24 insertions(+), 10 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 0f21a077de..1ed7c06313 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -551,17 +551,31 @@ static void fcse_write(CPUARMState *env, const 
ARMCPRegInfo *ri, uint64_t value)
 static void contextidr_write(CPUARMState *env, const ARMCPRegInfo *ri,
  uint64_t value)
 {
-ARMCPU *cpu = env_archcpu(env);
-
-if (raw_read(env, ri) != value && !arm_feature(env, ARM_FEATURE_PMSA)
-&& !extended_addresses_enabled(env)) {
-/* For VMSA (when not using the LPAE long descriptor page table
- * format) this register includes the ASID, so do a TLB flush.
- * For PMSA it is purely a process ID and no action is needed.
- */
-tlb_flush(CPU(cpu));
-}
 raw_write(env, ri, value);
+
+/*
+ * For VMSA (when not using the LPAE long descriptor page table format)
+ * this register includes the ASID.  For PMSA it is purely a process ID
+ * and no action is needed.
+ */
+if (!arm_feature(env, ARM_FEATURE_PMSA) &&
+!extended_addresses_enabled(env)) {
+CPUState *cs = env_cpu(env);
+int asid = extract32(value, 0, 8);
+int idxmask;
+
+switch (ri->secure) {
+case ARM_CP_SECSTATE_S:
+idxmask = ARMMMUIdxBit_S1SE1 | ARMMMUIdxBit_S1SE0;
+break;
+case ARM_CP_SECSTATE_NS:
+idxmask = ARMMMUIdxBit_S12NSE1 | ARMMMUIdxBit_S12NSE0;
+break;
+default:
+g_assert_not_reached();
+}
+tlb_set_asid_for_mmuidx(cs, asid, idxmask, 0);
+}
 }
 
 /* IS variants of TLB operations must affect all cores */
-- 
2.17.1




[Qemu-devel] [PATCH for-4.2 01/24] cputlb: Add tlb_set_asid_for_mmuidx

2019-07-19 Thread Richard Henderson
Although we can't do much with ASIDs except remember them, this
will allow cleanups within target/ that should make things clearer.

Signed-off-by: Richard Henderson 

---
v2: Assert cpu_is_self; only flush idx w/ asid mismatch.
---
 include/exec/cpu-all.h  | 11 +++
 include/exec/cpu-defs.h |  2 ++
 include/exec/exec-all.h | 19 +++
 accel/tcg/cputlb.c  | 26 ++
 4 files changed, 58 insertions(+)

diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
index 536ea58f81..40b140cbba 100644
--- a/include/exec/cpu-all.h
+++ b/include/exec/cpu-all.h
@@ -439,4 +439,15 @@ static inline CPUTLB *env_tlb(CPUArchState *env)
 return _neg(env)->tlb;
 }
 
+/**
+ * cpu_tlb(env)
+ * @cpu: The generic CPUState
+ *
+ * Return the CPUTLB state associated with the cpu.
+ */
+static inline CPUTLB *cpu_tlb(CPUState *cpu)
+{
+return _neg(cpu)->tlb;
+}
+
 #endif /* CPU_ALL_H */
diff --git a/include/exec/cpu-defs.h b/include/exec/cpu-defs.h
index 9bc713a70b..73584841c0 100644
--- a/include/exec/cpu-defs.h
+++ b/include/exec/cpu-defs.h
@@ -169,6 +169,8 @@ typedef struct CPUTLBDesc {
 size_t n_used_entries;
 /* The next index to use in the tlb victim table.  */
 size_t vindex;
+/* The current ASID for this tlb.  */
+uint32_t asid;
 /* The tlb victim table, in two parts.  */
 CPUTLBEntry vtable[CPU_VTLB_SIZE];
 CPUIOTLBEntry viotlb[CPU_VTLB_SIZE];
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 16034ee651..9c77aa5bf9 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -225,6 +225,21 @@ void tlb_flush_by_mmuidx_all_cpus(CPUState *cpu, uint16_t 
idxmap);
  * depend on when the guests translation ends the TB.
  */
 void tlb_flush_by_mmuidx_all_cpus_synced(CPUState *cpu, uint16_t idxmap);
+/**
+ * tlb_set_asid_for_mmuidx:
+ * @cpu: Originating cpu
+ * @asid: Address Space Identifier
+ * @idxmap: bitmap of MMU indexes to set to @asid
+ * @depmap: bitmap of dependent MMU indexes
+ *
+ * Set an ASID for all of @idxmap.  If any previous ASID was different,
+ * then we will flush the mmu idx.  If a flush is required, then also flush
+ * all dependent mmu indicies in @depmap.  This latter is typically used
+ * for secondary page resolution, for implementing virtualization within
+ * the guest.
+ */
+void tlb_set_asid_for_mmuidx(CPUState *cpu, uint32_t asid,
+ uint16_t idxmap, uint16_t dep_idxmap);
 /**
  * tlb_set_page_with_attrs:
  * @cpu: CPU to add this TLB entry for
@@ -310,6 +325,10 @@ static inline void 
tlb_flush_by_mmuidx_all_cpus_synced(CPUState *cpu,
uint16_t idxmap)
 {
 }
+static inline void tlb_set_asid_for_mmuidx(CPUState *cpu, uint32_t asid,
+   uint16_t idxmap, uint16_t depmap)
+{
+}
 #endif
 
 #define CODE_GEN_ALIGN   16 /* must be >= of the size of a icache line 
*/
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index bb9897b25a..c68f57755b 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -540,6 +540,32 @@ void tlb_flush_page_all_cpus_synced(CPUState *src, 
target_ulong addr)
 tlb_flush_page_by_mmuidx_all_cpus_synced(src, addr, ALL_MMUIDX_BITS);
 }
 
+void tlb_set_asid_for_mmuidx(CPUState *cpu, uint32_t asid, uint16_t idxmap,
+ uint16_t depmap)
+{
+CPUTLB *tlb = cpu_tlb(cpu);
+uint16_t work, to_flush = 0;
+
+/* It doesn't make sense to set context across cpus.  */
+assert_cpu_is_self(cpu);
+
+/*
+ * We don't support ASIDs except for trivially.
+ * If there is any change, then we must flush the TLB.
+ */
+for (work = idxmap; work != 0; work &= work - 1) {
+int mmu_idx = ctz32(work);
+if (tlb->d[mmu_idx].asid != asid) {
+tlb->d[mmu_idx].asid = asid;
+to_flush |= 1 << mmu_idx;
+}
+}
+if (to_flush) {
+to_flush |= depmap;
+tlb_flush_by_mmuidx_async_work(cpu, RUN_ON_CPU_HOST_INT(to_flush));
+}
+}
+
 /* update the TLBs so that writes to code in the virtual page 'addr'
can be detected */
 void tlb_protect_code(ram_addr_t ram_addr)
-- 
2.17.1




[Qemu-devel] [PATCH for-4.2 00/24] target/arm: Implement ARMv8.1-VHE

2019-07-19 Thread Richard Henderson
About half of this patch set is cleanup of the qemu tlb handling
leading up to the actual implementation of VHE, and the biggest
piece of that: The EL2&0 translation regime.

Testing so far has been limited to booting a debian 9 system with
a 4.9 kernel, and a fedora 30 system with a 5.1 kernel.  Both have
KVM enabled, and both report enabling VHE is successful.


r~


Richard Henderson (24):
  cputlb: Add tlb_set_asid_for_mmuidx
  cputlb: Add tlb_flush_asid_by_mmuidx and friends
  target/arm: Install ASIDs for long-form from EL1
  target/arm: Install ASIDs for short-form from EL1
  target/arm: Install ASIDs for EL2
  target/arm: Define isar_feature_aa64_vh
  target/arm: Enable HCR_E2H for VHE
  target/arm: Add CONTEXTIDR_EL2
  target/arm: Add TTBR1_EL2
  target/arm: Update CNTVCT_EL0 for VHE
  target/arm: Add the hypervisor virtual counter
  target/arm: Add VHE system register redirection and aliasing
  target/arm: Split out vae1_tlbmask, vmalle1_tlbmask
  target/arm: Simplify tlb_force_broadcast alternatives
  target/arm: Reorganize ARMMMUIdx
  target/arm: Add regime_has_2_ranges
  target/arm: Update arm_mmu_idx for VHE
  target/arm: Update arm_sctlr for VHE
  target/arm: Install asids for E2&0 translation regime
  target/arm: Flush tlbs for E2&0 translation regime
  target/arm: Update arm_phys_excp_target_el for TGE
  target/arm: Update regime_is_user for EL2&0
  target/arm: Update {fp,sve}_exception_el for VHE
  target/arm: Enable ARMv8.1-VHE in -cpu max

 include/exec/cpu-all.h |  11 +
 include/exec/cpu-defs.h|   2 +
 include/exec/exec-all.h|  35 ++
 include/qom/cpu.h  |   1 +
 target/arm/cpu-qom.h   |   1 +
 target/arm/cpu.h   | 259 +-
 target/arm/internals.h |  62 ++-
 target/arm/translate.h |   2 +-
 accel/tcg/cputlb.c |  77 +++
 target/arm/arch_dump.c |   2 +-
 target/arm/cpu.c   |   2 +
 target/arm/cpu64.c |   1 +
 target/arm/debug_helper.c  |  50 +-
 target/arm/helper-a64.c|   2 +-
 target/arm/helper.c| 985 ++---
 target/arm/m_helper.c  |   6 +-
 target/arm/pauth_helper.c  |   9 +-
 target/arm/translate-a64.c |  14 +-
 target/arm/translate.c |  17 +-
 19 files changed, 1058 insertions(+), 480 deletions(-)

-- 
2.17.1




[Qemu-devel] [PATCH for-4.2 03/24] target/arm: Install ASIDs for long-form from EL1

2019-07-19 Thread Richard Henderson
In addition to providing the core with the current ASID, this minimizes
both the number of flushes due to non-changing ASID as well as the set
of mmu_idx that are affected by each flush.

In particular, updates to the secure mode registers flushes only the
relevant secure mode mmu_idx's, and similarly non-secure updates only
affect non-secure mmu_idx's.

Signed-off-by: Richard Henderson 
---
 target/arm/helper.c | 73 +
 1 file changed, 48 insertions(+), 25 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 20f8728be1..0f21a077de 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -3327,6 +3327,36 @@ static const ARMCPRegInfo pmsav5_cp_reginfo[] = {
 REGINFO_SENTINEL
 };
 
+/* Called after a change to any of TTBR*_EL1 or TTBCR_EL1.  */
+static void update_lpae_el1_asid(CPUARMState *env, int secure)
+{
+CPUState *cs = env_cpu(env);
+uint64_t ttbr0, ttbr1, ttcr;
+int asid, idxmask;
+
+switch (secure) {
+case ARM_CP_SECSTATE_S:
+ttbr0 = env->cp15.ttbr0_s;
+ttbr1 = env->cp15.ttbr1_s;
+ttcr = env->cp15.tcr_el[3].raw_tcr;
+/* Note that cp15.ttbr0_s == cp15.ttbr0_el[3], so S1E3 is affected.  */
+/* ??? Secure EL3 really using the ASID field?  Doesn't make sense.  */
+idxmask = ARMMMUIdxBit_S1SE1 | ARMMMUIdxBit_S1SE0 | ARMMMUIdxBit_S1E3;
+break;
+case ARM_CP_SECSTATE_NS:
+ttbr0 = env->cp15.ttbr0_ns;
+ttbr1 = env->cp15.ttbr1_ns;
+ttcr = env->cp15.tcr_el[1].raw_tcr;
+idxmask = ARMMMUIdxBit_S12NSE1 | ARMMMUIdxBit_S12NSE0;
+break;
+default:
+g_assert_not_reached();
+}
+asid = extract64(ttcr & TTBCR_A1 ? ttbr1 : ttbr0, 48, 16);
+
+tlb_set_asid_for_mmuidx(cs, asid, idxmask, 0);
+}
+
 static void vmsa_ttbcr_raw_write(CPUARMState *env, const ARMCPRegInfo *ri,
  uint64_t value)
 {
@@ -3363,18 +3393,16 @@ static void vmsa_ttbcr_raw_write(CPUARMState *env, 
const ARMCPRegInfo *ri,
 static void vmsa_ttbcr_write(CPUARMState *env, const ARMCPRegInfo *ri,
  uint64_t value)
 {
-ARMCPU *cpu = env_archcpu(env);
 TCR *tcr = raw_ptr(env, ri);
 
-if (arm_feature(env, ARM_FEATURE_LPAE)) {
-/* With LPAE the TTBCR could result in a change of ASID
- * via the TTBCR.A1 bit, so do a TLB flush.
- */
-tlb_flush(CPU(cpu));
-}
 /* Preserve the high half of TCR_EL1, set via TTBCR2.  */
 value = deposit64(tcr->raw_tcr, 0, 32, value);
 vmsa_ttbcr_raw_write(env, ri, value);
+
+if (arm_feature(env, ARM_FEATURE_LPAE)) {
+/* The A1 bit controls which ASID is active.  */
+update_lpae_el1_asid(env, ri->secure);
+}
 }
 
 static void vmsa_ttbcr_reset(CPUARMState *env, const ARMCPRegInfo *ri)
@@ -3392,24 +3420,19 @@ static void vmsa_ttbcr_reset(CPUARMState *env, const 
ARMCPRegInfo *ri)
 static void vmsa_tcr_el1_write(CPUARMState *env, const ARMCPRegInfo *ri,
uint64_t value)
 {
-ARMCPU *cpu = env_archcpu(env);
-TCR *tcr = raw_ptr(env, ri);
-
-/* For AArch64 the A1 bit could result in a change of ASID, so TLB flush. 
*/
-tlb_flush(CPU(cpu));
-tcr->raw_tcr = value;
+raw_write(env, ri, value);
+/* The A1 bit controls which ASID is active.  */
+update_lpae_el1_asid(env, ri->secure);
 }
 
-static void vmsa_ttbr_write(CPUARMState *env, const ARMCPRegInfo *ri,
-uint64_t value)
+static void vmsa_ttbr_el1_write(CPUARMState *env, const ARMCPRegInfo *ri,
+uint64_t value)
 {
-/* If the ASID changes (with a 64-bit write), we must flush the TLB.  */
-if (cpreg_field_is_64bit(ri) &&
-extract64(raw_read(env, ri) ^ value, 48, 16) != 0) {
-ARMCPU *cpu = env_archcpu(env);
-tlb_flush(CPU(cpu));
-}
 raw_write(env, ri, value);
+if (cpreg_field_is_64bit(ri)) {
+/* The LPAE format (64-bit write) contains an ASID field.  */
+update_lpae_el1_asid(env, ri->secure);
+}
 }
 
 static void vttbr_write(CPUARMState *env, const ARMCPRegInfo *ri,
@@ -3455,12 +3478,12 @@ static const ARMCPRegInfo vmsa_cp_reginfo[] = {
   .fieldoffset = offsetof(CPUARMState, cp15.esr_el[1]), .resetvalue = 0, },
 { .name = "TTBR0_EL1", .state = ARM_CP_STATE_BOTH,
   .opc0 = 3, .opc1 = 0, .crn = 2, .crm = 0, .opc2 = 0,
-  .access = PL1_RW, .writefn = vmsa_ttbr_write, .resetvalue = 0,
+  .access = PL1_RW, .writefn = vmsa_ttbr_el1_write, .resetvalue = 0,
   .bank_fieldoffsets = { offsetof(CPUARMState, cp15.ttbr0_s),
  offsetof(CPUARMState, cp15.ttbr0_ns) } },
 { .name = "TTBR1_EL1", .state = ARM_CP_STATE_BOTH,
   .opc0 = 3, .opc1 = 0, .crn = 2, .crm = 0, .opc2 = 1,
-  .access = PL1_RW, .writefn = vmsa_ttbr_write, .resetvalue = 0,
+  .access = PL1_RW, .writefn = vmsa_ttbr_el1_write, .resetvalue = 0,
 

Re: [Qemu-devel] [PATCH 1/1] x86: add CPU flags supported inside libvirt

2019-07-19 Thread Eduardo Habkost
On Thu, Jul 18, 2019 at 04:45:37PM +0300, Denis V. Lunev wrote:
> There are the following flags available in libvirt inside cpu_map.xm
> 
>   

This is bit 18...

> 
>  
>   
> 
> We have faced the problem that QEMU does not start once these flags are
> present in the domain.xml.
> 
> This patch just adds proper names into the map.
> 
> Signed-off-by: Denis V. Lunev 
> CC: Paolo Bonzini 
> CC: Richard Henderson 
> CC: Eduardo Habkost 
> CC: Nikolay Shirokovskiy 
> CC: Peter Krempa 
> CC: Daniel P. Berrangé 
> ---
>  target/i386/cpu.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 805ce95247..88ba4dad47 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -870,7 +870,7 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] = 
> {
>  "lahf-lm", "cmp-legacy", "svm", "extapic",
>  "cr8legacy", "abm", "sse4a", "misalignsse",
>  "3dnowprefetch", "osvw", "ibs", "xop",
> -"skinit", "wdt", NULL, "lwp",
> +"skinit", "wdt", "cvt16", "lwp",

...this is bit 14.

Anyway, the cvt16 bit was removed on purpose, and was never
supported.  See:
http://mid.mail-archive.com/508091FB.1030705@amd.com

>  "fma4", "tce", NULL, "nodeid-msr",
>  NULL, "tbm", "topoext", "perfctr-core",
>  "perfctr-nb", NULL, NULL, NULL,
> @@ -1044,7 +1044,7 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] 
> = {
>  "fsgsbase", "tsc-adjust", NULL, "bmi1",
>  "hle", "avx2", NULL, "smep",
>  "bmi2", "erms", "invpcid", "rtm",
> -NULL, NULL, "mpx", NULL,
> +"cmt", NULL, "mpx", NULL,

This is one is named "cqm" on Linux (X86_FEATURE_CQM).  I prefer
to keep consistency with the name already in use by Linux than
the one in libvirt that was never used.

You can still add a "cmt" alias property if you think it would be
useful.

Also, I see no code implementing migration of MSR_IA32_QM_EVTSEL,
MSR_IA32_QM_CTR, or other RDT-M state.  If the feature is not
safe for migration yet, you need to explicitly add the feature to
.unmigratable_flags.


>  "avx512f", "avx512dq", "rdseed", "adx",
>  "smap", "avx512ifma", "pcommit", "clflushopt",
>  "clwb", "intel-pt", "avx512pf", "avx512er",
> -- 
> 2.17.1
> 

-- 
Eduardo



Re: [Qemu-devel] [PATCH v3] qapi: Add InetSocketAddress member keep-alive

2019-07-19 Thread Eric Blake
On 6/25/19 8:32 AM, Markus Armbruster wrote:
> I apologize for dragging my feet on this review.
> 
> Vladimir Sementsov-Ogievskiy  writes:
> 
>> It's needed to provide keepalive for nbd client to track server
>> availability.
>>
>> Signed-off-by: Vladimir Sementsov-Ogievskiy 
>> ---
>>

>> +++ b/qapi/sockets.json
>> @@ -53,6 +53,9 @@
>>  #
>>  # @ipv6: whether to accept IPv6 addresses, default try both IPv4 and IPv6
>>  #
>> +# @keep-alive: enable keep-alive when connecting to this socket. Not 
>> supported
>> +#  for server-side connections. (Since 4.1)

It looks like this missed 4.1.  Are you planning on sending v4, to address

>> +#
> 
> Is "server-side connection" is an established term?
> 
> For what it's worth, "passive socket" is, see listen(2).
> 
>>  # Since: 1.3
>>  ##
>>  { 'struct': 'InetSocketAddress',
>> @@ -61,7 +64,8 @@
>>  '*numeric':  'bool',
>>  '*to': 'uint16',
>>  '*ipv4': 'bool',
>> -'*ipv6': 'bool' } }
>> +'*ipv6': 'bool',
>> +'*keep-alive': 'bool' } }
>>  
>>  ##
>>  # @UnixSocketAddress:
>> diff --git a/util/qemu-sockets.c b/util/qemu-sockets.c
>> index 8850a280a8..813063761b 100644
>> --- a/util/qemu-sockets.c
>> +++ b/util/qemu-sockets.c
>> @@ -438,6 +438,12 @@ int inet_connect_saddr(InetSocketAddress *saddr, Error 
>> **errp)
>>  struct addrinfo *res, *e;
>>  int sock = -1;
>>  
>> +if (saddr->keep_alive) {
>> +error_setg(errp, "keep-alive options is not supported for 
>> server-side "
>> +   "connection");
>> +return -1;
>> +}
>> +
>>  res = inet_parse_connect_saddr(saddr, errp);
>>  if (!res) {
>>  return -1;
> 
> I'm afraid you added this to the wrong function; ...
> 
>> @@ -457,6 +463,19 @@ int inet_connect_saddr(InetSocketAddress *saddr, Error 
>> **errp)
>>  }
>>  
>>  freeaddrinfo(res);
>> +
>> +if (saddr->keep_alive) {
> 
> ... it renders this code unreachable.
> 
> I guess the "not supported for passive sockets" check should go into
> inet_listen_saddr() instead.

this comment?

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


[Qemu-devel] [PULL 1/1] nbd: Initialize reply on failure

2019-07-19 Thread Eric Blake
We've had two separate reports of different callers running into use
of uninitialized data if s->quit is set (one detected by gcc -O3,
another by valgrind), due to checking 'nbd_reply_is_simple(reply) ||
s->quit' in the wrong order. Rather than chasing down which callers
need to pre-initialize reply, and whether there are any other
uninitialized uses, it's easier to guarantee that reply will always be
set by nbd_co_receive_one_chunk() even on failure.

The uninitialized use happens to be harmless (the only time the
variable is uninitialized is if s->quit is set, so the conditional
results in the same action regardless of what was read from reply),
and was introduced in commit 65e01d47.

In fixing the problem, it can also be seen that all (one) callers pass
in a non-NULL reply, so there is a dead conditional to also be cleaned
up.

Reported-by: Thomas Huth 
Reported-by: Andrey Shinkevich 
Signed-off-by: Eric Blake 
Message-Id: <20190719172001.19770-1-ebl...@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé 
---
 block/nbd.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/block/nbd.c b/block/nbd.c
index 81edabbf35ed..57c1a205811a 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -640,12 +640,11 @@ static coroutine_fn int nbd_co_receive_one_chunk(
   request_ret, qiov, payload, errp);

 if (ret < 0) {
+memset(reply, 0, sizeof(*reply));
 s->quit = true;
 } else {
 /* For assert at loop start in nbd_connection_entry */
-if (reply) {
-*reply = s->reply;
-}
+*reply = s->reply;
 s->reply.handle = 0;
 }

-- 
2.20.1




[Qemu-devel] [PULL 0/1] NBD patches for -rc2

2019-07-19 Thread Eric Blake
The following changes since commit e2b47666fe1544959c89bd3ed159e9e37cc9fc73:

  Merge remote-tracking branch 'remotes/berrange/tags/misc-next-pull-request' 
into staging (2019-07-19 14:29:13 +0100)

are available in the Git repository at:

  https://repo.or.cz/qemu/ericb.git tags/pull-nbd-2019-07-19

for you to fetch changes up to 5cf42b1c1f75499b467701926d3c9691d27712e1:

  nbd: Initialize reply on failure (2019-07-19 13:19:18 -0500)


nbd patches for 2019-07-19

- silence harmless compiler/valgrind warning


Eric Blake (1):
  nbd: Initialize reply on failure

 block/nbd.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

-- 
2.20.1




[Qemu-devel] [Bug 1818937] Re: Crash with HV_ERROR on macOS host

2019-07-19 Thread Roman Bolshakov
I'm looking into the issue... HV_ERROR is a high-level return value and
doesn't give enough details about the nature of the error. The error is
returned from vmexit handler in AppleHV.kext (which implements kernel
part of Hypervisor.framework). Perhaps we should extract more data from
the VMCS and print it before aborting the execution.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1818937

Title:
  Crash with HV_ERROR on macOS host

Status in QEMU:
  New

Bug description:
  On macOS host running Windows 10 guest, qemu crashed with error
  message: Error: HV_ERROR.

  Host: macOS Mojave 10.14.3 (18D109) Late 2014 Mac mini presumably Core i5 
4278U.
  QEMU: git commit a3e3b0a7bd5de211a62cdf2d6c12b96d3c403560
  QEMU parameter: qemu-system-x86_64 -m 3000 -drive 
file=disk.img,if=virtio,discard=unmap -accel hvf -soundhw hda -smp 3

  thread list
  Process 56054 stopped
thread #1: tid = 0x2ffec8, 0x7fff48d0805a vImage`vLookupTable_Planar16 
+ 970, queue = 'com.apple.main-thread'
thread #2: tid = 0x2ffecc, 0x7fff79d6d7de 
libsystem_kernel.dylib`__psynch_cvwait + 10
thread #3: tid = 0x2ffecd, 0x7fff79d715aa 
libsystem_kernel.dylib`__select + 10
thread #4: tid = 0x2ffece, 0x7fff79d71d9a 
libsystem_kernel.dylib`__sigwait + 10
  * thread #6: tid = 0x2ffed0, 0x7fff79d7023e 
libsystem_kernel.dylib`__pthread_kill + 10, stop reason = signal SIGABRT
thread #7: tid = 0x2ffed1, 0x7fff79d6d7de 
libsystem_kernel.dylib`__psynch_cvwait + 10
thread #8: tid = 0x2ffed2, 0x7fff79d6d7de 
libsystem_kernel.dylib`__psynch_cvwait + 10
thread #11: tid = 0x2fff34, 0x7fff79d6a17a 
libsystem_kernel.dylib`mach_msg_trap + 10, name = 'com.apple.NSEventThread'
thread #30: tid = 0x300c04, 0x7fff79e233f8 
libsystem_pthread.dylib`start_wqthread
thread #31: tid = 0x300c16, 0x7fff79e233f8 
libsystem_pthread.dylib`start_wqthread
thread #32: tid = 0x300c17, 0x
thread #33: tid = 0x300c93, 0x7fff79d6d7de 
libsystem_kernel.dylib`__psynch_cvwait + 10

  
  Crashed thread:

  * thread #6, stop reason = signal SIGABRT
* frame #0: 0x7fff79d7023e libsystem_kernel.dylib`__pthread_kill + 10
  frame #1: 0x7fff79e26c1c libsystem_pthread.dylib`pthread_kill + 285
  frame #2: 0x7fff79cd91c9 libsystem_c.dylib`abort + 127
  frame #3: 0x00010baa476d 
qemu-system-x86_64`assert_hvf_ok(ret=) at hvf.c:106 [opt]
  frame #4: 0x00010baa4c8f 
qemu-system-x86_64`hvf_vcpu_exec(cpu=0x7f8e5283de00) at hvf.c:681 [opt]
  frame #5: 0x00010b988423 
qemu-system-x86_64`qemu_hvf_cpu_thread_fn(arg=0x7f8e5283de00) at 
cpus.c:1636 [opt]
  frame #6: 0x00010bd9dfce 
qemu-system-x86_64`qemu_thread_start(args=) at 
qemu-thread-posix.c:502 [opt]
  frame #7: 0x7fff79e24305 libsystem_pthread.dylib`_pthread_body + 126
  frame #8: 0x7fff79e2726f libsystem_pthread.dylib`_pthread_start + 70
  frame #9: 0x7fff79e23415 libsystem_pthread.dylib`thread_start + 13

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1818937/+subscriptions



[Qemu-devel] [PATCH 1/2] net: assert that tx packets have nonzero size

2019-07-19 Thread Oleinik, Alexander
Virtual devices should not try to send zero-sized packets. The caller
should check the size prior to calling qemu_sendv_packet_async.

Signed-off-by: Alexander Oleinik 
---
 net/net.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/net.c b/net/net.c
index 7d4098254f..fad20bc611 100644
--- a/net/net.c
+++ b/net/net.c
@@ -741,6 +741,9 @@ ssize_t qemu_sendv_packet_async(NetClientState *sender,
 size_t size = iov_size(iov, iovcnt);
 int ret;
 
+/* 0-sized packets are unsupported. Check size in the caller */
+assert(size);
+
 if (size > NET_BUFSIZE) {
 return size;
 }
-- 
2.20.1




[Qemu-devel] [PATCH 0/2] Avoid sending zero-size packets

2019-07-19 Thread Oleinik, Alexander
While fuzzing virtio-net I found that attempting to send
a zero-size packet leads to an assertion failure, when resetting the
device. These patches add an assertion to net/net.c to ensure that
virtual devices do not try to send zero-size packets and change
virtio-net to check that packets have non-zero size, prior to sending.

Alexander Oleinik (2):
  net: assert that tx packets have nonzero size
  virtio-net: check that tx packet has positive size

 hw/net/virtio-net.c | 15 +--
 net/net.c   |  3 +++
 2 files changed, 12 insertions(+), 6 deletions(-)

-- 
2.20.1




[Qemu-devel] [PATCH 2/2] virtio-net: check that tx packet has positive size

2019-07-19 Thread Oleinik, Alexander
virtio_net_flush_tx does not check that the packet size is nonzero,
which causes q->aysnc_tx.elem to be set. Then, when the device is reset, there
is an assertion failure since q->aysnc_tx.elem must be flushed/cleared.
Zero-sized packets are unsupported - check packet size, prior to
sending.

Found while fuzzing virtio-net. The relevant stack-trace:
 #8 in __assert_fail (libc.so.6+0x30101)
 #9 in virtio_net_reset virtio-net.c:520:13
 #10 in virtio_reset virtio.c:1214:9
 #11 in virtio_pci_reset virtio-pci.c:1810:5
 #12 in qdev_reset_one qdev.c:259:5
 #13 in qdev_walk_children qdev.c:575:15
 #14 in qbus_walk_children bus.c:52:15
 #15 in qdev_walk_children qdev.c:567:15
 #16 in qbus_walk_children bus.c:52:15
 #17 in qemu_devices_reset reset.c:69:9
 #18 in pc_machine_reset pc.c:2628:5
 #19 in qemu_system_reset vl.c:1621:9
 #20 in LLVMFuzzerTestOneInput fuzz.c:184:4

Signed-off-by: Alexander Oleinik 
---
 hw/net/virtio-net.c | 15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index b9e1cd71cf..743f1c827c 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -2074,12 +2074,15 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
 out_sg = sg;
 }
 
-ret = qemu_sendv_packet_async(qemu_get_subqueue(n->nic, queue_index),
-  out_sg, out_num, virtio_net_tx_complete);
-if (ret == 0) {
-virtio_queue_set_notification(q->tx_vq, 0);
-q->async_tx.elem = elem;
-return -EBUSY;
+/* Do not try to send 0-sized packets */
+if (iov_size(out_sg, out_num)) {
+ret = qemu_sendv_packet_async(qemu_get_subqueue(n->nic,
+queue_index), out_sg, out_num, virtio_net_tx_complete);
+if (ret == 0) {
+virtio_queue_set_notification(q->tx_vq, 0);
+q->async_tx.elem = elem;
+return -EBUSY;
+}
 }
 
 drop:
-- 
2.20.1




Re: [Qemu-devel] [PATCH] migration: equation is more proper than and to check LOADVM_QUIT

2019-07-19 Thread Dr. David Alan Gilbert
* Wei Yang (richardw.y...@linux.intel.com) wrote:
> LOADVM_QUIT allows a command to quit all layers of nested loadvm loops,
> while current return value check is not that proper even it works now.
> 
> Current return value check "ret & LOADVM_QUIT" would return true if
> bit[0] is 1. This would be true when ret is -1 which is used to indicate
> an error of handling a command.
> 
> Since there is only one place return LOADVM_QUIT and no other
> combination of return value, use "ret == LOADVM_QUIT" would be more
> proper.

Yes, when I first wrote this it was more complex with a few flags, and
they all got removed.

> Signed-off-by: Wei Yang 

Reviewed-by: Dr. David Alan Gilbert 

> ---
>  migration/savevm.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/migration/savevm.c b/migration/savevm.c
> index 1a9b1f411e..25fe7ea05a 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -2429,7 +2429,7 @@ retry:
>  case QEMU_VM_COMMAND:
>  ret = loadvm_process_command(f);
>  trace_qemu_loadvm_state_section_command(ret);
> -if ((ret < 0) || (ret & LOADVM_QUIT)) {
> +if ((ret < 0) || (ret == LOADVM_QUIT)) {
>  goto out;
>  }
>  break;
> -- 
> 2.17.1
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH] migration: use migration_in_postcopy() to check POSTCOPY_ACTIVE

2019-07-19 Thread Dr. David Alan Gilbert
* Wei Yang (richardw.y...@linux.intel.com) wrote:
> Use common helper function to check the state.
> 
> Signed-off-by: Wei Yang 

Reviewed-by: Dr. David Alan Gilbert 

> ---
>  migration/rdma.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/migration/rdma.c b/migration/rdma.c
> index 3036221ee8..0e73e759ca 100644
> --- a/migration/rdma.c
> +++ b/migration/rdma.c
> @@ -3140,7 +3140,7 @@ static size_t qemu_rdma_save_page(QEMUFile *f, void 
> *opaque,
>  
>  CHECK_ERROR_STATE();
>  
> -if (migrate_get_current()->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
> +if (migration_in_postcopy()) {
>  rcu_read_unlock();
>  return RAM_SAVE_CONTROL_NOT_SUPP;
>  }
> @@ -3775,7 +3775,7 @@ static int qemu_rdma_registration_start(QEMUFile *f, 
> void *opaque,
>  
>  CHECK_ERROR_STATE();
>  
> -if (migrate_get_current()->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
> +if (migration_in_postcopy()) {
>  rcu_read_unlock();
>  return 0;
>  }
> @@ -3810,7 +3810,7 @@ static int qemu_rdma_registration_stop(QEMUFile *f, 
> void *opaque,
>  
>  CHECK_ERROR_STATE();
>  
> -if (migrate_get_current()->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
> +if (migration_in_postcopy()) {
>  rcu_read_unlock();
>  return 0;
>  }
> -- 
> 2.17.1
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH] scsi-generic: Check sense key before request snooping and patching

2019-07-19 Thread Dmitry Fomichev
Paolo,

I've tested this version of the patch and it works fine.
Indeed, with this cleanup, the code is more straightforward and robust.

Dmitry

-Original Message-
From: Qemu-devel  On 
Behalf Of Paolo Bonzini
Sent: Friday, July 19, 2019 4:17 AM
To: qemu-devel@nongnu.org
Cc: Shinichiro Kawasaki 
Subject: [Qemu-devel] [PATCH] scsi-generic: Check sense key before request 
snooping and patching

From: Shin'ichiro Kawasaki 

When READ CAPACITY command completes, scsi_read_complete() function
snoops the command result and updates SCSIDevice members blocksize and
max_lba . However, this update is executed even when READ CAPACITY
command indicates an error in sense data. This causes unexpected
blocksize update with zero value for SCSI devices without
READ CAPACITY(10) command support and eventually results in a divide
by zero. An emulated device by TCMU-runner is an example of a device
that doesn't support READ CAPACITY(10) command.

To avoid the unexpected update, add sense key check in
scsi_read_complete() function. The function already checks the sense key
for VPD Block Limits emulation. Do the scsi_parse_sense_buf() call for
all requests rather than just for VPD Block Limits emulation, so that
blocksize and max_lba are only updated if READ CAPACITY returns zero
sense key.

Signed-off-by: Shin'ichiro Kawasaki 
[Extend the check to all requests, not just READ CAPACITY]
Signed-off-by: Paolo Bonzini 
---
 hw/scsi/scsi-generic.c | 28 
 1 file changed, 16 insertions(+), 12 deletions(-)

diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c
index f07891b3f6..c11a0c9a84 100644
--- a/hw/scsi/scsi-generic.c
+++ b/hw/scsi/scsi-generic.c
@@ -254,24 +254,28 @@ static void scsi_read_complete(void * opaque, int ret)
 
 r->len = -1;
 
-/*
- * Check if this is a VPD Block Limits request that
- * resulted in sense error but would need emulation.
- * In this case, emulate a valid VPD response.
- */
-if (s->needs_vpd_bl_emulation && ret == 0 &&
-(r->io_header.driver_status & SG_ERR_DRIVER_SENSE) &&
-r->req.cmd.buf[0] == INQUIRY &&
-(r->req.cmd.buf[1] & 0x01) &&
-r->req.cmd.buf[2] == 0xb0) {
+if (r->io_header.driver_status & SG_ERR_DRIVER_SENSE) {
 SCSISense sense =
 scsi_parse_sense_buf(r->req.sense, r->io_header.sb_len_wr);
-if (sense.key == ILLEGAL_REQUEST) {
+
+/*
+ * Check if this is a VPD Block Limits request that
+ * resulted in sense error but would need emulation.
+ * In this case, emulate a valid VPD response.
+ */
+if (sense.key == ILLEGAL_REQUEST &&
+s->needs_vpd_bl_emulation &&
+r->req.cmd.buf[0] == INQUIRY &&
+(r->req.cmd.buf[1] & 0x01) &&
+r->req.cmd.buf[2] == 0xb0) {
 len = scsi_generic_emulate_block_limits(r, s);
 /*
- * No need to let scsi_read_complete go on and handle an
+ * It's okay to jup to req_complete: no need to
+ * let scsi_handle_inquiry_reply handle an
  * INQUIRY VPD BL request we created manually.
  */
+}
+if (sense.key) {
 goto req_complete;
 }
 }
-- 
2.21.0




[Qemu-devel] [PATCH v1 1/1] riscv/boot: Fixup the RISC-V firmware warning

2019-07-19 Thread Alistair Francis
Fix a typo in the warning message displayed to users, don't print the
message when running inside qtest and don't mention a specific QEMU
version for the deprecation.

Signed-off-by: Alistair Francis 
---
 hw/riscv/boot.c | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/hw/riscv/boot.c b/hw/riscv/boot.c
index 5dee63011b..4b32ab1d26 100644
--- a/hw/riscv/boot.c
+++ b/hw/riscv/boot.c
@@ -26,6 +26,7 @@
 #include "hw/riscv/boot.h"
 #include "hw/boards.h"
 #include "elf.h"
+#include "sysemu/qtest.h"
 
 #if defined(TARGET_RISCV32)
 # define KERNEL_BOOT_ADDRESS 0x8040
@@ -46,10 +47,13 @@ void riscv_find_and_load_firmware(MachineState *machine,
  * In the future this defaul will change to loading the prebuilt
  * OpenSBI firmware. Let's warn the user and then continue.
 */
-warn_report("No -bios option specified. Not loading a firmware.");
-warn_report("This default will change in QEMU 4.3. Please use the " \
-"-bios option to aviod breakages when this happens.");
-warn_report("See QEMU's deprecation documentation for details");
+if (!qtest_enabled()) {
+warn_report("No -bios option specified. Not loading a firmware.");
+warn_report("This default will change in a future QEMU release. " \
+"Please use the -bios option to avoid breakages when "\
+"this happens.");
+warn_report("See QEMU's deprecation documentation for details");
+}
 return;
 }
 
-- 
2.22.0




Re: [Qemu-devel] [PATCH] migration: check length directly to make sure the range is aligned

2019-07-19 Thread Dr. David Alan Gilbert
* Paolo Bonzini (pbonz...@redhat.com) wrote:
> On 19/07/19 19:54, Dr. David Alan Gilbert wrote:
> >> -if ((uintptr_t)host_endaddr & (rb->page_size - 1)) {
> >> -error_report("ram_block_discard_range: Unaligned end address: 
> >> %p",
> >> - host_endaddr);
> >> +if (length & (rb->page_size - 1)) {
> >> +error_report("ram_block_discard_range: Unaligned length: %lx",
> >> + length);
> > Yes, I *think* this is safe, we'll need to watch out for any warnings;
> 
> Do you mean compiler or QEMU warning?

No, I mean lots of these error reports being printed out in some common
case.

Dave

  The patch is safe since there's an
> 
> if ((uintptr_t)host_startaddr & (rb->page_size - 1)) {
> error_report("ram_block_discard_range: Unaligned start address: %p",
>  host_startaddr);
> goto err;
> }
> 
> just before this context.
> 
> Paolo
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH] migration: remove unused field bytes_xfer

2019-07-19 Thread Dr. David Alan Gilbert
* Wei Yang (richardw.y...@linux.intel.com) wrote:
> On Tue, Jun 11, 2019 at 10:33:29AM +0200, Juan Quintela wrote:
> >Wei Yang  wrote:
> >> On Tue, Apr 02, 2019 at 08:31:06AM +0800, Wei Yang wrote:
> >>>MigrationState->bytes_xfer is only set to 0 in migrate_init().
> >>>
> >>>Remove this unnecessary field.
> >>>
> >>>Signed-off-by: Wei Yang 
> >>
> >> Hi, David
> >
> >Hi
> >
> >I am on duty this week, will get it.
> >
> 
> Hi, Juan
> 
> Sounds we lost this one and the one above this :-)
> 

We're in freeze at the moment; we'll pick it up as soon
as it opens up again; I've got quite a list of your clean-up patches!

Dave

> -- 
> Wei Yang
> Help you, Help me
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH] migration: check length directly to make sure the range is aligned

2019-07-19 Thread Paolo Bonzini
On 19/07/19 19:54, Dr. David Alan Gilbert wrote:
>> -if ((uintptr_t)host_endaddr & (rb->page_size - 1)) {
>> -error_report("ram_block_discard_range: Unaligned end address: 
>> %p",
>> - host_endaddr);
>> +if (length & (rb->page_size - 1)) {
>> +error_report("ram_block_discard_range: Unaligned length: %lx",
>> + length);
> Yes, I *think* this is safe, we'll need to watch out for any warnings;

Do you mean compiler or QEMU warning?  The patch is safe since there's an

if ((uintptr_t)host_startaddr & (rb->page_size - 1)) {
error_report("ram_block_discard_range: Unaligned start address: %p",
 host_startaddr);
goto err;
}

just before this context.

Paolo



Re: [Qemu-devel] [PATCH] migration: check length directly to make sure the range is aligned

2019-07-19 Thread Dr. David Alan Gilbert
* Wei Yang (richardw.y...@linux.intel.com) wrote:
> Since the start addr is already checked, to make sure the range is
> aligned, checking the length is enough.
> 
> Signed-off-by: Wei Yang 
> ---
>  exec.c | 7 +++
>  1 file changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/exec.c b/exec.c
> index 50ea9c5aaa..8fa980baae 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -4067,10 +4067,9 @@ int ram_block_discard_range(RAMBlock *rb, uint64_t 
> start, size_t length)
>  
>  if ((start + length) <= rb->used_length) {
>  bool need_madvise, need_fallocate;
> -uint8_t *host_endaddr = host_startaddr + length;
> -if ((uintptr_t)host_endaddr & (rb->page_size - 1)) {
> -error_report("ram_block_discard_range: Unaligned end address: 
> %p",
> - host_endaddr);
> +if (length & (rb->page_size - 1)) {
> +error_report("ram_block_discard_range: Unaligned length: %lx",
> + length);

Yes, I *think* this is safe, we'll need to watch out for any warnings;
David Gibson's balloon fix from February means that the balloon code
should now warn in it's case.

rth: Want to pick this up?

Reviewed-by: Dr. David Alan Gilbert 

>  goto err;
>  }
>  
> -- 
> 2.17.1
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



[Qemu-devel] [Bug 1837218] Re: qemu segfaults after spice update with bochs-display

2019-07-19 Thread post-factum
I've built qemu v4.1.0-rc1 with debug symbols, but got no luck in
reproducing this.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1837218

Title:
  qemu segfaults after spice update with bochs-display

Status in QEMU:
  New

Bug description:
  Description:

  qemu segfaults after latest spice update with bochs-display.
  Downgrading spice solves the issue. Switching to qxl-vga and/or
  virtio-gpu also works even with new spice.

  Additional info:
  * package version(s)

  spice 0.14.2-1 (0.14.0 is unaffected)
  qemu-headless 4.0.0-3

  * config and/or log files etc.

  pf@defiant:~ » /mnt/vms/02-archlinux/start.sh
  /mnt/vms/02-archlinux/start.sh: line 41: 13501 Segmentation fault (core 
dumped) qemu-system-x86_64 -name "${NAME}" -display none -spice 
ipv4,addr=127.0.0.1,port=270${ID},disable-ticketing,disable-copy-paste,disable-agent-file-xfer,agent-mouse=off
 -serial mon:telnet:127.0.0.1:280${ID},server,nowait,nodelay -gdb tcp::260${ID} 
-nodefaults -machine q35,accel=kvm -cpu max -smp 
cores=${CPU},threads=1,sockets=1 -m ${MEM} -drive 
if=pflash,format=raw,readonly,file="${BIOS}" -drive 
if=pflash,format=raw,file="${VARS}" -device virtio-rng -device bochs-display 
-device virtio-keyboard -netdev bridge,id=bridge.0,br=vm0 -device 
virtio-net,mac=${_MAC}:01,netdev=bridge.0,mq=on,vectors=${_VECTORS} -fsdev 
local,id="${NAME}",path="${SHARED}",security_model=mapped,writeout=immediate 
-device virtio-9p-pci,fsdev="${NAME}",mount_tag="shared" -device 
virtio-scsi,id=scsi,num_queues=${CPU},vectors=${_VECTORS} -device 
scsi-hd,drive=hd1 -drive 
if=none,media=disk,id=hd1,file="${DISK1}",format=raw,cache=directsync,discard=unmap,detect-zeroes=unmap
 -device scsi-hd,drive=hd2 -drive 
if=none,media=disk,id=hd2,file="${DISK2}",format=raw,cache=directsync,discard=unmap,detect-zeroes=unmap
 -device scsi-cd,drive=cd1 -drive 
if=none,media=cdrom,id=cd1,file="${CDROM1}",format=raw,cache=directsync

  Steps to reproduce:

  Update spice, launch a VM like the above and observe a segfault.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1837218/+subscriptions



Re: [Qemu-devel] [PATCH] migration/postcopy: use static PostcopyDiscardState instead of allocating it for each block

2019-07-19 Thread Dr. David Alan Gilbert
* Wei Yang (richardw.y...@linux.intel.com) wrote:
> Even we need to do discard for each RAMBlock, we still can leverage the
> same memory space to store the information.
> 
> By doing so, we avoid memory allocation and deallocation to the system
> and also avoid potential failure of memory allocation which breaks the
> migration.
> 
> Signed-off-by: Wei Yang 
> ---
>  migration/postcopy-ram.c | 16 +++-
>  1 file changed, 7 insertions(+), 9 deletions(-)
> 
> diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
> index 9faacacc9e..2e6b076bb7 100644
> --- a/migration/postcopy-ram.c
> +++ b/migration/postcopy-ram.c
> @@ -1377,8 +1377,7 @@ void 
> postcopy_fault_thread_notify(MigrationIncomingState *mis)
>   *   asking to discard individual ranges.
>   *
>   * @ms: The current migration state.
> - * @offset: the bitmap offset of the named RAMBlock in the migration
> - *   bitmap.
> + * @offset: the bitmap offset of the named RAMBlock in the migration bitmap.
>   * @name: RAMBlock that discards will operate on.
>   *
>   * returns: a new PDS.
> @@ -1386,13 +1385,14 @@ void 
> postcopy_fault_thread_notify(MigrationIncomingState *mis)
>  PostcopyDiscardState *postcopy_discard_send_init(MigrationState *ms,
>   const char *name)
>  {
> -PostcopyDiscardState *res = g_malloc0(sizeof(PostcopyDiscardState));
> +static PostcopyDiscardState res = {0};

Do you think it would be better to make this a static at the top of
migration/postcopy-ram.c and then we could remove the pds parameters
from postcopy_discard_send_range and friends?
If there's only one pds then we don't need to pass the pointer around.

Dave

> -if (res) {
> -res->ramblock_name = name;
> -}
> +res.ramblock_name = name;
> +res.cur_entry = 0;
> +res.nsentwords = 0;
> +res.nsentcmds = 0;
>  
> -return res;
> +return 
>  }
>  
>  /**
> @@ -1449,8 +1449,6 @@ void postcopy_discard_send_finish(MigrationState *ms, 
> PostcopyDiscardState *pds)
>  
>  trace_postcopy_discard_send_finish(pds->ramblock_name, pds->nsentwords,
> pds->nsentcmds);
> -
> -g_free(pds);
>  }
>  
>  /*
> -- 
> 2.17.1
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH for-4.1 v2] nbd: Initialize reply on failure

2019-07-19 Thread Philippe Mathieu-Daudé
On 7/19/19 7:20 PM, Eric Blake wrote:
> We've had two separate reports of different callers running into use
> of uninitialized data if s->quit is set (one detected by gcc -O3,
> another by valgrind), due to checking 'nbd_reply_is_simple(reply) ||
> s->quit' in the wrong order. Rather than chasing down which callers
> need to pre-initialize reply, and whether there are any other
> uninitialized uses, it's easier to guarantee that reply will always be
> set by nbd_co_receive_one_chunk() even on failure.
> 
> The uninitialized use happens to be harmless (the only time the
> variable is uninitialized is if s->quit is set, so the conditional
> results in the same action regardless of what was read from reply),
> and was introduced in commit 65e01d47.
> 
> In fixing the problem, it can also be seen that all (one) callers pass
> in a non-NULL reply, so there is a dead condtional to also be cleaned

"conditional"

> up.
> 
> Reported-by: Thomas Huth 
> Reported-by: Andrey Shinkevich 
> Signed-off-by: Eric Blake 
> ---
>  block/nbd.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/block/nbd.c b/block/nbd.c
> index 81edabbf35ed..57c1a205811a 100644
> --- a/block/nbd.c
> +++ b/block/nbd.c
> @@ -640,12 +640,11 @@ static coroutine_fn int nbd_co_receive_one_chunk(
>request_ret, qiov, payload, errp);
> 
>  if (ret < 0) {
> +memset(reply, 0, sizeof(*reply));
>  s->quit = true;
>  } else {
>  /* For assert at loop start in nbd_connection_entry */
> -if (reply) {
> -*reply = s->reply;
> -}
> +*reply = s->reply;
>  s->reply.handle = 0;
>  }
> 

Reviewed-by: Philippe Mathieu-Daudé 



Re: [Qemu-devel] [PATCH for-4.1] target/arm: Limit ID register assertions to TCG

2019-07-19 Thread Philippe Mathieu-Daudé
On 7/18/19 2:59 PM, Peter Maydell wrote:
> In arm_cpu_realizefn() we make several assertions about the values of
> guest ID registers:
>  * if the CPU provides AArch32 v7VE or better it must advertise the
>ARM_DIV feature
>  * if the CPU provides AArch32 A-profile v6 or better it must
>advertise the Jazelle feature
> 
> These are essentially consistency checks that our ID register
> specifications in cpu.c didn't accidentally miss out a feature,
> because increasingly the TCG emulation gates features on the values
> in ID registers rather than using old-style checks of ARM_FEATURE_FOO
> bits.
> 
> Unfortunately, these asserts can cause problems if we're running KVM,
> because in that case we don't control the values of the ID registers
> -- we read them from the host kernel.  In particular, if the host
> kernel is older than 4.15 then it doesn't expose the ID registers via
> the KVM_GET_ONE_REG ioctl, and we set up dummy values for some
> registers and leave the rest at zero.  (See the comment in
> target/arm/kvm64.c kvm_arm_get_host_cpu_features().) This set of
> dummy values is not sufficient to pass our assertions, and so on
> those kernels running an AArch32 guest on AArch64 will assert.
> 
> We could provide a more sophisticated set of dummy ID registers in
> this case, but that still leaves the possibility of a host CPU which
> reports bogus ID register values that would cause us to assert.  It's
> more robust to only do these ID register checks if we're using TCG,
> as that is the only case where this is truly a QEMU code bug.

Agreed, this is clever and simpler.

> 
> Reported-by: Laszlo Ersek 
> Fixes: https://bugs.launchpad.net/qemu/+bug/1830864
> Signed-off-by: Peter Maydell 
> ---
> Laszlo, would you mind testing this on your setup? I don't have
> a system with an old enough kernel to trigger the assert. (The
> change is pretty much a "has to work" one though :-))
> 
>  target/arm/cpu.c | 7 +--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/target/arm/cpu.c b/target/arm/cpu.c
> index 1959467fdc8..9eb40ff755f 100644
> --- a/target/arm/cpu.c
> +++ b/target/arm/cpu.c
> @@ -1369,6 +1369,9 @@ static void arm_cpu_realizefn(DeviceState *dev, Error 
> **errp)
>   * There exist AArch64 cpus without AArch32 support.  When KVM
>   * queries ID_ISAR0_EL1 on such a host, the value is UNKNOWN.
>   * Similarly, we cannot check ID_AA64PFR0 without AArch64 support.
> + * As a general principle, we also do not make ID register
> + * consistency checks anywhere unless using TCG, because only
> + * for TCG would a consistency-check failure be a QEMU bug.
>   */
>  if (arm_feature(>env, ARM_FEATURE_AARCH64)) {
>  no_aa32 = !cpu_isar_feature(aa64_aa32, cpu);
> @@ -1383,7 +1386,7 @@ static void arm_cpu_realizefn(DeviceState *dev, Error 
> **errp)
>   * Presence of EL2 itself is ARM_FEATURE_EL2, and of the
>   * Security Extensions is ARM_FEATURE_EL3.
>   */
> -assert(no_aa32 || cpu_isar_feature(arm_div, cpu));
> +assert(!tcg_enabled() || no_aa32 || cpu_isar_feature(arm_div, cpu));
>  set_feature(env, ARM_FEATURE_LPAE);
>  set_feature(env, ARM_FEATURE_V7);
>  }
> @@ -1409,7 +1412,7 @@ static void arm_cpu_realizefn(DeviceState *dev, Error 
> **errp)
>  if (arm_feature(env, ARM_FEATURE_V6)) {
>  set_feature(env, ARM_FEATURE_V5);
>  if (!arm_feature(env, ARM_FEATURE_M)) {
> -assert(no_aa32 || cpu_isar_feature(jazelle, cpu));
> +assert(!tcg_enabled() || no_aa32 || cpu_isar_feature(jazelle, 
> cpu));
>  set_feature(env, ARM_FEATURE_AUXCR);
>  }
>  }
> 

Reviewed-by: Philippe Mathieu-Daudé 



[Qemu-devel] [PATCH for-4.1 v2] nbd: Initialize reply on failure

2019-07-19 Thread Eric Blake
We've had two separate reports of different callers running into use
of uninitialized data if s->quit is set (one detected by gcc -O3,
another by valgrind), due to checking 'nbd_reply_is_simple(reply) ||
s->quit' in the wrong order. Rather than chasing down which callers
need to pre-initialize reply, and whether there are any other
uninitialized uses, it's easier to guarantee that reply will always be
set by nbd_co_receive_one_chunk() even on failure.

The uninitialized use happens to be harmless (the only time the
variable is uninitialized is if s->quit is set, so the conditional
results in the same action regardless of what was read from reply),
and was introduced in commit 65e01d47.

In fixing the problem, it can also be seen that all (one) callers pass
in a non-NULL reply, so there is a dead condtional to also be cleaned
up.

Reported-by: Thomas Huth 
Reported-by: Andrey Shinkevich 
Signed-off-by: Eric Blake 
---
 block/nbd.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/block/nbd.c b/block/nbd.c
index 81edabbf35ed..57c1a205811a 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -640,12 +640,11 @@ static coroutine_fn int nbd_co_receive_one_chunk(
   request_ret, qiov, payload, errp);

 if (ret < 0) {
+memset(reply, 0, sizeof(*reply));
 s->quit = true;
 } else {
 /* For assert at loop start in nbd_connection_entry */
-if (reply) {
-*reply = s->reply;
-}
+*reply = s->reply;
 s->reply.handle = 0;
 }

-- 
2.20.1




Re: [Qemu-devel] [PATCH] nbd: Initialize reply on failure

2019-07-19 Thread Philippe Mathieu-Daudé
On 7/19/19 7:15 PM, Eric Blake wrote:
> On 7/19/19 10:03 AM, Eric Blake wrote:
>> We've had two separate reports of a caller running into use of
>> uninitialized data if s->quit is set (one detected by gcc -O3, another
>> by valgrind), due to checking 'nbd_reply_is_simple(reply) || s->quit'
>> in the wrong order. Rather than chasing down which callers need to
>> pre-initialize reply, it's easier to guarantee that reply will always
>> be set by nbd_co_receive_one_chunk() even on failure.
>>
>> Reported-by: Thomas Huth 
>> Reported-by: Andrey Shinkevich 
>> Signed-off-by: Eric Blake 
>> ---
>>
> 
> Blech. Needs a v2.  Expanding context:
> 
> 
>> +++ b/block/nbd.c
>> @@ -640,6 +640,7 @@ static coroutine_fn int nbd_co_receive_one_chunk(
>>request_ret, qiov, payload, errp);
>>
>>  if (ret < 0) {
>> +memset(reply, 0, sizeof *reply);
>>  s->quit = true;
>>  } else {
>>  /* For assert at loop start in nbd_connection_entry */
> if (reply) {
> *reply = s->reply;
> }
> 
> either callers can pass in reply==NULL (in which case the memset()
> dereferences NULL, oops), or always pass in non-NULL reply (in which

Oh good catch...

> case the null check is dead code).
> 



Re: [Qemu-devel] [PATCH] nbd: Initialize reply on failure

2019-07-19 Thread Eric Blake
On 7/19/19 10:03 AM, Eric Blake wrote:
> We've had two separate reports of a caller running into use of
> uninitialized data if s->quit is set (one detected by gcc -O3, another
> by valgrind), due to checking 'nbd_reply_is_simple(reply) || s->quit'
> in the wrong order. Rather than chasing down which callers need to
> pre-initialize reply, it's easier to guarantee that reply will always
> be set by nbd_co_receive_one_chunk() even on failure.
> 
> Reported-by: Thomas Huth 
> Reported-by: Andrey Shinkevich 
> Signed-off-by: Eric Blake 
> ---
> 

Blech. Needs a v2.  Expanding context:


> +++ b/block/nbd.c
> @@ -640,6 +640,7 @@ static coroutine_fn int nbd_co_receive_one_chunk(
>request_ret, qiov, payload, errp);
> 
>  if (ret < 0) {
> +memset(reply, 0, sizeof *reply);
>  s->quit = true;
>  } else {
>  /* For assert at loop start in nbd_connection_entry */
if (reply) {
*reply = s->reply;
}

either callers can pass in reply==NULL (in which case the memset()
dereferences NULL, oops), or always pass in non-NULL reply (in which
case the null check is dead code).

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH] nbd: Initialize reply on failure

2019-07-19 Thread Eric Blake
On 7/19/19 10:03 AM, Eric Blake wrote:
> We've had two separate reports of a caller running into use of
> uninitialized data if s->quit is set (one detected by gcc -O3, another
> by valgrind), due to checking 'nbd_reply_is_simple(reply) || s->quit'
> in the wrong order. Rather than chasing down which callers need to
> pre-initialize reply, it's easier to guarantee that reply will always
> be set by nbd_co_receive_one_chunk() even on failure.
> 

I'm adding:

The bug is harmless (the only time uninitialized use is possible is if
s->quit is set, so the conditional resolves to the same branch
regardless of the contents of reply), but was introduced in commit 65e01d47.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PULL 00/19] Bugfix/cleanup patches for 2019-07-16

2019-07-19 Thread Paolo Bonzini
On 19/07/19 18:15, Peter Maydell wrote:
> Hi Paolo -- it looks like this may have broken the
> travis config "--without-default-devices":

Alex Bennée already has a queued patch for this (and I did too but I
removed it in favor of his).  You can apply it directly from message-id
<20190717134335.15351-18-alex.ben...@linaro.org>.

Paolo

> Here's a sample failing build:
> https://travis-ci.org/qemu/qemu/jobs/559509325
> 
> minikconf barfs with "contradiction between clauses"
> 
> Traceback (most recent call last):
>   File "/home/travis/build/qemu/qemu/scripts/minikconf.py", line 703,
> in 
> config = data.compute_config()
>   File "/home/travis/build/qemu/qemu/scripts/minikconf.py", line 253,
> in compute_config
> clause.process()
>   File "/home/travis/build/qemu/qemu/scripts/minikconf.py", line 188, in 
> process
> self.dest.set_value(False, self)
>   File "/home/travis/build/qemu/qemu/scripts/minikconf.py", line 118,
> in set_value
> raise KconfigDataError('contradiction between clauses when setting
> %s' % self)
> __main__.KconfigDataError: contradiction between clauses when setting VMMOUSE
> 
> 
> I guess this is Julio's commit 97fd1ea8c10658?
> 
> thanks
> -- PMM
> 




Re: [Qemu-devel] [PATCH 3/3] migration/savevm: move non SaveStateEntry condition check out of iteration

2019-07-19 Thread Dr. David Alan Gilbert
* Wei Yang (richardw.y...@linux.intel.com) wrote:
> in_postcopy and iterable_only are not SaveStateEntry specific, it would
> be more proper to check them out of iteration.
> 
> Signed-off-by: Wei Yang 

Worth it just to make that big if statement simpler!


Reviewed-by: Dr. David Alan Gilbert 

> ---
>  migration/savevm.c | 13 ++---
>  1 file changed, 6 insertions(+), 7 deletions(-)
> 
> diff --git a/migration/savevm.c b/migration/savevm.c
> index c41e13e322..8a2ada529e 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -1247,8 +1247,7 @@ void qemu_savevm_state_complete_postcopy(QEMUFile *f)
>  }
>  
>  static
> -int qemu_savevm_state_complete_precopy_iterable(QEMUFile *f, bool 
> in_postcopy,
> -bool iterable_only)
> +int qemu_savevm_state_complete_precopy_iterable(QEMUFile *f, bool 
> in_postcopy)
>  {
>  SaveStateEntry *se;
>  int ret;
> @@ -1257,7 +1256,6 @@ int 
> qemu_savevm_state_complete_precopy_iterable(QEMUFile *f, bool in_postcopy,
>  if (!se->ops ||
>  (in_postcopy && se->ops->has_postcopy &&
>   se->ops->has_postcopy(se->opaque)) ||
> -(in_postcopy && !iterable_only) ||
>  !se->ops->save_live_complete_precopy) {
>  continue;
>  }
> @@ -1369,10 +1367,11 @@ int qemu_savevm_state_complete_precopy(QEMUFile *f, 
> bool iterable_only,
>  
>  cpu_synchronize_all_states();
>  
> -ret = qemu_savevm_state_complete_precopy_iterable(f, in_postcopy,
> -  iterable_only);
> -if (ret) {
> -return ret;
> +if (!in_postcopy || iterable_only) {
> +ret = qemu_savevm_state_complete_precopy_iterable(f, in_postcopy);
> +if (ret) {
> +return ret;
> +}
>  }
>  
>  if (iterable_only) {
> -- 
> 2.17.1
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH 2/3] migration/savevm: split qemu_savevm_state_complete_precopy() into two parts

2019-07-19 Thread Dr. David Alan Gilbert
* Wei Yang (richardw.y...@linux.intel.com) wrote:
> This is a preparation patch for further cleanup.
> 
> No functional change, just wrap two major part of
> qemu_savevm_state_complete_precopy() into function.
> 
> Signed-off-by: Wei Yang 

Reviewed-by: Dr. David Alan Gilbert 

> ---
>  migration/savevm.c | 66 ++
>  1 file changed, 49 insertions(+), 17 deletions(-)
> 
> diff --git a/migration/savevm.c b/migration/savevm.c
> index becedcc1c6..c41e13e322 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -1246,23 +1246,12 @@ void qemu_savevm_state_complete_postcopy(QEMUFile *f)
>  qemu_fflush(f);
>  }
>  
> -int qemu_savevm_state_complete_precopy(QEMUFile *f, bool iterable_only,
> -   bool inactivate_disks)
> +static
> +int qemu_savevm_state_complete_precopy_iterable(QEMUFile *f, bool 
> in_postcopy,
> +bool iterable_only)
>  {
> -QJSON *vmdesc;
> -int vmdesc_len;
>  SaveStateEntry *se;
>  int ret;
> -bool in_postcopy = migration_in_postcopy();
> -Error *local_err = NULL;
> -
> -if (precopy_notify(PRECOPY_NOTIFY_COMPLETE, _err)) {
> -error_report_err(local_err);
> -}
> -
> -trace_savevm_state_complete_precopy();
> -
> -cpu_synchronize_all_states();
>  
>  QTAILQ_FOREACH(se, _state.handlers, entry) {
>  if (!se->ops ||
> @@ -1291,9 +1280,18 @@ int qemu_savevm_state_complete_precopy(QEMUFile *f, 
> bool iterable_only,
>  }
>  }
>  
> -if (iterable_only) {
> -goto flush;
> -}
> +return 0;
> +}
> +
> +static
> +int qemu_savevm_state_complete_precopy_non_iterable(QEMUFile *f,
> +bool in_postcopy,
> +bool inactivate_disks)
> +{
> +QJSON *vmdesc;
> +int vmdesc_len;
> +SaveStateEntry *se;
> +int ret;
>  
>  vmdesc = qjson_new();
>  json_prop_int(vmdesc, "page_size", qemu_target_page_size());
> @@ -1353,6 +1351,40 @@ int qemu_savevm_state_complete_precopy(QEMUFile *f, 
> bool iterable_only,
>  }
>  qjson_destroy(vmdesc);
>  
> +return 0;
> +}
> +
> +int qemu_savevm_state_complete_precopy(QEMUFile *f, bool iterable_only,
> +   bool inactivate_disks)
> +{
> +int ret;
> +Error *local_err = NULL;
> +bool in_postcopy = migration_in_postcopy();
> +
> +if (precopy_notify(PRECOPY_NOTIFY_COMPLETE, _err)) {
> +error_report_err(local_err);
> +}
> +
> +trace_savevm_state_complete_precopy();
> +
> +cpu_synchronize_all_states();
> +
> +ret = qemu_savevm_state_complete_precopy_iterable(f, in_postcopy,
> +  iterable_only);
> +if (ret) {
> +return ret;
> +}
> +
> +if (iterable_only) {
> +goto flush;
> +}
> +
> +ret = qemu_savevm_state_complete_precopy_non_iterable(f, in_postcopy,
> +  inactivate_disks);
> +if (ret) {
> +return ret;
> +}
> +
>  flush:
>  qemu_fflush(f);
>  return 0;
> -- 
> 2.17.1
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH 1/3] migration/savevm: flush file for iterable_only case

2019-07-19 Thread Dr. David Alan Gilbert
* Wei Yang (richardw.y...@linux.intel.com) wrote:
> It would be proper to flush file even for iterable_only case.
> 
> Signed-off-by: Wei Yang 

OK, I don't think this is actually necessary; but it's safe.
We only really need the flush at the end of the file, and in the
non-iterable-only case it's not the end of the file.

Since it makes your next change simpler,

Reviewed-by: Dr. David Alan Gilbert 

> ---
>  migration/savevm.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/migration/savevm.c b/migration/savevm.c
> index c0e557b4c2..becedcc1c6 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -1292,7 +1292,7 @@ int qemu_savevm_state_complete_precopy(QEMUFile *f, 
> bool iterable_only,
>  }
>  
>  if (iterable_only) {
> -return 0;
> +goto flush;
>  }
>  
>  vmdesc = qjson_new();
> @@ -1353,6 +1353,7 @@ int qemu_savevm_state_complete_precopy(QEMUFile *f, 
> bool iterable_only,
>  }
>  qjson_destroy(vmdesc);
>  
> +flush:
>  qemu_fflush(f);
>  return 0;
>  }
> -- 
> 2.17.1
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH v2 3/4] hw/arm/boot: Avoid placing the initrd on top of the kernel

2019-07-19 Thread Mark Rutland
Hi Peter,

I've just been testing on QEMU v4.1.0-rc1, and found a case where the
DTB overlapped the end of the kernel, and I think there's a bug in this
patch -- explanation below.

On Thu, May 16, 2019 at 03:47:32PM +0100, Peter Maydell wrote:
> We currently put the initrd at the smaller of:
>  * 128MB into RAM
>  * halfway into the RAM
> (with the dtb following it).
> 
> However for large kernels this might mean that the kernel
> overlaps the initrd. For some kinds of kernel (self-decompressing
> 32-bit kernels, and ELF images with a BSS section at the end)
> we don't know the exact size, but even there we have a
> minimum size. Put the initrd at least further into RAM than
> that. For image formats that can give us an exact kernel size, this
> will mean that we definitely avoid overlaying kernel and initrd.
> 
> Signed-off-by: Peter Maydell 
> ---
>  hw/arm/boot.c | 34 --
>  1 file changed, 20 insertions(+), 14 deletions(-)
> 
> diff --git a/hw/arm/boot.c b/hw/arm/boot.c
> index 935be3b92a5..e441393fdf5 100644
> --- a/hw/arm/boot.c
> +++ b/hw/arm/boot.c
> @@ -999,20 +999,6 @@ static void arm_setup_direct_kernel_boot(ARMCPU *cpu,
>  if (info->nb_cpus == 0)
>  info->nb_cpus = 1;
>  
> -/*
> - * We want to put the initrd far enough into RAM that when the
> - * kernel is uncompressed it will not clobber the initrd. However
> - * on boards without much RAM we must ensure that we still leave
> - * enough room for a decent sized initrd, and on boards with large
> - * amounts of RAM we must avoid the initrd being so far up in RAM
> - * that it is outside lowmem and inaccessible to the kernel.
> - * So for boards with less  than 256MB of RAM we put the initrd
> - * halfway into RAM, and for boards with 256MB of RAM or more we put
> - * the initrd at 128MB.
> - */
> -info->initrd_start = info->loader_start +
> -MIN(info->ram_size / 2, 128 * 1024 * 1024);
> -
>  /* Assume that raw images are linux kernels, and ELF images are not.  */
>  kernel_size = arm_load_elf(info, _entry, _low_addr,
> _high_addr, elf_machine, as);
> @@ -1064,6 +1050,26 @@ static void arm_setup_direct_kernel_boot(ARMCPU *cpu,
>  }
>  
>  info->entry = entry;

Note: this is the start of the kernel image...

> +
> +/*
> + * We want to put the initrd far enough into RAM that when the
> + * kernel is uncompressed it will not clobber the initrd. However
> + * on boards without much RAM we must ensure that we still leave
> + * enough room for a decent sized initrd, and on boards with large
> + * amounts of RAM we must avoid the initrd being so far up in RAM
> + * that it is outside lowmem and inaccessible to the kernel.
> + * So for boards with less  than 256MB of RAM we put the initrd
> + * halfway into RAM, and for boards with 256MB of RAM or more we put
> + * the initrd at 128MB.
> + * We also refuse to put the initrd somewhere that will definitely
> + * overlay the kernel we just loaded, though for kernel formats which
> + * don't tell us their exact size (eg self-decompressing 32-bit kernels)
> + * we might still make a bad choice here.
> + */
> +info->initrd_start = info->loader_start +
> +MAX(MIN(info->ram_size / 2, 128 * 1024 * 1024), kernel_size);

... but here we add kernel_size to the start of the loader, which is
below the kernel. Should that be info->entry?

I've seen this trigger a case where:

* The kernel's image_size is 0x0a7a8000
* The kernel was loaded at   0x4008
* The end of the kernel is   0x4A828000
* The DTB was loaded at  0x4a80

... and the kernel is unable to find a usable DTB.

Thanks,
Mark.

> +info->initrd_start = TARGET_PAGE_ALIGN(info->initrd_start);
> +
>  if (is_linux) {
>  uint32_t fixupcontext[FIXUP_MAX];
>  
> -- 
> 2.20.1
> 



[Qemu-devel] [PATCH v5 5/6] iotests: extended timeout under Valgrind

2019-07-19 Thread Andrey Shinkevich
As the iotests run longer under the Valgrind, the QEMU_COMM_TIMEOUT is
to be increased in the test cases 028, 183 and 192 when running under
the Valgrind.

Suggested-by: Roman Kagan 
Signed-off-by: Andrey Shinkevich 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
---
 tests/qemu-iotests/028 | 6 +-
 tests/qemu-iotests/183 | 9 -
 tests/qemu-iotests/192 | 6 +-
 3 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/tests/qemu-iotests/028 b/tests/qemu-iotests/028
index 01f4959..71301ec 100755
--- a/tests/qemu-iotests/028
+++ b/tests/qemu-iotests/028
@@ -110,7 +110,11 @@ echo
 qemu_comm_method="monitor"
 _launch_qemu -drive file="${TEST_IMG}",cache=${CACHEMODE},id=disk
 h=$QEMU_HANDLE
-QEMU_COMM_TIMEOUT=1
+if [ "${VALGRIND_QEMU}" == "y" ]; then
+QEMU_COMM_TIMEOUT=7
+else
+QEMU_COMM_TIMEOUT=1
+fi
 
 # Silence output since it contains the disk image path and QEMU's readline
 # character echoing makes it very hard to filter the output. Plus, there
diff --git a/tests/qemu-iotests/183 b/tests/qemu-iotests/183
index fbe5a99..04fb344 100755
--- a/tests/qemu-iotests/183
+++ b/tests/qemu-iotests/183
@@ -94,8 +94,15 @@ if echo "$reply" | grep "compiled without old-style" > 
/dev/null; then
 _notrun "migrate -b support not compiled in"
 fi
 
-QEMU_COMM_TIMEOUT=0.1 qemu_cmd_repeat=50 silent=yes \
+timeout_comm=$QEMU_COMM_TIMEOUT
+if [ "${VALGRIND_QEMU}" == "y" ]; then
+QEMU_COMM_TIMEOUT=4
+else
+QEMU_COMM_TIMEOUT=0.1
+fi
+qemu_cmd_repeat=50 silent=yes \
 _send_qemu_cmd $src "{ 'execute': 'query-migrate' }" '"status": 
"completed"'
+QEMU_COMM_TIMEOUT=$timeout_comm
 _send_qemu_cmd $src "{ 'execute': 'query-status' }" "return"
 
 echo
diff --git a/tests/qemu-iotests/192 b/tests/qemu-iotests/192
index 6193257..0344322 100755
--- a/tests/qemu-iotests/192
+++ b/tests/qemu-iotests/192
@@ -60,7 +60,11 @@ fi
 qemu_comm_method="monitor"
 _launch_qemu -drive $DRIVE_ARG -incoming defer
 h=$QEMU_HANDLE
-QEMU_COMM_TIMEOUT=1
+if [ "${VALGRIND_QEMU}" == "y" ]; then
+QEMU_COMM_TIMEOUT=7
+else
+QEMU_COMM_TIMEOUT=1
+fi
 
 _send_qemu_cmd $h "nbd_server_start unix:$TEST_DIR/nbd" "(qemu)"
 _send_qemu_cmd $h "nbd_server_add -w drive0" "(qemu)"
-- 
1.8.3.1




[Qemu-devel] [PATCH v5 3/6] iotests: Add casenotrun report to bash tests

2019-07-19 Thread Andrey Shinkevich
The new function _casenotrun() is to be invoked if a test case cannot
be run for some reason. The user will be notified by a message passed
to the function.

Suggested-by: Kevin Wolf 
Signed-off-by: Andrey Shinkevich 
---
 tests/qemu-iotests/common.rc | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/tests/qemu-iotests/common.rc b/tests/qemu-iotests/common.rc
index 6e461a1..1089050 100644
--- a/tests/qemu-iotests/common.rc
+++ b/tests/qemu-iotests/common.rc
@@ -428,6 +428,13 @@ _notrun()
 exit
 }
 
+# bail out, setting up .casenotrun file
+#
+_casenotrun()
+{
+echo "[case not run] $*" >>"$OUTPUT_DIR/$seq.casenotrun"
+}
+
 # just plain bail out
 #
 _fail()
-- 
1.8.3.1




[Qemu-devel] [PATCH v5 6/6] iotests: extend sleeping time under Valgrind

2019-07-19 Thread Andrey Shinkevich
To synchronize the time when QEMU is running longer under the Valgrind,
increase the sleeping time in the test 247.

Signed-off-by: Andrey Shinkevich 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
---
 tests/qemu-iotests/247 | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/247 b/tests/qemu-iotests/247
index 546a794..c853b73 100755
--- a/tests/qemu-iotests/247
+++ b/tests/qemu-iotests/247
@@ -57,7 +57,11 @@ TEST_IMG="$TEST_IMG.4" _make_test_img $size
 {"execute":"block-commit",
  "arguments":{"device":"format-4", "top-node": "format-2", 
"base-node":"format-0", "job-id":"job0"}}
 EOF
-sleep 1
+if [ "${VALGRIND_QEMU}" == "y" ]; then
+sleep 10
+else
+sleep 1
+fi
 echo '{"execute":"quit"}'
 ) | $QEMU -qmp stdio -nographic -nodefaults \
 -blockdev file,node-name=file-0,filename=$TEST_IMG.0,auto-read-only=on \
-- 
1.8.3.1




[Qemu-devel] [PATCH v5 0/6] Allow Valgrind checking all QEMU processes

2019-07-19 Thread Andrey Shinkevich
In the current implementation of the QEMU bash iotests, only qemu-io
processes may be run under the Valgrind, which is a useful tool for
finding memory usage issues. Let's allow the common.rc bash script
runing all the QEMU processes, such as qemu-kvm, qemu-img, qemu-ndb
and qemu-vxhs, under the Valgrind tool.

v5:
  01: The patch "block/nbd: NBDReply is used being uninitialized" was detached
  and taken into account in the patch "nbd: Initialize reply on failure"
  by Eric Blake.

v4:
  01: The patch "iotests: Set read-zeroes on in null block driver for Valgrind"
  was extended with new cases and issued as a separate series.
  02: The new patch "block/nbd: NBDReply is used being uninitialized" was
  added to resolve the failure of the iotest 083 run under Valgrind.

v3:
  01: The new function _casenotrun() was added to the common.rc bash
  script to notify the user of test cases dropped for some reason.
  Suggested by Kevin.
  Particularly, the notification about the nonexistent TMPDIR in
  the test 051 was added (noticed by Vladimir).
  02: The timeout in some test cases was extended for Valgrind because
  it differs when running on the ramdisk.
  03: Due to the common.nbd script has been changed with the commit
  b28f582c, the patch "iotests: amend QEMU NBD process synchronization"
  is actual no more. Note that QEMU_NBD is launched in the bash nested
  shell in the _qemu_nbd_wrapper() as it was before in common.rc.
  04: The patch "iotests: new file to suppress Valgrind errors" was dropped
  due to my superficial understanding of the work of the function
  blk_pread_unthrottled(). Special thanks to Kevin who shed the light
  on the null block driver involved. Now, the parameter 'read-zeroes=on'
  is passed to the null block driver to initialize the buffer in the
  function guess_disk_lchs() that the Valgrind was complaining to.

v2:
  01: The patch 2/7 of v1 was merged into the patch 1/7, suggested by Daniel.
  02: Another patch 7/7 was added to introduce the Valgrind error suppression
  file into the QEMU project.
  Discussed in the email thread with the message ID:
  <1560276131-683243-1-git-send-email-andrey.shinkev...@virtuozzo.com>

Andrey Shinkevich (6):
  iotests: allow Valgrind checking all QEMU processes
  iotests: exclude killed processes from running under  Valgrind
  iotests: Add casenotrun report to bash tests
  iotests: Valgrind fails with nonexistent directory
  iotests: extended timeout under Valgrind
  iotests: extend sleeping time under Valgrind

 tests/qemu-iotests/028   |  6 +++-
 tests/qemu-iotests/039   |  5 +++
 tests/qemu-iotests/039.out   | 30 +++--
 tests/qemu-iotests/051   |  4 +++
 tests/qemu-iotests/061   |  2 ++
 tests/qemu-iotests/061.out   | 12 ++-
 tests/qemu-iotests/137   |  1 +
 tests/qemu-iotests/137.out   |  6 +---
 tests/qemu-iotests/183   |  9 +-
 tests/qemu-iotests/192   |  6 +++-
 tests/qemu-iotests/247   |  6 +++-
 tests/qemu-iotests/common.rc | 76 +---
 12 files changed, 101 insertions(+), 62 deletions(-)

-- 
1.8.3.1




[Qemu-devel] [PATCH v5 4/6] iotests: Valgrind fails with nonexistent directory

2019-07-19 Thread Andrey Shinkevich
The Valgrind uses the exported variable TMPDIR and fails if the
directory does not exist. Let us exclude such a test case from
being run under the Valgrind and notify the user of it.

Suggested-by: Kevin Wolf 
Signed-off-by: Andrey Shinkevich 
---
 tests/qemu-iotests/051 | 4 
 1 file changed, 4 insertions(+)

diff --git a/tests/qemu-iotests/051 b/tests/qemu-iotests/051
index ce942a5..f8141ca 100755
--- a/tests/qemu-iotests/051
+++ b/tests/qemu-iotests/051
@@ -377,6 +377,10 @@ printf %b "qemu-io $device_id \"write -P 0x33 0 
4k\"\ncommit $device_id\n" |
 $QEMU_IO -c "read -P 0x33 0 4k" "$TEST_IMG" | _filter_qemu_io
 
 # Using snapshot=on with a non-existent TMPDIR
+if [ "${VALGRIND_QEMU}" == "y" ]; then
+_casenotrun "Valgrind needs a valid TMPDIR for itself"
+fi
+VALGRIND_QEMU="" \
 TMPDIR=/nonexistent run_qemu -drive driver=null-co,snapshot=on
 
 # Using snapshot=on together with read-only=on
-- 
1.8.3.1




[Qemu-devel] [PATCH v5 1/6] iotests: allow Valgrind checking all QEMU processes

2019-07-19 Thread Andrey Shinkevich
With the '-valgrind' option, let all the QEMU processes be run under
the Valgrind tool. The Valgrind own parameters may be set with its
environment variable VALGRIND_OPTS, e.g.
VALGRIND_OPTS="--leak-check=yes" ./check -qcow2 -valgrind 
or they may be listed in the Valgrind checked file ./.valgrindrc or
~/.valgrindrc like
--memcheck:leak-check=no
--memcheck:track-origins=yes
When QEMU-IO process is being killed, the shell report refers to the
text of the command in _qemu_io_wrapper(), which was modified with this
patch. So, the benchmark output for the tests 039, 061 and 137 is to be
changed also.

Signed-off-by: Andrey Shinkevich 
---
 tests/qemu-iotests/039.out   | 30 ---
 tests/qemu-iotests/061.out   | 12 ++--
 tests/qemu-iotests/137.out   |  6 +---
 tests/qemu-iotests/common.rc | 69 
 4 files changed, 59 insertions(+), 58 deletions(-)

diff --git a/tests/qemu-iotests/039.out b/tests/qemu-iotests/039.out
index 724d7b2..972c6c0 100644
--- a/tests/qemu-iotests/039.out
+++ b/tests/qemu-iotests/039.out
@@ -11,11 +11,7 @@ No errors were found on the image.
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728
 wrote 512/512 bytes at offset 0
 512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-./common.rc: Killed  ( if [ "${VALGRIND_QEMU}" == "y" ]; then
-exec valgrind --log-file="${VALGRIND_LOGFILE}" --error-exitcode=99 
"$QEMU_IO_PROG" $QEMU_IO_ARGS "$@";
-else
-exec "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@";
-fi )
+./common.rc: Killed  ( _qemu_proc_wrapper 
"${VALGRIND_LOGFILE}" "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@" )
 incompatible_features 0x1
 ERROR cluster 5 refcount=0 reference=1
 ERROR OFLAG_COPIED data cluster: l2_entry=8005 refcount=0
@@ -50,11 +46,7 @@ read 512/512 bytes at offset 0
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728
 wrote 512/512 bytes at offset 0
 512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-./common.rc: Killed  ( if [ "${VALGRIND_QEMU}" == "y" ]; then
-exec valgrind --log-file="${VALGRIND_LOGFILE}" --error-exitcode=99 
"$QEMU_IO_PROG" $QEMU_IO_ARGS "$@";
-else
-exec "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@";
-fi )
+./common.rc: Killed  ( _qemu_proc_wrapper 
"${VALGRIND_LOGFILE}" "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@" )
 incompatible_features 0x1
 ERROR cluster 5 refcount=0 reference=1
 Rebuilding refcount structure
@@ -68,11 +60,7 @@ incompatible_features 0x0
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728
 wrote 512/512 bytes at offset 0
 512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-./common.rc: Killed  ( if [ "${VALGRIND_QEMU}" == "y" ]; then
-exec valgrind --log-file="${VALGRIND_LOGFILE}" --error-exitcode=99 
"$QEMU_IO_PROG" $QEMU_IO_ARGS "$@";
-else
-exec "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@";
-fi )
+./common.rc: Killed  ( _qemu_proc_wrapper 
"${VALGRIND_LOGFILE}" "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@" )
 incompatible_features 0x0
 No errors were found on the image.
 
@@ -91,11 +79,7 @@ No errors were found on the image.
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728
 wrote 512/512 bytes at offset 0
 512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-./common.rc: Killed  ( if [ "${VALGRIND_QEMU}" == "y" ]; then
-exec valgrind --log-file="${VALGRIND_LOGFILE}" --error-exitcode=99 
"$QEMU_IO_PROG" $QEMU_IO_ARGS "$@";
-else
-exec "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@";
-fi )
+./common.rc: Killed  ( _qemu_proc_wrapper 
"${VALGRIND_LOGFILE}" "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@" )
 incompatible_features 0x1
 ERROR cluster 5 refcount=0 reference=1
 ERROR OFLAG_COPIED data cluster: l2_entry=8005 refcount=0
@@ -105,11 +89,7 @@ Data may be corrupted, or further writes to the image may 
corrupt it.
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=134217728
 wrote 512/512 bytes at offset 0
 512 bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-./common.rc: Killed  ( if [ "${VALGRIND_QEMU}" == "y" ]; then
-exec valgrind --log-file="${VALGRIND_LOGFILE}" --error-exitcode=99 
"$QEMU_IO_PROG" $QEMU_IO_ARGS "$@";
-else
-exec "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@";
-fi )
+./common.rc: Killed  ( _qemu_proc_wrapper 
"${VALGRIND_LOGFILE}" "$QEMU_IO_PROG" $QEMU_IO_ARGS "$@" )
 incompatible_features 0x0
 No errors were found on the image.
 *** done
diff --git a/tests/qemu-iotests/061.out b/tests/qemu-iotests/061.out
index 1aa7d37..8cb57eb 100644
--- a/tests/qemu-iotests/061.out
+++ b/tests/qemu-iotests/061.out
@@ -118,11 +118,7 @@ No errors were found on the image.
 Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864
 wrote 131072/131072 bytes at offset 0
 128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-./common.rc: Killed  ( if [ "${VALGRIND_QEMU}" == "y" ]; then
-exec valgrind 

[Qemu-devel] [PATCH v5 2/6] iotests: exclude killed processes from running under Valgrind

2019-07-19 Thread Andrey Shinkevich
 The Valgrind tool fails to manage its termination when QEMU raises the
 signal SIGKILL in the multi-threaded process. The bug has been
 reported to the Valgrind maintainers and was registered as Bug 409141.
 Let's exclude such test cases from running under the Valgrind until
 new release of it because checking for the memory issues is covered by
 other test cases.

Signed-off-by: Andrey Shinkevich 
---
 tests/qemu-iotests/039 | 5 +
 tests/qemu-iotests/061 | 2 ++
 tests/qemu-iotests/137 | 1 +
 3 files changed, 8 insertions(+)

diff --git a/tests/qemu-iotests/039 b/tests/qemu-iotests/039
index 0d4e963..95115e2 100755
--- a/tests/qemu-iotests/039
+++ b/tests/qemu-iotests/039
@@ -65,6 +65,7 @@ echo "== Creating a dirty image file =="
 IMGOPTS="compat=1.1,lazy_refcounts=on"
 _make_test_img $size
 
+VALGRIND_QEMU="" \
 $QEMU_IO -c "write -P 0x5a 0 512" \
  -c "sigraise $(kill -l KILL)" "$TEST_IMG" 2>&1 \
 | _filter_qemu_io
@@ -100,6 +101,7 @@ echo "== Opening a dirty image read/write should repair it 
=="
 IMGOPTS="compat=1.1,lazy_refcounts=on"
 _make_test_img $size
 
+VALGRIND_QEMU="" \
 $QEMU_IO -c "write -P 0x5a 0 512" \
  -c "sigraise $(kill -l KILL)" "$TEST_IMG" 2>&1 \
 | _filter_qemu_io
@@ -118,6 +120,7 @@ echo "== Creating an image file with lazy_refcounts=off =="
 IMGOPTS="compat=1.1,lazy_refcounts=off"
 _make_test_img $size
 
+VALGRIND_QEMU="" \
 $QEMU_IO -c "write -P 0x5a 0 512" \
  -c "sigraise $(kill -l KILL)" "$TEST_IMG" 2>&1 \
 | _filter_qemu_io
@@ -151,6 +154,7 @@ echo "== Changing lazy_refcounts setting at runtime =="
 IMGOPTS="compat=1.1,lazy_refcounts=off"
 _make_test_img $size
 
+VALGRIND_QEMU="" \
 $QEMU_IO -c "reopen -o lazy-refcounts=on" \
  -c "write -P 0x5a 0 512" \
  -c "sigraise $(kill -l KILL)" "$TEST_IMG" 2>&1 \
@@ -163,6 +167,7 @@ _check_test_img
 IMGOPTS="compat=1.1,lazy_refcounts=on"
 _make_test_img $size
 
+VALGRIND_QEMU="" \
 $QEMU_IO -c "reopen -o lazy-refcounts=off" \
  -c "write -P 0x5a 0 512" \
  -c "sigraise $(kill -l KILL)" "$TEST_IMG" 2>&1 \
diff --git a/tests/qemu-iotests/061 b/tests/qemu-iotests/061
index d7dbd7e..5d0724c 100755
--- a/tests/qemu-iotests/061
+++ b/tests/qemu-iotests/061
@@ -73,6 +73,7 @@ echo
 echo "=== Testing dirty version downgrade ==="
 echo
 IMGOPTS="compat=1.1,lazy_refcounts=on" _make_test_img 64M
+VALGRIND_QEMU="" \
 $QEMU_IO -c "write -P 0x2a 0 128k" -c flush \
  -c "sigraise $(kill -l KILL)" "$TEST_IMG" 2>&1 | _filter_qemu_io
 $PYTHON qcow2.py "$TEST_IMG" dump-header
@@ -107,6 +108,7 @@ echo
 echo "=== Testing dirty lazy_refcounts=off ==="
 echo
 IMGOPTS="compat=1.1,lazy_refcounts=on" _make_test_img 64M
+VALGRIND_QEMU="" \
 $QEMU_IO -c "write -P 0x2a 0 128k" -c flush \
  -c "sigraise $(kill -l KILL)" "$TEST_IMG" 2>&1 | _filter_qemu_io
 $PYTHON qcow2.py "$TEST_IMG" dump-header
diff --git a/tests/qemu-iotests/137 b/tests/qemu-iotests/137
index 0c3d2a1..a442fc8 100755
--- a/tests/qemu-iotests/137
+++ b/tests/qemu-iotests/137
@@ -130,6 +130,7 @@ echo
 
 # Whether lazy-refcounts was actually enabled can easily be tested: Check if
 # the dirty bit is set after a crash
+VALGRIND_QEMU="" \
 $QEMU_IO \
 -c "reopen -o lazy-refcounts=on,overlap-check=blubb" \
 -c "write -P 0x5a 0 512" \
-- 
1.8.3.1




Re: [Qemu-devel] [PATCH-for-4.1] crypto: Fix data type for len parameter in two typedefs

2019-07-19 Thread Daniel P . Berrangé
On Thu, Jul 18, 2019 at 09:51:23PM +0200, Stefan Weil wrote:
> Signed-off-by: Stefan Weil 
> ---
>  crypto/hash-nettle.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

FYI, A different fix for this problem is now merged.


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|



Re: [Qemu-devel] [PATCH 3/3] riscv: sifive_u: Fix clock-names property for ethernet node

2019-07-19 Thread Alistair Francis
On Fri, Jul 19, 2019 at 6:41 AM Guenter Roeck  wrote:
>
> The correct property name is clock-names, not clocks-names.
>
> Without this patch, the Ethernet driver fails to instantiate with
> the following error.
>
> macb 100900fc.ethernet: failed to get macb_clk (-2)
> macb: probe of 100900fc.ethernet failed with error -2
>
> Signed-off-by: Guenter Roeck 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  hw/riscv/sifive_u.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/hw/riscv/sifive_u.c b/hw/riscv/sifive_u.c
> index 5a221c6..64e233d 100644
> --- a/hw/riscv/sifive_u.c
> +++ b/hw/riscv/sifive_u.c
> @@ -215,7 +215,7 @@ static void *create_fdt(SiFiveUState *s, const struct 
> MemmapEntry *memmap,
>  qemu_fdt_setprop_cells(fdt, nodename, "interrupts", SIFIVE_U_GEM_IRQ);
>  qemu_fdt_setprop_cells(fdt, nodename, "clocks",
>  ethclk_phandle, ethclk_phandle, ethclk_phandle);
> -qemu_fdt_setprop(fdt, nodename, "clocks-names", ethclk_names,
> +qemu_fdt_setprop(fdt, nodename, "clock-names", ethclk_names,
>  sizeof(ethclk_names));
>  qemu_fdt_setprop_cells(fdt, nodename, "#address-cells", 1);
>  qemu_fdt_setprop_cells(fdt, nodename, "#size-cells", 0);
> --
> 2.7.4
>
>



Re: [Qemu-devel] [PATCH-for-4.1 v7 1/1] hw/block/pflash_cfi01: Add missing DeviceReset() handler

2019-07-19 Thread Philippe Mathieu-Daudé
Hi Laszlo,

On 7/18/19 9:35 PM, Philippe Mathieu-Daudé wrote:
> On 7/18/19 8:38 PM, Laszlo Ersek wrote:
>> On 07/18/19 17:03, Laszlo Ersek wrote:
>>> On 07/18/19 12:48, Philippe Mathieu-Daudé wrote:
 To avoid incoherent states when the machine resets (see but report
[...]>>> (3) Using OVMF IA32X64 (including the edk2 SMM stack), I've
>>> regression-tested this patch, on top of v4.1.0-rc1, with KVM. As follows:
>>>
>>> (3a) Normal reboot from the UEFI shell ("reset -c" command)
>>>
>>> (3b) Normal reboot from the Linux guest prompt ("reboot" command)
>>>
>>> (3c1) Reset as part of ACPI S3 suspend/resume
>>> (3c2) then use "efibootmgr -n / -N" to write to pflash (by virtue of
>>> setting / deleting the standardized BootNext UEFI variable)
>>>
>>> (3d1) Boot to setup TUI with SB enabled
>>> (3d2) erase Platform Key in setup TUI (disables SB)
>>> (3d3) reboot from within setup TUI
>>> (3d4) proceed to UEFI shell
>>> (3d5) enable SB with EnrollDefaultKeys.efi
>>> (3d6) reboot from UEFI shell
>>> (3d7) proceeed to Linux guest
>>> (3d8) verify SB enablement (dmesg, "mokutil --sb-state")
>>>
>>> (As an added exercise, step (3d4) triggered an "FTW" (fault tolerant
>>> write) "reclaim" (basically a defragmentation of the journaled
>>> "filesystem" that the firmware keeps in the flash, as a logical "middle
>>> layer"), and that worked fine too.)
>>>
>>> Regression-tested-by: Laszlo Ersek 
>>>
>>>
>>> (4) I plan to provide R-t-b in the evening from aarch64 KVM too, using
>>> the edk2 ArmVirtQemu firmware. Only the first two steps from (3) will be
>>> covered (no ACPI S3, no SB).
>>
>> Regression-tested-by: Laszlo Ersek 

Patchwork doesn't recognize your R-t-b tag:

https://patchwork.ozlabs.org/patch/1133671/

Should I change it for a Tested-by, or add as it?

Thanks,

Phil.

> Thank you a lot again for all your testing, I also noted your steps and
> will try to automate them.
> 
> Best regards,
> 
> Phil.
> 



Re: [Qemu-devel] [PULL] RISC-V Patches for 4.2-rc2

2019-07-19 Thread Alistair Francis
On Fri, Jul 19, 2019 at 4:11 AM Peter Maydell  wrote:
>
> On Fri, 19 Jul 2019 at 12:03, Peter Maydell  wrote:
> > This passes the 'make check' tests but it prints out a lot
> > of warnings as it does so:
> >
> > qemu-system-riscv64: warning: No -bios option specified. Not loading a 
> > firmware.
> > qemu-system-riscv64: warning: This default will change in QEMU 4.3.
> > Please use the -bios option to aviod breakages when this happens.
> > qemu-system-riscv64: warning: See QEMU's deprecation documentation for 
> > details
> >
> > (repeated 7 or 8 times during the course of a test run)
> >
> > Can we make the tests not trigger warnings, please?
> > (I have a filter where I search through for strings like
> > "warning" because warnings that shouldn't happen often don't
> > actually cause the tests to fail.)
>
> Forgot to mention, but a common way to do this is to say
> "don't print the warnings about bios image loading if
> qtest_enabled(), because with qtest we never execute any
> guest code anyway". That will probably fix the warnings here.

Testing my patch now, I'll send it out today.

>
> > Also, I notice that you have a typo: "aviod" should be "avoid".

Fixed.

>
> Also also, the warning message mentions "QEMU 4.3", but our
> versioning system bumps the major version every year, so
> the pending release is 4.1, the next one will be 4.2, and
> then the release after that will be 5.0 because it will be the
> first release in 2020.

I just dropped the version number in the warning message.

Alistair

>
> (Plus, your merge commit message says this pullreq is
> for 4.2-rc2, which is a typo for 4.1-rc2 I assume.)
>
> Since this pullreq does pass the tests, and rc2 is not far off
> now (Tuesday), I think my suggestion is that I'll apply this
> as-is, and we should fix up the issues with the warning messages
> as a followup patch. I think that's better than holding this
> out of master and making it risk missing rc2.
>
> thanks
> -- PMM



[Qemu-devel] [Bug 1829242] Re: qemu on windows host exits after savevm command

2019-07-19 Thread Dr. David Alan Gilbert
that code should be ok; if you can find it with a debug I'd try and
figure out what block and page pss is currently pointing to.  Is it
normal RAM or something special?

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1829242

Title:
  qemu on windows host exits after savevm command

Status in QEMU:
  New

Bug description:
  I'm running qemu-system-i386.exe 3.1.0 with this command line:
  "C:\Program Files\qemu\qemu-system-i386.exe"  -L C:\user\qemu\pc-bios\ -name 
win7 -m 4G -uuid 564db62e-e031-b5cf-5f34-a75f8cefa98e -rtc base=localtime 
-accel hax -hdd C:\VirtualMachines\Dev\Win10x64_VS17\swap.qcow 
"C:\VirtualMachines\qemu\qemu_win7.qcow"
  Host OS Windows 10 x64, guest OS Wondows 7 x86.

  Wait till the OS loads, go to compat_monitor0 tab and enter command:
  savevm loaded_win
  After a few seconds qemu exits, running it another time and entering command:
  info snapshots
  says "There is no snapshot available". I've tried rinning it with -accel tcg, 
with same results. I've tried less memory (1G), same results.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1829242/+subscriptions



[Qemu-devel] [PATCH v1 0/3] virtio-balloon: PartialBalloonedPage rework

2019-07-19 Thread David Hildenbrand
Michael pointed out that stroing and using the address of a RAMBlock
might not be safe. So let's rework the pbp handling, cleaning up the
code.

Did a sanity test with hugepage backing on x86.64.

We might want to have this in 4.1. I'll let Michael decide.

Cc: Stefan Hajnoczi 
Cc: David Gibson 
Cc: Michael S. Tsirkin 
Cc: Igor Mammedov 

David Hildenbrand (3):
  virtio-balloon: simplify deflate with pbp
  virtio-balloon: Better names for offset variables in inflate/deflate
code
  virtio-balloon: Rework pbp tracking data

 hw/virtio/virtio-balloon.c | 90 ++
 include/hw/virtio/virtio-balloon.h | 15 -
 2 files changed, 53 insertions(+), 52 deletions(-)

-- 
2.21.0




Re: [Qemu-devel] [PULL 00/19] Bugfix/cleanup patches for 2019-07-16

2019-07-19 Thread Peter Maydell
On Tue, 16 Jul 2019 at 09:11, Paolo Bonzini  wrote:
>
> The following changes since commit 46cd24e7ed38191b5ab5c40a836d6c5b6b604f8a:
>
>   Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging 
> (2019-07-12 17:34:13 +0100)
>
> are available in the git repository at:
>
>
>   git://github.com/bonzini/qemu.git tags/for-upstream
>
> for you to fetch changes up to 45d8bc3adedeceaf449d758aee1810bfbe6feff4:
>
>   vl: make sure char-pty message displayed by moving setbuf to the beginning 
> (2019-07-16 09:27:16 +0200)
>
> 
> * VFIO bugfix for AMD SEV (Alex)
> * Kconfig improvements (Julio, Philippe)
> * MemoryRegion reference counting bugfix (King Wang)
> * Build system cleanups (Marc-André, myself)
> * rdmacm-mux off-by-one (Marc-André)
> * ZBC passthrough fixes (Shinichiro, myself)
> * WHPX build fix (Stefan)
> * char-pty fix (Wei Yang)

Hi Paolo -- it looks like this may have broken the
travis config "--without-default-devices":

Here's a sample failing build:
https://travis-ci.org/qemu/qemu/jobs/559509325

minikconf barfs with "contradiction between clauses"

Traceback (most recent call last):
  File "/home/travis/build/qemu/qemu/scripts/minikconf.py", line 703,
in 
config = data.compute_config()
  File "/home/travis/build/qemu/qemu/scripts/minikconf.py", line 253,
in compute_config
clause.process()
  File "/home/travis/build/qemu/qemu/scripts/minikconf.py", line 188, in process
self.dest.set_value(False, self)
  File "/home/travis/build/qemu/qemu/scripts/minikconf.py", line 118,
in set_value
raise KconfigDataError('contradiction between clauses when setting
%s' % self)
__main__.KconfigDataError: contradiction between clauses when setting VMMOUSE


I guess this is Julio's commit 97fd1ea8c10658?

thanks
-- PMM



Re: [Qemu-devel] [PATCH] nbd: Initialize reply on failure

2019-07-19 Thread Eric Blake
On 7/19/19 10:53 AM, Philippe Mathieu-Daudé wrote:

>>  if (ret < 0) {
>> +memset(reply, 0, sizeof *reply);
> 
> I never had problem with sizeof without parenthesis, but here I find it
> not easy to review.

Holdover from my work on GNU coding style projects: the rationale is
that you can tell 'sizeof(type)' apart from 'sizeof expr' if you always
omit the () in the latter case, which further makes it possible to tell
at a glance when you are using the expr form (preferred, because the
type of the expression can change and the sizeof is still correct) or a
type name (harder to audit, since changing a variable's type means you
also have to change any associated sizeof in later code using that
variable).  But I will readily admit qemu is not GNU style:

$ git grep 'sizeof ' | wc
3942500   29756
$ git grep 'sizeof(' | wc
   8899   46172  671537

so I've fixed it to use ().

> 
> Anyhow, this definitively looks like 4.1 material.
> 
> Preferently with sizeof(), but I don't mind, so:
> Reviewed-by: Philippe Mathieu-Daudé 

Thanks.  Queued on my NBD tree for -rc2.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH v27 7/8] target/avr: Register AVR support with the rest of QEMU, the build system, and the MAINTAINERS file

2019-07-19 Thread Michael Rolnik
You're right.
I did not know what next release was going to be. So I did not change it

Sent from my cell phone, please ignore typos

On Fri, Jul 19, 2019, 6:43 PM Eric Blake  wrote:

> On 7/19/19 3:26 AM, Michael Rolnik wrote:
> > Signed-off-by: Michael Rolnik 
> > ---
>
> > +++ b/qapi/common.json
> > @@ -183,11 +183,12 @@
> >  #is true even for "qemu-system-x86_64".
> >  #
> >  # ppcemb: dropped in 3.1
> > +# avr: added in 4.1
>
> Are you trying to get this into 4.1?  rc2 is awfully late to be
> introducing a new target. I suspect this should be 4.2.
>
> --
> Eric Blake, Principal Software Engineer
> Red Hat, Inc.   +1-919-301-3226
> Virtualization:  qemu.org | libvirt.org
>
>


[Qemu-devel] [PATCH v1 2/3] virtio-balloon: Better names for offset variables in inflate/deflate code

2019-07-19 Thread David Hildenbrand
"host_page_base" is really confusing, let's make this clearer, also
rename the other offsets to indicate to which base they apply.

offset -> mr_offset
ram_offset -> rb_offset
host_page_base -> rb_aligned_offset

While at it, use QEMU_ALIGN_DOWN() instead of a handcrafted computation
and move the computation to the place where it is needed.

Signed-off-by: David Hildenbrand 
---
 hw/virtio/virtio-balloon.c | 26 +-
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 287d5d4c97..29cee344b2 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -47,24 +47,23 @@ static void virtio_balloon_reset_pbp(VirtIOBalloon *balloon)
 }
 
 static void balloon_inflate_page(VirtIOBalloon *balloon,
- MemoryRegion *mr, hwaddr offset)
+ MemoryRegion *mr, hwaddr mr_offset)
 {
-void *addr = memory_region_get_ram_ptr(mr) + offset;
+void *addr = memory_region_get_ram_ptr(mr) + mr_offset;
+ram_addr_t rb_offset, rb_aligned_offset;
 RAMBlock *rb;
 size_t rb_page_size;
 int subpages;
-ram_addr_t ram_offset, host_page_base;
 
 /* XXX is there a better way to get to the RAMBlock than via a
  * host address? */
-rb = qemu_ram_block_from_host(addr, false, _offset);
+rb = qemu_ram_block_from_host(addr, false, _offset);
 rb_page_size = qemu_ram_pagesize(rb);
-host_page_base = ram_offset & ~(rb_page_size - 1);
 
 if (rb_page_size == BALLOON_PAGE_SIZE) {
 /* Easy case */
 
-ram_block_discard_range(rb, ram_offset, rb_page_size);
+ram_block_discard_range(rb, rb_offset, rb_page_size);
 /* We ignore errors from ram_block_discard_range(), because it
  * has already reported them, and failing to discard a balloon
  * page is not fatal */
@@ -80,11 +79,12 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
 warn_report_once(
 "Balloon used with backing page size > 4kiB, this may not be reliable");
 
+rb_aligned_offset = QEMU_ALIGN_DOWN(rb_offset, rb_page_size);
 subpages = rb_page_size / BALLOON_PAGE_SIZE;
 
 if (balloon->pbp
 && (rb != balloon->pbp->rb
-|| host_page_base != balloon->pbp->base)) {
+|| rb_aligned_offset != balloon->pbp->base)) {
 /* We've partially ballooned part of a host page, but now
  * we're trying to balloon part of a different one.  Too hard,
  * give up on the old partial page */
@@ -96,10 +96,10 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
 size_t bitlen = BITS_TO_LONGS(subpages) * sizeof(unsigned long);
 balloon->pbp = g_malloc0(sizeof(PartiallyBalloonedPage) + bitlen);
 balloon->pbp->rb = rb;
-balloon->pbp->base = host_page_base;
+balloon->pbp->base = rb_aligned_offset;
 }
 
-set_bit((ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
+set_bit((rb_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
 balloon->pbp->bitmap);
 
 if (bitmap_full(balloon->pbp->bitmap, subpages)) {
@@ -116,18 +116,18 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
 }
 
 static void balloon_deflate_page(VirtIOBalloon *balloon,
- MemoryRegion *mr, hwaddr offset)
+ MemoryRegion *mr, hwaddr mr_offset)
 {
-void *addr = memory_region_get_ram_ptr(mr) + offset;
+void *addr = memory_region_get_ram_ptr(mr) + mr_offset;
+ram_addr_t rb_offset;
 RAMBlock *rb;
 size_t rb_page_size;
-ram_addr_t ram_offset;
 void *host_addr;
 int ret;
 
 /* XXX is there a better way to get to the RAMBlock than via a
  * host address? */
-rb = qemu_ram_block_from_host(addr, false, _offset);
+rb = qemu_ram_block_from_host(addr, false, _offset);
 rb_page_size = qemu_ram_pagesize(rb);
 
 if (balloon->pbp) {
-- 
2.21.0




[Qemu-devel] [PATCH v1 3/3] virtio-balloon: Rework pbp tracking data

2019-07-19 Thread David Hildenbrand
Using the address of a RAMBlock to test for a matching pbp is not really
safe. Instead, let's use the guest physical address of the base page
along with the page size (via the number of subpages).

While at it, move "struct PartiallyBalloonedPage" to virtio-balloon.h
now (previously most probably avoided to te the ramblock), renaming it.

Also, let's only dynamically allocating the bitmap. This makes the code
easier to read and maintain - we can reuse bitmap_new().

Signed-off-by: David Hildenbrand 
---
 hw/virtio/virtio-balloon.c | 52 +-
 include/hw/virtio/virtio-balloon.h | 15 +++--
 2 files changed, 42 insertions(+), 25 deletions(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 29cee344b2..8e5f9459c2 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -34,23 +34,31 @@
 
 #define BALLOON_PAGE_SIZE  (1 << VIRTIO_BALLOON_PFN_SHIFT)
 
-struct PartiallyBalloonedPage {
-RAMBlock *rb;
-ram_addr_t base;
-unsigned long bitmap[];
-};
-
 static void virtio_balloon_reset_pbp(VirtIOBalloon *balloon)
 {
-g_free(balloon->pbp);
-balloon->pbp = NULL;
+balloon->pbp.base_gpa = 0;
+balloon->pbp.subpages = 0;
+g_free(balloon->pbp.bitmap);
+balloon->pbp.bitmap = NULL;
+}
+
+static bool virtio_balloon_pbp_valid(VirtIOBalloon *balloon)
+{
+return balloon->pbp.bitmap;
+}
+
+static bool virtio_balloon_pbp_matches(VirtIOBalloon *balloon,
+   ram_addr_t base_gpa, long subpages)
+{
+return balloon->pbp.subpages == subpages &&
+   balloon->pbp.base_gpa == base_gpa;
 }
 
 static void balloon_inflate_page(VirtIOBalloon *balloon,
  MemoryRegion *mr, hwaddr mr_offset)
 {
 void *addr = memory_region_get_ram_ptr(mr) + mr_offset;
-ram_addr_t rb_offset, rb_aligned_offset;
+ram_addr_t rb_offset, rb_aligned_offset, base_gpa;
 RAMBlock *rb;
 size_t rb_page_size;
 int subpages;
@@ -81,32 +89,32 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
 
 rb_aligned_offset = QEMU_ALIGN_DOWN(rb_offset, rb_page_size);
 subpages = rb_page_size / BALLOON_PAGE_SIZE;
+base_gpa = memory_region_get_ram_addr(mr) + mr_offset -
+   (rb_offset - rb_aligned_offset);
 
-if (balloon->pbp
-&& (rb != balloon->pbp->rb
-|| rb_aligned_offset != balloon->pbp->base)) {
+if (virtio_balloon_pbp_valid(balloon) &&
+!virtio_balloon_pbp_matches(balloon, base_gpa, subpages)) {
 /* We've partially ballooned part of a host page, but now
  * we're trying to balloon part of a different one.  Too hard,
  * give up on the old partial page */
 virtio_balloon_reset_pbp(balloon);
 }
 
-if (!balloon->pbp) {
+if (!virtio_balloon_pbp_valid(balloon)) {
 /* Starting on a new host page */
-size_t bitlen = BITS_TO_LONGS(subpages) * sizeof(unsigned long);
-balloon->pbp = g_malloc0(sizeof(PartiallyBalloonedPage) + bitlen);
-balloon->pbp->rb = rb;
-balloon->pbp->base = rb_aligned_offset;
+balloon->pbp.base_gpa = base_gpa;
+balloon->pbp.subpages = subpages;
+balloon->pbp.bitmap = bitmap_new(subpages);
 }
 
-set_bit((rb_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
-balloon->pbp->bitmap);
+set_bit((rb_offset - rb_aligned_offset) / BALLOON_PAGE_SIZE,
+balloon->pbp.bitmap);
 
-if (bitmap_full(balloon->pbp->bitmap, subpages)) {
+if (bitmap_full(balloon->pbp.bitmap, subpages)) {
 /* We've accumulated a full host page, we can actually discard
  * it now */
 
-ram_block_discard_range(rb, balloon->pbp->base, rb_page_size);
+ram_block_discard_range(rb, rb_aligned_offset, rb_page_size);
 /* We ignore errors from ram_block_discard_range(), because it
  * has already reported them, and failing to discard a balloon
  * page is not fatal */
@@ -130,7 +138,7 @@ static void balloon_deflate_page(VirtIOBalloon *balloon,
 rb = qemu_ram_block_from_host(addr, false, _offset);
 rb_page_size = qemu_ram_pagesize(rb);
 
-if (balloon->pbp) {
+if (virtio_balloon_pbp_valid(balloon)) {
 /* Let's play safe and always reset the pbp on deflation requests. */
 virtio_balloon_reset_pbp(balloon);
 }
diff --git a/include/hw/virtio/virtio-balloon.h 
b/include/hw/virtio/virtio-balloon.h
index 5a99293a45..0190d78d71 100644
--- a/include/hw/virtio/virtio-balloon.h
+++ b/include/hw/virtio/virtio-balloon.h
@@ -33,8 +33,6 @@ typedef struct virtio_balloon_stat_modern {
uint64_t val;
 } VirtIOBalloonStatModern;
 
-typedef struct PartiallyBalloonedPage PartiallyBalloonedPage;
-
 enum virtio_balloon_free_page_report_status {
 FREE_PAGE_REPORT_S_STOP = 0,
 FREE_PAGE_REPORT_S_REQUESTED = 1,
@@ -42,6 +40,12 @@ enum virtio_balloon_free_page_report_status {
 FREE_PAGE_REPORT_S_DONE = 3,

[Qemu-devel] [PATCH v1 1/3] virtio-balloon: simplify deflate with pbp

2019-07-19 Thread David Hildenbrand
Let's simplify this - the case we are optimizing for is very hard to
trigger and not worth the effort. If we're switching from inflation to
deflation, let's reset the pbp.

Signed-off-by: David Hildenbrand 
---
 hw/virtio/virtio-balloon.c | 24 
 1 file changed, 4 insertions(+), 20 deletions(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 9de3c030bf..287d5d4c97 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -121,7 +121,7 @@ static void balloon_deflate_page(VirtIOBalloon *balloon,
 void *addr = memory_region_get_ram_ptr(mr) + offset;
 RAMBlock *rb;
 size_t rb_page_size;
-ram_addr_t ram_offset, host_page_base;
+ram_addr_t ram_offset;
 void *host_addr;
 int ret;
 
@@ -129,26 +129,10 @@ static void balloon_deflate_page(VirtIOBalloon *balloon,
  * host address? */
 rb = qemu_ram_block_from_host(addr, false, _offset);
 rb_page_size = qemu_ram_pagesize(rb);
-host_page_base = ram_offset & ~(rb_page_size - 1);
-
-if (balloon->pbp
-&& rb == balloon->pbp->rb
-&& host_page_base == balloon->pbp->base) {
-int subpages = rb_page_size / BALLOON_PAGE_SIZE;
 
-/*
- * This means the guest has asked to discard some of the 4kiB
- * subpages of a host page, but then changed its mind and
- * asked to keep them after all.  It's exceedingly unlikely
- * for a guest to do this in practice, but handle it anyway,
- * since getting it wrong could mean discarding memory the
- * guest is still using. */
-clear_bit((ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
-  balloon->pbp->bitmap);
-
-if (bitmap_empty(balloon->pbp->bitmap, subpages)) {
-virtio_balloon_reset_pbp(balloon);
-}
+if (balloon->pbp) {
+/* Let's play safe and always reset the pbp on deflation requests. */
+virtio_balloon_reset_pbp(balloon);
 }
 
 host_addr = (void *)((uintptr_t)addr & ~(rb_page_size - 1));
-- 
2.21.0




  1   2   3   >