Re: [RFC PATCH v2 4/6] hw/acpi/aml-build: Add processor hierarchy node structure

2021-04-27 Thread wangyanan (Y)

Hi Drew,

On 2021/4/27 21:37, Andrew Jones wrote:

On Tue, Apr 13, 2021 at 04:07:43PM +0800, Yanan Wang wrote:

Add a generic API to build Processor Hierarchy Node Structure(Type 0),
which is strictly consistent with descriptions in ACPI 6.3: 5.2.29.1.

This function will be used to build ACPI PPTT table for cpu topology.

Signed-off-by: Ying Fang 
Signed-off-by: Henglong Fan 
Signed-off-by: Yanan Wang 
---
  hw/acpi/aml-build.c | 27 +++
  include/hw/acpi/aml-build.h |  4 
  2 files changed, 31 insertions(+)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index d33ce8954a..75e01aea17 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -1916,6 +1916,33 @@ void build_slit(GArray *table_data, BIOSLinker *linker, 
MachineState *ms,
   table_data->len - slit_start, 1, oem_id, oem_table_id);
  }
  
+/*

+ * ACPI 6.3: 5.2.29.1 Processor Hierarchy Node Structure (Type 0)

  ^ Doesn't this table show up in 6.2 first? We should always
use the oldest specification we can.

Yes, it shows up for the first time in the 6.2 specification.

Also, please don't capitalize Hierarchy, Node, and Structure. Those words
are not capitalized in the spec section name and we want an exact match
here.
Indeed, the exact format in spec is "5.2.29.1 Processor hierarchy node 
structure (Type 0)"

I will fix this.

+ */
+void build_processor_hierarchy_node(GArray *tbl, uint32_t flags,
+uint32_t parent, uint32_t id,
+uint32_t *priv_rsrc, uint32_t priv_num)
+{
+int i;
+
+build_append_byte(tbl, 0); /* Type 0 - processor */
+build_append_byte(tbl, 20 + priv_num * 4); /* Length */
+build_append_int_noprefix(tbl, 0, 2);  /* Reserved */
+build_append_int_noprefix(tbl, flags, 4);  /* Flags */
+build_append_int_noprefix(tbl, parent, 4); /* Parent */
+build_append_int_noprefix(tbl, id, 4); /* ACPI processor ID */

   ^ should be
capitalized like in the spec

Right.
 

+
+/* Number of private resources */
+build_append_int_noprefix(tbl, priv_num, 4);
+
+/* Private resources[N] */
+if (priv_num > 0 && priv_rsrc != NULL) {

Since we should never have priv_num > 0 and priv_rsrc == NULL, then we can
do

if (priv_num > 0) {
assert(priv_rsrc);
...

It seems much better, will fix it.

Thanks,
Yanan

+for (i = 0; i < priv_num; i++) {
+build_append_int_noprefix(tbl, priv_rsrc[i], 4);
+}
+}
+}
+
  /* build rev1/rev3/rev5.1 FADT */
  void build_fadt(GArray *tbl, BIOSLinker *linker, const AcpiFadtData *f,
  const char *oem_id, const char *oem_table_id)
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index 471266d739..ea74b8f6ed 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -462,6 +462,10 @@ void build_srat_memory(AcpiSratMemoryAffinity *numamem, 
uint64_t base,
  void build_slit(GArray *table_data, BIOSLinker *linker, MachineState *ms,
  const char *oem_id, const char *oem_table_id);
  
+void build_processor_hierarchy_node(GArray *tbl, uint32_t flags,

+uint32_t parent, uint32_t id,
+uint32_t *priv_rsrc, uint32_t priv_num);
+
  void build_fadt(GArray *tbl, BIOSLinker *linker, const AcpiFadtData *f,
  const char *oem_id, const char *oem_table_id);
  
--

2.19.1


Thanks,
drew
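
For readers following the layout, the byte packing discussed in this review can be sketched outside QEMU. The Python below is only an illustration, not the patch itself: it packs the same fields in the same order as the C function (Type, Length, Reserved, Flags, Parent, ACPI Processor ID, number of private resources, then the resources) as little-endian values. The `acpi_processor_id` parameter name and the explicit check on the one-byte Length field are assumptions added for the sketch.

```python
import struct


def build_processor_hierarchy_node(flags, parent, acpi_processor_id, priv_rsrc=()):
    """Pack one processor hierarchy node (Type 0) as little-endian bytes."""
    priv_rsrc = list(priv_rsrc)
    # 20 bytes of fixed fields plus 4 bytes per private resource reference.
    length = 20 + 4 * len(priv_rsrc)
    if length > 0xFF:
        # Assumption for this sketch: refuse nodes that overflow the
        # one-byte Length field instead of silently truncating.
        raise ValueError("node does not fit the one-byte Length field")
    node = struct.pack(
        "<BBHIIII",
        0,                  # Type 0 - processor
        length,             # Length
        0,                  # Reserved
        flags,              # Flags
        parent,             # Parent
        acpi_processor_id,  # ACPI Processor ID
        len(priv_rsrc),     # Number of private resources
    )
    for ref in priv_rsrc:
        node += struct.pack("<I", ref)  # Private resources[N]
    return node
```

Because the count is taken from the list itself, the "priv_num > 0 with a NULL priv_rsrc" case that the suggested assert() guards against in C cannot arise in this form.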





Re: [RFC PATCH v2 3/6] hw/arm/virt-acpi-build: Distinguish possible and present cpus

2021-04-27 Thread wangyanan (Y)

Hi Drew,

On 2021/4/27 22:50, Andrew Jones wrote:

On Tue, Apr 13, 2021 at 04:07:42PM +0800, Yanan Wang wrote:

From: Ying Fang 

When building ACPI tables regarding CPUs we should always build
them for the number of possible CPUs, not the number of present
CPUs. We then ensure only the present CPUs are enabled in MADT.
Furthermore, it is also needed if we are going to support CPU
hotplug in the future.

This patch is a rework based on Andrew Jones's contribution at
https://lists.gnu.org/archive/html/qemu-arm/2018-07/msg00076.html

Thank you for this credit, but I think I'd prefer a Co-developed-by
tag instead, if you don't mind.

No problem. I will add your Co-developed-by tag and you deserve that.
There may still be something inappropriate about credit in this series,
but I will try to make things all right.

Thanks,
Yanan

Thanks,
drew





Re: [RFC PATCH v2 3/6] hw/arm/virt-acpi-build: Distinguish possible and present cpus

2021-04-27 Thread wangyanan (Y)

On 2021/4/27 21:18, Andrew Jones wrote:

On Tue, Apr 13, 2021 at 04:07:42PM +0800, Yanan Wang wrote:

From: Ying Fang 

When building ACPI tables regarding CPUs we should always build
them for the number of possible CPUs, not the number of present
CPUs. We then ensure only the present CPUs are enabled in MADT.
Furthermore, it is also needed if we are going to support CPU
hotplug in the future.

This patch is a rework based on Andrew Jones's contribution at
https://lists.gnu.org/archive/html/qemu-arm/2018-07/msg00076.html

Signed-off-by: Ying Fang 
Signed-off-by: Yanan Wang 
---
  hw/arm/virt-acpi-build.c | 14 ++
  hw/arm/virt.c|  3 +++
  2 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index f5a2b2d4cb..2ad5dad1bf 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -61,13 +61,16 @@
  
  static void acpi_dsdt_add_cpus(Aml *scope, VirtMachineState *vms)

  {
-MachineState *ms = MACHINE(vms);
+CPUArchIdList *possible_cpus = MACHINE(vms)->possible_cpus;
  uint16_t i;
  
-for (i = 0; i < ms->smp.cpus; i++) {

+for (i = 0; i < possible_cpus->len; i++) {
  Aml *dev = aml_device("C%.03X", i);
  aml_append(dev, aml_name_decl("_HID", aml_string("ACPI0007")));
  aml_append(dev, aml_name_decl("_UID", aml_int(i)));
+if (possible_cpus->cpus[i].cpu == NULL) {
+aml_append(dev, aml_name_decl("_STA", aml_int(0)));
+}
  aml_append(scope, dev);
  }
  }
@@ -479,6 +482,7 @@ build_madt(GArray *table_data, BIOSLinker *linker, 
VirtMachineState *vms)
  const int *irqmap = vms->irqmap;
  AcpiMadtGenericDistributor *gicd;
  AcpiMadtGenericMsiFrame *gic_msi;
+CPUArchIdList *possible_cpus = MACHINE(vms)->possible_cpus;
  int i;
  
  acpi_data_push(table_data, sizeof(AcpiMultipleApicTable));

@@ -489,7 +493,7 @@ build_madt(GArray *table_data, BIOSLinker *linker, 
VirtMachineState *vms)
  gicd->base_address = cpu_to_le64(memmap[VIRT_GIC_DIST].base);
  gicd->version = vms->gic_version;
  
-for (i = 0; i < MACHINE(vms)->smp.cpus; i++) {

+for (i = 0; i < possible_cpus->len; i++) {
  AcpiMadtGenericCpuInterface *gicc = acpi_data_push(table_data,
 sizeof(*gicc));
  ARMCPU *armcpu = ARM_CPU(qemu_get_cpu(i));
@@ -504,7 +508,9 @@ build_madt(GArray *table_data, BIOSLinker *linker, 
VirtMachineState *vms)
  gicc->cpu_interface_number = cpu_to_le32(i);
  gicc->arm_mpidr = cpu_to_le64(armcpu->mp_affinity);
  gicc->uid = cpu_to_le32(i);
-gicc->flags = cpu_to_le32(ACPI_MADT_GICC_ENABLED);
+if (possible_cpus->cpus[i].cpu != NULL) {
+gicc->flags = cpu_to_le32(ACPI_MADT_GICC_ENABLED);
+}
  
  if (arm_feature(&armcpu->env, ARM_FEATURE_PMU)) {

  gicc->performance_interrupt = cpu_to_le32(PPI(VIRTUAL_PMU_IRQ));
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index f4ae60ded9..3e5d9b6f26 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2063,6 +2063,9 @@ static void machvirt_init(MachineState *machine)
  }
  
  qdev_realize(DEVICE(cpuobj), NULL, &error_fatal);

+
+/* Initialize cpu member here since cpu hotplug is not supported yet */
+machine->possible_cpus->cpus[n].cpu = cpuobj;

Can drop the 'machine->' as possible_cpus is already set to the pointer.

Right, I will drop it.

Thanks,
Yanan

  object_unref(cpuobj);
  }
  fdt_add_timer_nodes(vms);
--
2.19.1


Thanks,
drew
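
The rule stated in the commit message (describe every possible CPU, enable only the present ones) can be sketched as follows. This is a simplified Python model, not QEMU code: a plain list stands in for possible_cpus->cpus[], `None` marks a slot with no CPU object, and the value 1 for the enabled flag reflects bit 0 of the GICC flags field. The function name `gicc_entries` is hypothetical.

```python
ACPI_MADT_GICC_ENABLED = 1  # bit 0 of the GICC flags field


def gicc_entries(possible_cpus):
    """Return one (uid, flags) pair per possible CPU slot.

    Each item of possible_cpus is a CPU object for a present CPU, or
    None for a slot that is possible but not populated.
    """
    entries = []
    for uid, cpu in enumerate(possible_cpus):
        # Every slot gets an entry; only present CPUs are marked enabled.
        flags = ACPI_MADT_GICC_ENABLED if cpu is not None else 0
        entries.append((uid, flags))
    return entries
```

With two present CPUs out of four possible ones, the table still has four entries, and only the first two carry the enabled flag.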






Re: [RFC PATCH v2 2/6] hw/arm/virt: DT: Add cpu-map

2021-04-27 Thread wangyanan (Y)



On 2021/4/27 20:36, Philippe Mathieu-Daudé wrote:

On 4/27/21 12:04 PM, Andrew Jones wrote:

On Tue, Apr 27, 2021 at 11:47:17AM +0200, Philippe Mathieu-Daudé wrote:

Hi Yanan, Drew,

On 4/13/21 10:07 AM, Yanan Wang wrote:

From: Andrew Jones 

Support device tree CPU topology descriptions.

Signed-off-by: Andrew Jones 
Signed-off-by: Yanan Wang 
---
  hw/arm/virt.c | 41 -
  include/hw/arm/virt.h |  1 +
  2 files changed, 41 insertions(+), 1 deletion(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 9f01d9041b..f4ae60ded9 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -352,10 +352,11 @@ static void fdt_add_cpu_nodes(const VirtMachineState *vms)
  int cpu;
  int addr_cells = 1;
  const MachineState *ms = MACHINE(vms);
+const VirtMachineClass *vmc = VIRT_MACHINE_GET_CLASS(vms);
  int smp_cpus = ms->smp.cpus;
  
  /*

- * From Documentation/devicetree/bindings/arm/cpus.txt
+ *  See Linux Documentation/devicetree/bindings/arm/cpus.yaml
   *  On ARM v8 64-bit systems value should be set to 2,
   *  that corresponds to the MPIDR_EL1 register size.
   *  If MPIDR_EL1[63:32] value is equal to 0 on all CPUs
@@ -408,8 +409,45 @@ static void fdt_add_cpu_nodes(const VirtMachineState *vms)
  ms->possible_cpus->cpus[cs->cpu_index].props.node_id);
  }
  
+if (ms->smp.cpus > 1 && !vmc->no_cpu_topology) {

+qemu_fdt_setprop_cell(ms->fdt, nodename, "phandle",
+  qemu_fdt_alloc_phandle(ms->fdt));
+}
+
  g_free(nodename);
  }
+
+if (ms->smp.cpus > 1 && !vmc->no_cpu_topology) {
+/*
+ * See Linux Documentation/devicetree/bindings/cpu/cpu-topology.txt
+ * In a SMP system, the hierarchy of CPUs is defined through four
+ * entities that are used to describe the layout of physical CPUs
+ * in the system: socket/cluster/core/thread.
+ */
+qemu_fdt_add_subnode(ms->fdt, "/cpus/cpu-map");
+
+for (cpu = ms->smp.cpus - 1; cpu >= 0; cpu--) {
+char *cpu_path = g_strdup_printf("/cpus/cpu@%d", cpu);
+char *map_path;
+
+if (ms->smp.threads > 1) {
+map_path = g_strdup_printf(
+"/cpus/cpu-map/%s%d/%s%d/%s%d",
+"socket", cpu / (ms->smp.cores * ms->smp.threads),
+"core", (cpu / ms->smp.threads) % ms->smp.cores,
+"thread", cpu % ms->smp.threads);
+} else {
+map_path = g_strdup_printf(
+"/cpus/cpu-map/%s%d/%s%d",
+"socket", cpu / ms->smp.cores,
+"core", cpu % ms->smp.cores);
+}
+qemu_fdt_add_path(ms->fdt, map_path);
+qemu_fdt_setprop_phandle(ms->fdt, map_path, "cpu", cpu_path);
+g_free(map_path);
+g_free(cpu_path);
+}
+}
  }
  
  static void fdt_add_its_gic_node(VirtMachineState *vms)

@@ -2769,6 +2807,7 @@ static void virt_machine_5_2_options(MachineClass *mc)
  virt_machine_6_0_options(mc);
  compat_props_add(mc->compat_props, hw_compat_5_2, hw_compat_5_2_len);
  vmc->no_secure_gpio = true;
+vmc->no_cpu_topology = true;

Bear with me because "machine versioning" is something new to me, I was
expecting it to be only related to migrated fields.
Why do we need to care about not adding the FDT node in older machines?
Shouldn't the guest skip unknown FDT nodes?

It probably should, the question is whether it would. Also, the nodes may
not be unknown, so the guest will read the information and set up its
topology as instructed. That topology may not be the same as what was
getting used by default without the topology description. It's possible
that a user's application has a dependency on the topology and if that
topology gets changed under its feet, it'll behave differently.

[*]

I see.


In short, machine versioning isn't just about vmstate, it's also about
keeping a machine type looking the same to the guest.

Yes, TIL.


Now, it's possible that we're being overly cautious here, but this compat
variable doesn't complicate code too much. So I think I'd prefer to use it
than not.

No problem. Could you or Yanan add your first paragraph ([*], reworded)
in the commit description? I don't think a comment in the code is
useful, but having it in the commit is helpful IMO.

Hi Philippe,

Of course. I think I can do this for the commit description.

Thanks,
Yanan

Thanks,

Phil.
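
The cpu-map path arithmetic in the patch can be checked in isolation. The Python sketch below mirrors the two g_strdup_printf() branches; `cpu_map_path` is a hypothetical helper name, and `cores` and `threads` stand for ms->smp.cores and ms->smp.threads.

```python
def cpu_map_path(cpu, cores, threads):
    """Return the cpu-map node path for a linear CPU index."""
    if threads > 1:
        socket = cpu // (cores * threads)
        core = (cpu // threads) % cores
        thread = cpu % threads
        return f"/cpus/cpu-map/socket{socket}/core{core}/thread{thread}"
    # Without SMT the thread level is omitted from the path.
    return f"/cpus/cpu-map/socket{cpu // cores}/core{cpu % cores}"
```

For example, with 2 cores per socket and 2 threads per core, CPU 5 lands at socket1/core0/thread1. A guest that reads these nodes may set up a different topology than it did without them, which is the reason given in the thread for gating the feature on the machine version.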





Re: [PATCH RFC 0/1] To add HMP interface to dump PCI MSI-X table/PBA

2021-04-27 Thread Dongli Zhang



On 4/27/21 10:55 PM, Jason Wang wrote:
> 
> On 2021/4/28 1:10 PM, Dongli Zhang wrote:
>> Hi Jason,
>>
>> On 4/27/21 7:31 PM, Jason Wang wrote:
>>> On 2021/4/27 4:53 PM, Dr. David Alan Gilbert wrote:
>>>> * Dongli Zhang (dongli.zh...@oracle.com) wrote:
>>>>> On 4/22/21 11:01 PM, Jason Wang wrote:
>>>>>> On 2021/4/23 12:47 PM, Dongli Zhang wrote:
>>>>>>> This is inspired by the discussion with Jason on below patchset.
>>>>>>>
>>>>>>> https://urldefense.com/v3/__https://lists.gnu.org/archive/html/qemu-devel/2021-03/msg09020.html__;!!GqivPVa7Brio!KbGQZW5lq3JZ60k12NuWZ6Th1lT6AwmBTF0pBgoWUKKQ4-2UhdW57PtvXUN5XQnZ2NU$
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> The new HMP command is introduced to dump the MSI-X table and PBA.
>>>>>>>
>>>>>>> Initially, I was going to add new option to "info pci". However, as the
>>>>>>> number of entries is not determined and the output of MSI-X table is much
>>>>>>> more similar to the output of hmp_info_tlb()/hmp_info_mem(), this patch
>>>>>>> adds interface for only HMP.
>>>>>>>
>>>>>>> The patch is tagged with RFC because I am looking for suggestions on:
>>>>>>>
>>>>>>> 1. Is it fine to add new "info msix " command?
>>>>>> I wonder the reason for not simply reusing "info pci"?
>>>>> The "info pci" will show PCI data for all devices and it does not accept any
>>>>> argument to print for a specific device.
>>>>>
>>>>> In addition, the "info pci" relies on qmp_query_pci(), where this patch will not
>>>>> implement the interface for QMP considering the number of MSI-X entries is not
>>>>> determined.
>>>>>
>>>>> Suppose we have 10 NVMe (emulated by QEMU with default number of queues), we
>>>>> will have about 600+ lines of output.
>>>> From an HMP perspective I'm happy, so:
>>>>
>>>> Acked-by: Dr. David Alan Gilbert 
>>>>
>>>> but since I don't know much about MSI I'd like to see Jason's reply.
>>>
>>> I think we'd better have more information, e.g the device can optionally 
>>> report
>>> how the MSI-X vector is used.
>>>
>>> Virtio-pci could be the first user for this.
>> As discussed in another thread, you were talking about printing MSIMessage.
>>
>> However, I prefer to print the raw data as I think the user of this interface
>> should be able to understand it as MSI-X messages.
>>
>> For instance, below is the data printed by "info msix".
> 
> 
> Just to clarify, I meant e.g for virtio-pci device, we can let it to print the
> mapping between vq and msix vectors:
> 
> vq[0].msix_vector = 0
> vq[1].msix_vector = 1
> config.msix_vector = 2
> ...
> 
> But this could be added on top if you wish.
> 
> Does this make sense?

Yes, this makes sense. For QEMU they are:

- vdev->vq[n].vector
- vdev->config_vector


I will introduce a callback and implement for virtio-pci to dump the vector 
mapping.

By default, "info msix " prints only msix table/PBA.

"info msix -d " will print device specific data.

Thank you very much!

Dongli Zhang

> 
> Thanks
> 
> 
>>
>> 0xfee01004 0x 0x0022 0x
>> 0xfee02004 0x 0x0023 0x
>> 0xfee01004 0x 0x0023 0x
>> 0xfee01004 0x 0x0021 0x
>> 0xfee02004 0x 0x0022 0x
>> 0x 0x 0x 0x0001
>> 0x 0x 0x 0x0001
>>
>> The 1st column is Message Lower Address.
>>
>> The 2nd column is Message Upper Address.
>>
>> The 3rd column is Message Data.
>>
>> The 4th column is Vector Control.
>>
>> In my opinion, this is equivalent to MSIMessage.
>>
>> 26 struct MSIMessage {
>> 27 uint64_t address; --> column 1 and 2
>> 28 uint32_t data;    --> column 3
>> 29 };
>>
>>
>> We use a similar way to read from Linux OS, e.g., given the address of MSI-X
>> cap, here is how we read from OS side.
>>
>> # busybox devmem 0xc1001000 32
>> 0xFEE0
>> # busybox devmem 0xc1001004 32
>> 0x
>> # busybox devmem 0xc1001008 32
>> 0x4049
>> # busybox devmem 0xc100100c 32
>> 0x
>>
>> Thank you very much!
>>
>> Dongli Zhang
>>
>>>
>>>> Adding an optional option to 'info pci' to limit to one device would be easy
>>>> though; that bit is probably easier than adding a new command.
>>>
>>> One interesting point is that MSI could be extended for other bus, (e.g 
>>> MMIO).
>>> So "info msi" should be better I guess.
>>>
>>>
>>>> Figuring out the QMP representation of your entries might be harder -
>>>> and if this is strictly for debug, probably not worth it?
>>>
>>> I think so.
>>>
>>> Thanks
>>>
>>>
>>>> Dave
>>>>
>>>>
>>>>> Dongli Zhang
>>>>>
>>>>>>> 2. Is there any issue with output format?
>>>>>> If it's not for QMP, I guess it's not a part of ABI so it should be fine.
>>>>>>
>>>>>>
>>>>>>> 3. Is it fine to add only for HMP, but not QMP?
>>>>>> I think so.
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>>
>>>>>>> Thank you very much!
>>>>>>>
>>>>>>> Dongli Zhang
>>>
>>>
>>>
> 
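
The four-column dump described in this thread corresponds to one 16-byte MSI-X table entry per row. As a rough illustration, the Python sketch below decodes a single raw entry into an address/data pair plus the mask bit. The function name `decode_msix_entry` and the sample values used with it are made up for the example and are not taken from the proposed HMP command.

```python
import struct

MSIX_ENTRY_SIZE = 16      # four little-endian 32-bit words per entry
MSIX_VECTOR_MASKED = 0x1  # bit 0 of Vector Control is the per-vector mask bit


def decode_msix_entry(raw):
    """Decode one raw MSI-X table entry into address, data and mask state."""
    if len(raw) != MSIX_ENTRY_SIZE:
        raise ValueError("an MSI-X table entry is 16 bytes")
    lower, upper, data, ctrl = struct.unpack("<IIII", raw)
    return {
        "address": (upper << 32) | lower,  # columns 1 and 2 combined
        "data": data,                      # column 3
        "masked": bool(ctrl & MSIX_VECTOR_MASKED),  # from column 4
    }
```

The address and data fields together carry the same information as MSIMessage, which is the equivalence argued for above; the mask bit is extra state that only the raw table shows.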



[Bug 1295587] Re: Temporal freeze and slowdown while using emulated sb16

2021-04-27 Thread Thomas Huth
The QEMU project is currently considering moving its bug tracking to
another system. For this we need to know which bugs are still valid
and which could be closed already. Thus we are setting older bugs to
"Incomplete" now.

If you still think this bug report here is valid, then please switch
the state back to "New" within the next 60 days, otherwise this report
will be marked as "Expired". Or please mark it as "Fix Released" if
the problem has been solved with a newer version of QEMU already.

Thank you and sorry for the inconvenience.


** Changed in: qemu
   Status: New => Incomplete

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1295587

Title:
  Temporal freeze and slowdown while using emulated sb16

Status in QEMU:
  Incomplete

Bug description:
  I have been carrying around this bug since previous versions and on
  different machines: When I use the -soundhw sb16 option, playing
  any sound on the virtual machine temporarily freezes the emulated
  machine and loops the last bit of that sound effect for 1-2 minutes,
  then goes back to normal speed (until a new sound is played).

  Console shows:

   sb16: warning: command 0xf9,1 is not truly understood yet
   sb16: warning: command 0xf9,1 is not truly understood yet
  (...)
  main-loop: WARNING: I/O thread spun for 1000 iterations

  -One of my emulated machines is Windows 3.11: I managed to work around
  this bug by switching from the local 1.5 version of the sound blaster
  driver to the 1.0, although since I updated qemu it freezes that
  machine, so I can't test if it still works.

  I am using the 1.7.90 version, but I suffered this bug for over one
  year (confirmed in version 2.0.0-rc0 too)

  this bug happens anytime I use the -soundhw sb16 switch, but the full
  command I am using in this specific case is:

  qemu-system-i386 -localtime -cpu pentium -m 32 -display sdl -vga
  cirrus -hda c.img -cdrom win95stuff.iso -net nic,model=ne2k_pci -net
  user -soundhw sb16

  This bug appears on all my machines: Pentium III running Slackware
  13.0 and freeBSD 10; Dual core T2400, both in Arch, Gentoo and
  Slackware 14.1 (all 32 bits), and a Dual core T4400 64 bits with
  Gentoo and Slackware. Same problem in all of those systems after
  compiling instead of using the distro packages

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1295587/+subscriptions



[Bug 601946] Re: [Feature request] qemu-img multi-threaded compressed image conversion

2021-04-27 Thread Коренберг Марк
** Changed in: qemu
   Status: Incomplete => New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/601946

Title:
  [Feature request] qemu-img multi-threaded compressed image conversion

Status in QEMU:
  New

Bug description:
  Feature request:
  qemu-img multi-threaded compressed image conversion

  Suppose I want to convert raw image to compressed qcow2. Multi-
  threaded conversion will be much faster, because bottleneck is
  compressing data.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/601946/+subscriptions
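
The idea behind the request can be illustrated with a small sketch, assuming the input can be split into chunks that are compressed independently of each other. The Python below is not qemu-img code: `compress_clusters`, the 64 KiB chunk size and the use of zlib are all assumptions chosen for the example.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

CLUSTER_SIZE = 64 * 1024  # illustrative chunk size, an assumption of this sketch


def compress_clusters(data, workers=4, cluster_size=CLUSTER_SIZE):
    """Compress fixed-size chunks of data on a pool of worker threads."""
    clusters = [data[i:i + cluster_size]
                for i in range(0, len(data), cluster_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order, so chunks stay in sequence.
        return list(pool.map(zlib.compress, clusters))
```

Whether this is actually faster depends on how much of the total time is spent compressing rather than reading and writing, which is the bottleneck claim made in the report.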



Re: [PATCH v3 15/33] nbd/client-connection: use QEMU_LOCK_GUARD

2021-04-27 Thread Roman Kagan
On Fri, Apr 16, 2021 at 11:08:53AM +0300, Vladimir Sementsov-Ogievskiy wrote:
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  nbd/client-connection.c | 94 ++---
>  1 file changed, 42 insertions(+), 52 deletions(-)
> 
> diff --git a/nbd/client-connection.c b/nbd/client-connection.c
> index 4e39a5b1af..b45a0bd5f6 100644
> --- a/nbd/client-connection.c
> +++ b/nbd/client-connection.c
> @@ -87,17 +87,16 @@ static void *connect_thread_func(void *opaque)
>  conn->sioc = NULL;
>  }
>  
> -qemu_mutex_lock(&conn->mutex);
> -
> -assert(conn->running);
> -conn->running = false;
> -if (conn->wait_co) {
> -aio_co_schedule(NULL, conn->wait_co);
> -conn->wait_co = NULL;
> +WITH_QEMU_LOCK_GUARD(&conn->mutex) {
> +assert(conn->running);
> +conn->running = false;
> +if (conn->wait_co) {
> +aio_co_schedule(NULL, conn->wait_co);
> +conn->wait_co = NULL;
> +}
>  }
>  do_free = conn->detached;

->detached is now accessed outside the mutex
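
One way to read this remark: the read of the flag should stay inside the guarded block. The Python sketch below shows the shape of that fix with a scoped lock; `threading.Lock` stands in for the QEMU mutex, and `Connection` and `finish_connect` are hypothetical names, not the NBD client code.

```python
import threading


class Connection:
    def __init__(self):
        self.mutex = threading.Lock()
        self.running = True
        self.detached = False


def finish_connect(conn):
    """Mark the attempt finished; return True if the caller must free conn."""
    with conn.mutex:
        assert conn.running
        conn.running = False
        # Sample the flag while the lock is still held, so a concurrent
        # release cannot change it between the update above and this read.
        do_free = conn.detached
    # The lock is released here; only the local copy is used afterwards.
    return do_free
```

In the C patch the equivalent change would presumably be to move the do_free assignment inside the WITH_QEMU_LOCK_GUARD block.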

>  
> -qemu_mutex_unlock(&conn->mutex);
>  
>  if (do_free) {
>  nbd_client_connection_do_free(conn);
> @@ -136,61 +135,54 @@ void nbd_client_connection_release(NBDClientConnection 
> *conn)
>  QIOChannelSocket *coroutine_fn
>  nbd_co_establish_connection(NBDClientConnection *conn, Error **errp)
>  {
> -QIOChannelSocket *sioc = NULL;
>  QemuThread thread;
>  
> -qemu_mutex_lock(&conn->mutex);
> -
> -/*
> - * Don't call nbd_co_establish_connection() in several coroutines in
> - * parallel. Only one call at once is supported.
> - */
> -assert(!conn->wait_co);
> -
> -if (!conn->running) {
> -if (conn->sioc) {
> -/* Previous attempt finally succeeded in background */
> -sioc = g_steal_pointer(&conn->sioc);
> -qemu_mutex_unlock(&conn->mutex);
> -
> -return sioc;
> +WITH_QEMU_LOCK_GUARD(&conn->mutex) {
> +/*
> + * Don't call nbd_co_establish_connection() in several coroutines in
> + * parallel. Only one call at once is supported.
> + */
> +assert(!conn->wait_co);
> +
> +if (!conn->running) {
> +if (conn->sioc) {
> +/* Previous attempt finally succeeded in background */
> +return g_steal_pointer(&conn->sioc);
> +}
> +
> +conn->running = true;
> +error_free(conn->err);
> +conn->err = NULL;
> +qemu_thread_create(&thread, "nbd-connect",
> +   connect_thread_func, conn, 
> QEMU_THREAD_DETACHED);
>  }
>  
> -conn->running = true;
> -error_free(conn->err);
> -conn->err = NULL;
> -qemu_thread_create(&thread, "nbd-connect",
> -   connect_thread_func, conn, QEMU_THREAD_DETACHED);
> +conn->wait_co = qemu_coroutine_self();
>  }
>  
> -conn->wait_co = qemu_coroutine_self();
> -
> -qemu_mutex_unlock(&conn->mutex);
> -
>  /*
>   * We are going to wait for connect-thread finish, but
>   * nbd_co_establish_connection_cancel() can interrupt.
>   */
>  qemu_coroutine_yield();
>  
> -qemu_mutex_lock(&conn->mutex);
> -
> -if (conn->running) {
> -/*
> - * Obviously, drained section wants to start. Report the attempt as
> - * failed. Still connect thread is executing in background, and its
> - * result may be used for next connection attempt.
> - */
> -error_setg(errp, "Connection attempt cancelled by other operation");
> -} else {
> -error_propagate(errp, conn->err);
> -conn->err = NULL;
> -sioc = g_steal_pointer(&conn->sioc);
> +WITH_QEMU_LOCK_GUARD(&conn->mutex) {
> +if (conn->running) {
> +/*
> + * Obviously, drained section wants to start. Report the attempt 
> as
> + * failed. Still connect thread is executing in background, and 
> its
> + * result may be used for next connection attempt.
> + */
> +error_setg(errp, "Connection attempt cancelled by other 
> operation");
> +return NULL;
> +} else {
> +error_propagate(errp, conn->err);
> +conn->err = NULL;
> +return g_steal_pointer(&conn->sioc);
> +}
>  }
>  
> -qemu_mutex_unlock(&conn->mutex);
> -
> -return sioc;
> +abort(); /* unreachable */
>  }
>  
>  /*
> @@ -201,12 +193,10 @@ nbd_co_establish_connection(NBDClientConnection *conn, 
> Error **errp)
>   */
>  void coroutine_fn nbd_co_establish_connection_cancel(NBDClientConnection 
> *conn)
>  {
> -qemu_mutex_lock(&conn->mutex);
> +QEMU_LOCK_GUARD(&conn->mutex);
>  
>  if (conn->wait_co) {
>  aio_co_schedule(NULL, conn->wait_co);
>  conn->wait_co = NULL;
>  }
> -
> -qemu_mutex_unlock(&conn->mute

[Bug 601946] Re: [Feature request] qemu-img multi-threaded compressed image conversion

2021-04-27 Thread Thomas Huth
The QEMU project is currently considering moving its bug tracking to
another system. For this we need to know which bugs are still valid
and which could be closed already. Thus we are setting older bugs to
"Incomplete" now.

If you still think this bug report here is valid, then please switch
the state back to "New" within the next 60 days, otherwise this report
will be marked as "Expired". Or please mark it as "Fix Released" if
the problem has been solved with a newer version of QEMU already.

Thank you and sorry for the inconvenience.


** Changed in: qemu
   Status: New => Incomplete

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/601946

Title:
  [Feature request] qemu-img multi-threaded compressed image conversion

Status in QEMU:
  Incomplete

Bug description:
  Feature request:
  qemu-img multi-threaded compressed image conversion

  Suppose I want to convert raw image to compressed qcow2. Multi-
  threaded conversion will be much faster, because bottleneck is
  compressing data.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/601946/+subscriptions



Re: [PATCH RFC 0/1] To add HMP interface to dump PCI MSI-X table/PBA

2021-04-27 Thread Jason Wang



On 2021/4/28 1:10 PM, Dongli Zhang wrote:

Hi Jason,

On 4/27/21 7:31 PM, Jason Wang wrote:

On 2021/4/27 4:53 PM, Dr. David Alan Gilbert wrote:

* Dongli Zhang (dongli.zh...@oracle.com) wrote:

On 4/22/21 11:01 PM, Jason Wang wrote:

On 2021/4/23 12:47 PM, Dongli Zhang wrote:

This is inspired by the discussion with Jason on below patchset.

https://urldefense.com/v3/__https://lists.gnu.org/archive/html/qemu-devel/2021-03/msg09020.html__;!!GqivPVa7Brio!KbGQZW5lq3JZ60k12NuWZ6Th1lT6AwmBTF0pBgoWUKKQ4-2UhdW57PtvXUN5XQnZ2NU$


The new HMP command is introduced to dump the MSI-X table and PBA.

Initially, I was going to add new option to "info pci". However, as the
number of entries is not determined and the output of MSI-X table is much
more similar to the output of hmp_info_tlb()/hmp_info_mem(), this patch
adds interface for only HMP.

The patch is tagged with RFC because I am looking for suggestions on:

1. Is it fine to add new "info msix " command?

I wonder the reason for not simply reusing "info pci"?

The "info pci" will show PCI data for all devices and it does not accept any
argument to print for a specific device.

In addition, the "info pci" relies on qmp_query_pci(), where this patch will not
implement the interface for QMP considering the number of MSI-X entries is not
determined.

Suppose we have 10 NVMe (emulated by QEMU with default number of queues), we
will have about 600+ lines of output.

  From an HMP perspective I'm happy, so:

Acked-by: Dr. David Alan Gilbert 

but since I don't know much about MSI I'd like to see Jason's reply.


I think we'd better have more information, e.g the device can optionally report
how the MSI-X vector is used.

Virtio-pci could be the first user for this.

As discussed in another thread, you were talking about printing MSIMessage.

However, I prefer to print the raw data as I think the user of this interface
should be able to understand it as MSI-X messages.

For instance, below is the data printed by "info msix".



Just to clarify, I meant e.g for virtio-pci device, we can let it to 
print the mapping between vq and msix vectors:


vq[0].msix_vector = 0
vq[1].msix_vector = 1
config.msix_vector = 2
...

But this could be added on top if you wish.

Does this make sense?

Thanks




0xfee01004 0x 0x0022 0x
0xfee02004 0x 0x0023 0x
0xfee01004 0x 0x0023 0x
0xfee01004 0x 0x0021 0x
0xfee02004 0x 0x0022 0x
0x 0x 0x 0x0001
0x 0x 0x 0x0001

The 1st column is Message Lower Address.

The 2nd column is Message Upper Address.

The 3rd column is Message Data.

The 4th column is Vector Control.

In my opinion, this is equivalent to MSIMessage.

26 struct MSIMessage {
27 uint64_t address; --> column 1 and 2
28 uint32_t data;--> column 3
29 };


We use a similar way to read from Linux OS, e.g., given the address of MSI-X
cap, here is how we read from OS side.

# busybox devmem 0xc1001000 32
0xFEE0
# busybox devmem 0xc1001004 32
0x
# busybox devmem 0xc1001008 32
0x4049
# busybox devmem 0xc100100c 32
0x

Thank you very much!

Dongli Zhang




Adding an optional option to 'info pci' to limit to one device would be easy
though; that bit is probably easier than adding a new command.


One interesting point is that MSI could be extended for other bus, (e.g MMIO).
So "info msi" should be better I guess.



Figuring out the QMP representation of your entries might be harder -
and if this is strictly for debug, probably not worth it?


I think so.

Thanks



Dave



Dongli Zhang


2. Is there any issue with output format?

If it's not for QMP, I guess it's not a part of ABI so it should be fine.



3. Is it fine to add only for HMP, but not QMP?

I think so.

Thanks



Thank you very much!

Dongli Zhang








Re: [PATCH 01/22] qapi/parser: Don't try to handle file errors

2021-04-27 Thread Markus Armbruster
John Snow  writes:

> On 4/27/21 9:47 AM, Markus Armbruster wrote:
>> John Snow  writes:
>> 
>>> On 4/23/21 11:46 AM, Markus Armbruster wrote:
>>>> John Snow  writes:
>>>>
>>>>> The short-ish version of what motivates this patch is:
>>>>>
>>>>> - The parser initializer does not possess adequate context to write a
>>>>> good error message -- It tries to determine the caller's semantic
>>>>> context.
>>>>
>>>> I'm not sure I get what you're trying to say here.

>>>
>>> I mean: this __init__ method does not *know* who is calling it or why.
>>> Of course, *we* do, because the code base is finite and nobody else but
>>> us is calling into it.
>>>
>>> I mean to point out that the initializer has to do extra work (Just a
>>> little) to determine what the calling context is and raise an error
>>> accordingly.
>>>
>>> Example: If we have a parent info context, we raise an error in the
>>> context of the caller. If we don't, we have to create a new presumed
>>> context (using the weird None SourceInfo object).
>> 
>> I guess you mean
>> 
>>  raise QAPISemError(incl_info or QAPISourceInfo(None, None, 
>> None),
>> 
>> I can't see other instances of messing with context.
>> 
>
> Yes, and the string construction that follows, too. It's all about 
> trying to understand who our caller is and raising an error appropriate 
> for them on their behalf.

I guess you can view it that way.  I never did.  My thinking was

@fname either comes from a schema file (@incl_info is not None) or
somewhere else.  If schema file, make the exception's __str__()
start with "SCHEMA-FILE:LINE: ", because that's how compilers report
errors in source files.  Else, make it start with just "PROGNAME: ",
because that's how compilers report errors unrelated to source
files.
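
That prefix rule can be written down as a tiny sketch. The function name `format_error`, the (filename, line) tuple used for `incl_info` and the default program name are all hypothetical, chosen only to show the two cases.

```python
def format_error(msg, incl_info=None, progname="qapi-gen"):
    """Prefix msg the way compilers report errors.

    incl_info is a (filename, line) tuple when the file name came from a
    schema file, or None when it came from somewhere else.
    """
    if incl_info is not None:
        fname, line = incl_info
        return f"{fname}:{line}: {msg}"  # error tied to a source location
    return f"{progname}: {msg}"          # error unrelated to any source file
```

Seen this way, the code only asks whether a source location exists, which is a weaker claim than understanding who the caller is.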

This assumes "incl_info is None implies unrelated to source file".  I
think that's fair.  I don't think it rises to the level of
"understanding who our caller is".

>>> So I just mean to say:
>>>
>>> "Let the caller, who unambiguously always has the exactly correct
>>> context worry about what the error message ought to be."

[...]

>>>> Before the patch, only IOError from open() and .read() get converted to
>>>> QAPISemError, and therefore caught by main().
>>>>
>>>> The patch widens this to anywhere in QAPISchemaParser.__init__().  Hmm.

>>>
>>> "Changed in version 3.3: EnvironmentError, IOError, WindowsError,
>>> socket.error, select.error and mmap.error have been merged into OSError,
>>> and the constructor may return a subclass."
>>>
>>>   >>> OSError == IOError
>>> True
>>>
>>> (No, I didn't know this before I wrote it. I just intentionally wanted
>>> to catch everything that open() might return, which I had simply assumed
>>> was not fully captured by IOError. Better to leave it as OSError now to
>>> avoid misleading anyone into thinking it's more narrow than it really is.)
>> 
>> Good to know.
>> 
>> However, I was talking about the code covered by try ... except OSError
>> (or IOError, or whatever).  Before the patch, it's just open() and
>> .read().  Afterwards it's all of .__init__().
>> 
>
> Apologies, I misread.
>
>> Could anything else in .__init__() possibly raise OSError?  Probably
>> not, but it's not trivially obvious.  Which makes me go "hmm."
>> 
>> "Hmm" isn't "no", it's just "hmm".
>> 
>
> Yeah, it is rather broad. That is one of the perils of doing *so much* 
> at init() time, in my opinion.

It's not ideal, but then having to write something like

parser = QAPISchemaParser(fname).parse()

or

parser = QAPISchemaParser().parse(fname)

instead of just

parser = QAPISchemaParser(fname)

would be less than ideal, too.
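
For comparison, here is a rough sketch of the narrower scope, where only
open() and .read() sit inside the try block while the constructor still
does all the work.  TinyParser and ParseError are hypothetical names, not
the QAPI classes:

```python
class ParseError(Exception):
    """Hypothetical error type standing in for QAPISemError."""


class TinyParser:
    """Hypothetical parser that does all its work in __init__()."""

    def __init__(self, fname: str):
        try:
            # Only the calls that can fail with an I/O error are guarded.
            with open(fname, encoding="utf-8") as fp:
                self.src = fp.read()
        except OSError as err:
            # IOError is an alias of OSError since Python 3.3, so this also
            # covers FileNotFoundError, PermissionError, and so on.
            raise ParseError(f"can't read {fname}: {err.strerror}") from err
        # Parsing happens outside the try block, so an unexpected OSError
        # raised here would propagate unchanged instead of being converted.
        self.lines = self.src.splitlines()


message = ""
try:
    TinyParser("/nonexistent/schema.json")
except ParseError as err:
    message = str(err)
```

With this shape the caller keeps the one-step constructor, and the
conversion to a parser error stays limited to the file access.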

> We don't make any other syscalls in the parser though, so it should be 
> fine. The docstring patch later documents the errors we expect to see 
> here, so it becomes a visible part of the interface.

[...]




Re: [PATCH RFC C0/2] support allocation-map for block-dirty-bitmap-merge

2021-04-27 Thread Markus Armbruster
John Snow  writes:

> On 4/27/21 7:11 AM, Vladimir Sementsov-Ogievskiy wrote:
>> Hi all!
>> It's a simpler alternative for
>> "[PATCH v4 0/5] block: add block-dirty-bitmap-populate job"
>><20200902181831.2570048-1-ebl...@redhat.com>
>>https://lists.gnu.org/archive/html/qemu-devel/2020-09/msg00978.html
>>https://patchew.org/QEMU/20200902181831.2570048-1-ebl...@redhat.com/
>> Since we have "coroutine: true" feature for qmp commands, I think,
>> maybe we can merge allocation status to bitmap without bothering with
>> new block-job?
>> It's an RFC:
>> 1. Main question: is it OK as a simple blocking command, even in a
>> coroutine mode. It's a lot simpler, and it can be simply used in a
>> transaction with other bitmap commands.
>> 
>
> Hm, possibly... I did not follow the discussion of coroutine QMP
> commands closely to know what the qualifying criteria to use them are.
>
> (Any wisdom for me here, Markus?)

From Kevin's cover letter:

Some QMP command handlers can block the main loop for a relatively
long time, for example because they perform some I/O.  This is quite
nasty.  Allowing such handlers to run in a coroutine where they can
yield (and therefore release the BQL) while waiting for an event
such as I/O completion solves the problem.

Running in a coroutine is not a replacement for jobs.  Monitor commands
continue to run one after the other, even with multiple monitors.  All
this does is letting monitor commands yield.

Running in a coroutine is opt-in, because we're scared of command code
misbehaving in coroutine context[*].  To opt-in, add

'coroutine': true

to the command's QAPI schema.

Misbehaving command code should be rare.  The trouble is finding it.  If
we had a better handle on that, we could make running in a coroutine
opt-out.  Watch out for nested event loops.  Test thoroughly.
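
As a loose analogy only (Python's asyncio, not QEMU's coroutines or the
monitor code), the following sketch shows the property described above:
commands still run strictly one after the other, but while one of them
waits, the event loop is free to do other work.  All names are made up
for the illustration:

```python
import asyncio


async def slow_command(name, log):
    log.append(f"{name}: start")
    await asyncio.sleep(0.01)  # stands in for I/O; yields to the event loop
    log.append(f"{name}: done")


async def monitor(commands, log):
    # Commands are dispatched sequentially: the next one starts only
    # after the previous one has finished.
    for name in commands:
        await slow_command(name, log)


async def other_work(log):
    # Anything else the main loop has to do while a command is waiting.
    log.append("main loop: other work ran")


async def main():
    log = []
    await asyncio.gather(monitor(["cmd-a", "cmd-b"], log), other_work(log))
    return log


log = asyncio.run(main())
```

In the resulting log the other work shows up before cmd-a completes, yet
cmd-b never starts before cmd-a is done, which is why yielding alone does
not replace a job.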

Questions?

[...]

[*] Discussed at some length in patch review:

Message-ID: <874kwnvgad@dusky.pond.sub.org>
https://lists.nongnu.org/archive/html/qemu-devel/2020-01/msg05015.html




Re: [PATCH RFC 0/1] To add HMP interface to dump PCI MSI-X table/PBA

2021-04-27 Thread Dongli Zhang
Hi Jason,

On 4/27/21 7:31 PM, Jason Wang wrote:
> 
> On 2021/4/27 4:53 PM, Dr. David Alan Gilbert wrote:
>> * Dongli Zhang (dongli.zh...@oracle.com) wrote:
>>>
>>> On 4/22/21 11:01 PM, Jason Wang wrote:
>>>> On 2021/4/23 12:47 PM, Dongli Zhang wrote:
>>>>> This is inspired by the discussion with Jason on below patchset.
>>>>>
>>>>> https://lists.gnu.org/archive/html/qemu-devel/2021-03/msg09020.html
>>>>>
>>>>>
>>>>> The new HMP command is introduced to dump the MSI-X table and PBA.
>>>>>
>>>>> Initially, I was going to add new option to "info pci". However, as the
>>>>> number of entries is not determined and the output of MSI-X table is much
>>>>> more similar to the output of hmp_info_tlb()/hmp_info_mem(), this patch
>>>>> adds interface for only HMP.
>>>>>
>>>>> The patch is tagged with RFC because I am looking for suggestions on:
>>>>>
>>>>> 1. Is it fine to add new "info msix " command?
>>>>
>>>> I wonder the reason for not simply reusing "info pci"?
>>> The "info pci" will show PCI data for all devices and it does not accept any
>>> argument to print for a specific device.
>>>
>>> In addition, the "info pci" relies on qmp_query_pci(), where this patch will not
>>> implement the interface for QMP considering the number of MSI-X entries is not
>>> determined.
>>>
>>> Suppose we have 10 NVMe (emulated by QEMU with default number of queues), we
>>> will have about 600+ lines of output.
>>  From an HMP perspective I'm happy, so:
>>
>> Acked-by: Dr. David Alan Gilbert 
>>
>> but since I don't know much about MSI I'd like to see Jason's reply.
> 
> 
> I think we'd better have more information, e.g. the device can optionally report
> how the MSI-X vector is used.
> 
> Virtio-pci could be the first user for this.

As discussed in another thread, you suggested printing the MSIMessage.

However, I prefer to print the raw data, since I expect users of this interface
to be able to interpret it as MSI-X messages.

For instance, below is the data printed by "info msix".

0xfee01004 0x 0x0022 0x
0xfee02004 0x 0x0023 0x
0xfee01004 0x 0x0023 0x
0xfee01004 0x 0x0021 0x
0xfee02004 0x 0x0022 0x
0x 0x 0x 0x0001
0x 0x 0x 0x0001

The 1st column is Message Lower Address.

The 2nd column is Message Upper Address.

The 3rd column is Message Data.

The 4th column is Vector Control.

In my opinion, this is equivalent to MSIMessage.

struct MSIMessage {
    uint64_t address;   /* columns 1 and 2 */
    uint32_t data;      /* column 3 */
};
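
To make the column mapping concrete, here is a small sketch that decodes
one 16-byte MSI-X table entry into an address/data pair plus the mask bit
from Vector Control.  The helper name and the sample values are
hypothetical and not taken from the dump above:

```python
import struct
from typing import NamedTuple


class MsixEntry(NamedTuple):
    address: int   # Message Upper Address : Message Lower Address
    data: int      # Message Data
    masked: bool   # bit 0 of Vector Control


def decode_msix_entry(raw: bytes) -> MsixEntry:
    """Decode one little-endian MSI-X table entry (four 32-bit dwords)."""
    addr_lo, addr_hi, data, vector_ctrl = struct.unpack("<4I", raw)
    return MsixEntry((addr_hi << 32) | addr_lo, data, bool(vector_ctrl & 1))


# Illustrative entries only: one programmed vector and one masked, unused one.
programmed = decode_msix_entry(struct.pack("<4I", 0xFEE01004, 0x0, 0x22, 0x0))
unused = decode_msix_entry(struct.pack("<4I", 0x0, 0x0, 0x0, 0x1))
```

The first three dwords carry everything MSIMessage holds; only the mask
bit in the fourth dword is extra.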


We use a similar approach to read it from within Linux. E.g., given the address of the MSI-X
cap, here is how we read it from the OS side.

# busybox devmem 0xc1001000 32
0xFEE0
# busybox devmem 0xc1001004 32
0x
# busybox devmem 0xc1001008 32
0x4049
# busybox devmem 0xc100100c 32
0x

Thank you very much!

Dongli Zhang

> 
> 
>>
>> Adding an optional option to 'info pci' to limit to one device would be easy
>> though; that bit is probably easier than adding a new command.
> 
> 
> One interesting point is that MSI could be extended for other bus, (e.g MMIO).
> So "info msi" should be better I guess.
> 
> 
>> Figuring out the QMP representation of your entries might be harder -
>> and if this is strictly for debug, probably not worth it?
> 
> 
> I think so.
> 
> Thanks
> 
> 
>>
>> Dave
>>
>>
>>> Dongli Zhang
>>>

>>>>> 2. Is there any issue with output format?
>>>>
>>>> If it's not for QMP, I guess it's not a part of ABI so it should be fine.
>>>>
>>>>
>>>>> 3. Is it fine to add only for HMP, but not QMP?
>>>>
>>>> I think so.
>>>>
>>>> Thanks
>>>>
>>>>
>>>>> Thank you very much!
>>>>>
>>>>> Dongli Zhang
>>>>>
>>>>>
>>>>>
> 



Re: [PATCH v8 08/11] hw/core: deprecate old reset functions and introduce new ones

2021-04-27 Thread Markus Armbruster
Eduardo Habkost  writes:

> On Tue, Apr 27, 2021 at 02:21:28PM +0200, Philippe Mathieu-Daudé wrote:
>> On 1/23/20 2:28 PM, Damien Hedde wrote:
>> > Deprecate device_legacy_reset(), qdev_reset_all() and
>> > qbus_reset_all() to be replaced by new functions
>> > device_cold_reset() and bus_cold_reset() which uses resettable API.
>> > 
>> > Also introduce resettable_cold_reset_fn() which may be used as a
>> > replacement for qdev_reset_all_fn and qbus_reset_all_fn().
>> > 
>> > Following patches will be needed to look at legacy reset call sites
>> > and switch to resettable api. The legacy functions will be removed
>> > when unused.
>> > 
>> > Signed-off-by: Damien Hedde 
>> > Reviewed-by: Philippe Mathieu-Daudé 
>> > Reviewed-by: Peter Maydell 
>> > Reviewed-by: Richard Henderson 
>> > Tested-by: Philippe Mathieu-Daudé 
>> > ---
>> >  include/hw/qdev-core.h  | 27 +++
>> >  include/hw/resettable.h |  9 +
>> >  hw/core/bus.c   |  5 +
>> >  hw/core/qdev.c  |  5 +
>> >  hw/core/resettable.c|  5 +
>> >  5 files changed, 51 insertions(+)
>> > 
>> > diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
>> > index 1b4b420617..b84fcc32bf 100644
>> > --- a/include/hw/qdev-core.h
>> > +++ b/include/hw/qdev-core.h
>> > @@ -406,6 +406,13 @@ int qdev_walk_children(DeviceState *dev,
>> > qdev_walkerfn *post_devfn, qbus_walkerfn *post_busfn,
>> > void *opaque);
>> >  
>> > +/**
>> > + * @qdev_reset_all:
>> > + * Reset @dev. See @qbus_reset_all() for more details.
>> > + *
>> > + * Note: This function is deprecated and will be removed when it becomes unused.
>> > + * Please use device_cold_reset() now.
>> > + */
>> >  void qdev_reset_all(DeviceState *dev);
>> >  void qdev_reset_all_fn(void *opaque);
>> >  
>> > @@ -418,10 +425,28 @@ void qdev_reset_all_fn(void *opaque);
>> >   * hard reset means that qbus_reset_all will reset all state of the device.
>> >   * For PCI devices, for example, this will include the base address registers
>> >   * or configuration space.
>> > + *
>> > + * Note: This function is deprecated and will be removed when it becomes unused.
>> > + * Please use bus_cold_reset() now.
>> 
>> Some time has passed, so I'm looking at this in retrospect.
>> 
>> If there is an effort to introduce a new API replacing another one,
>> we should try to convert all the uses of the old API to the new one,
>> instead of declaring it legacy.
>> 
>> Declaring an API legacy/deprecated should be the last resort if there
>> is no way to remove it. I'd recommend moving the deprecated/legacy
>> declarations to a separate header, with the '_legacy' suffix.
>> 
>> Else:
>> 
>> 1/ we never finish API conversions,
>> 2/ the new API might not be ready for all the legacy API use cases,
>> 3/ we end up having to maintain 2 different APIs.
>> 
>> 
>> So the recommendation is to use bus_cold_reset(), but it isn't
>> used anywhere...:
>> 
>> $ git grep bus_cold_reset
>> docs/devel/reset.rst:64:- ``bus_cold_reset()``
>> hw/core/bus.c:73:void bus_cold_reset(BusState *bus)
>> include/hw/qdev-core.h:715: * Please use bus_cold_reset() now.
>> include/hw/qdev-core.h:728: * bus_cold_reset:
>> include/hw/qdev-core.h:733:void bus_cold_reset(BusState *bus);
>> 
>> IMHO we shouldn't add new public prototypes without callers.
>
> I agree.  We should make at least some effort to convert code to
> the new API, if only to serve as reference for people doing the
> conversion.  I'm surprised that a new function was added more
> than a year ago and nobody is using it.
>
> What happened here?  Was there some plan to convert existing code
> but it was abandoned?

Commit abb89dbf2 introduced bus_cold_reset() and device_cold_reset().
It was posted as part of "[PATCH v8 00/11] Multi-phase reset mechanism".
The series did not add any users.  The cover letter explains:

The purpose of this series is to split the current reset procedure
into multiple phases. This will help to solve some ordering
difficulties we have during reset.

This is a ready to merge version. I've taken the few remarks of
Philippe about v7 in account. Thanks to him for all the tests he did.

This series adds resettable interface and transitions base Device and
Bus classes (sysbus subclasses are ok too). It provides new reset
functions but does not switch anymore the old functions
(device_reset() and qdev/qbus_reset_all()) to resettable interface.
These functions keep the exact same behavior as before.

The series also transition the main reset handlers registration which
has no impact until devices and buses are transitioned.

The series is organized as follows:
Patch 1 prepare the reset transition. Patch 2 adds some utility trace
events. Patches 3 to 8 adds resettable api in devices and buses. Patch
9 adds some documentation. Patches 10 and 11 transition the call sites
of qemu_re

[PATCH v8 5/6] [RISCV_PM] Implement address masking functions required for RISC-V Pointer Masking extension

2021-04-27 Thread Alexey Baturo
From: Anatoly Parshintsev 

Signed-off-by: Anatoly Parshintsev 
Reviewed-by: Richard Henderson 
---
 target/riscv/cpu.h   | 20 
 target/riscv/translate.c | 36 ++--
 2 files changed, 54 insertions(+), 2 deletions(-)

diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 19aa3b4769..2edfc59712 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -407,6 +407,8 @@ FIELD(TB_FLAGS, SEW, 5, 3)
 FIELD(TB_FLAGS, VILL, 8, 1)
 /* Is a Hypervisor instruction load/store allowed? */
 FIELD(TB_FLAGS, HLSX, 9, 1)
+/* If PointerMasking should be applied */
+FIELD(TB_FLAGS, PM_ENABLED, 10, 1)
 
 bool riscv_cpu_is_32bit(CPURISCVState *env);
 
@@ -464,6 +466,24 @@ static inline void cpu_get_tb_cpu_state(CPURISCVState *env, target_ulong *pc,
 flags = FIELD_DP32(flags, TB_FLAGS, HLSX, 1);
 }
 }
+if (riscv_has_ext(env, RVJ)) {
+int priv = cpu_mmu_index(env, false);
+bool pm_enabled = false;
+switch (priv) {
+case PRV_U:
+pm_enabled = env->mmte & U_PM_ENABLE;
+break;
+case PRV_S:
+pm_enabled = env->mmte & S_PM_ENABLE;
+break;
+case PRV_M:
+pm_enabled = env->mmte & M_PM_ENABLE;
+break;
+default:
+g_assert_not_reached();
+}
+flags = FIELD_DP32(flags, TB_FLAGS, PM_ENABLED, pm_enabled);
+}
 #endif
 
 *pflags = flags;
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 2e815a5912..37706d56d5 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -36,6 +36,9 @@ static TCGv cpu_gpr[32], cpu_pc, cpu_vl;
 static TCGv_i64 cpu_fpr[32]; /* assume F and D extensions */
 static TCGv load_res;
 static TCGv load_val;
+/* globals for PM CSRs */
+static TCGv pm_mask[4];
+static TCGv pm_base[4];
 
 #include "exec/gen-icount.h"
 
@@ -64,6 +67,10 @@ typedef struct DisasContext {
 uint16_t vlen;
 uint16_t mlen;
 bool vl_eq_vlmax;
+/* PointerMasking extension */
+bool pm_enabled;
+TCGv pm_mask;
+TCGv pm_base;
 CPUState *cs;
 } DisasContext;
 
@@ -90,13 +97,19 @@ static void gen_nanbox_s(TCGv_i64 out, TCGv_i64 in)
 }
 
 /*
- * Temp stub: generates address adjustment for PointerMasking
+ * Generates address adjustment for PointerMasking
  */
 static void gen_pm_adjust_address(DisasContext *s,
   TCGv_i64  dst,
   TCGv_i64  src)
 {
-tcg_gen_mov_i64(dst, src);
+if (!s->pm_enabled) {
+/* Load unmodified address */
+tcg_gen_mov_i64(dst, src);
+} else {
+tcg_gen_andc_i64(dst, src, s->pm_mask);
+tcg_gen_or_i64(dst, dst, s->pm_base);
+}
 }
 
 /*
@@ -657,6 +670,10 @@ static void riscv_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
 ctx->lmul = FIELD_EX32(tb_flags, TB_FLAGS, LMUL);
 ctx->mlen = 1 << (ctx->sew  + 3 - ctx->lmul);
 ctx->vl_eq_vlmax = FIELD_EX32(tb_flags, TB_FLAGS, VL_EQ_VLMAX);
+ctx->pm_enabled = FIELD_EX32(tb_flags, TB_FLAGS, PM_ENABLED);
+int priv = cpu_mmu_index(env, false) & TB_FLAGS_PRIV_MMU_MASK;
+ctx->pm_mask = pm_mask[priv];
+ctx->pm_base = pm_base[priv];
 ctx->cs = cs;
 }
 
@@ -777,4 +794,19 @@ void riscv_translate_init(void)
  "load_res");
 load_val = tcg_global_mem_new(cpu_env, offsetof(CPURISCVState, load_val),
  "load_val");
+#ifndef CONFIG_USER_ONLY
+/* Assign PM CSRs to tcg globals */
+pm_mask[PRV_U] =
+  tcg_global_mem_new(cpu_env, offsetof(CPURISCVState, upmmask), "upmmask");
+pm_base[PRV_U] =
+  tcg_global_mem_new(cpu_env, offsetof(CPURISCVState, upmbase), "upmbase");
+pm_mask[PRV_S] =
+  tcg_global_mem_new(cpu_env, offsetof(CPURISCVState, spmmask), "spmmask");
+pm_base[PRV_S] =
+  tcg_global_mem_new(cpu_env, offsetof(CPURISCVState, spmbase), "spmbase");
+pm_mask[PRV_M] =
+  tcg_global_mem_new(cpu_env, offsetof(CPURISCVState, mpmmask), "mpmmask");
+pm_base[PRV_M] =
+  tcg_global_mem_new(cpu_env, offsetof(CPURISCVState, mpmbase), "mpmbase");
+#endif
 }
-- 
2.20.1
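
For readers following along, the address adjustment generated by
gen_pm_adjust_address() above (an and-with-complement of the mask, then an
or with the base) can be modelled on plain integers.  This is only a
sketch: the function name, the 64-bit width and the sample mask/base
values are assumptions made for the illustration:

```python
XLEN = 64
ALL_ONES = (1 << XLEN) - 1


def pm_adjust_address(addr: int, pm_enabled: bool, mask: int, base: int) -> int:
    """Model of the masking step: (addr & ~mask) | base when enabled."""
    if not pm_enabled:
        # Pointer masking is off: the address is used unmodified.
        return addr & ALL_ONES
    # andc clears the bits selected by the mask, or then applies the base.
    return ((addr & ~mask) | base) & ALL_ONES


# Hypothetical setup: the top byte of the pointer carries a tag.
TAG_MASK = 0xFF00_0000_0000_0000
tagged = 0xAB00_0000_1234_5678

stripped = pm_adjust_address(tagged, True, TAG_MASK, 0)
untouched = pm_adjust_address(tagged, False, TAG_MASK, 0)
rebased = pm_adjust_address(tagged, True, TAG_MASK, 0x0100_0000_0000_0000)
```

With masking enabled the tag byte is dropped before the access; with it
disabled the address passes through as is.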




[PATCH v8 3/6] [RISCV_PM] Print new PM CSRs in QEMU logs

2021-04-27 Thread Alexey Baturo
Signed-off-by: Alexey Baturo 
Reviewed-by: Richard Henderson 
Reviewed-by: Alistair Francis 
---
 target/riscv/cpu.c | 25 +
 1 file changed, 25 insertions(+)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index c04911ec05..0682410f5d 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -287,6 +287,31 @@ static void riscv_cpu_dump_state(CPUState *cs, FILE *f, int flags)
 qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "htval ", env->htval);
 qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "mtval2 ", env->mtval2);
 }
+if (riscv_has_ext(env, RVJ)) {
+qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "mmte", env->mmte);
+switch (env->priv) {
+case PRV_U:
+qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "upmbase ",
+ env->upmbase);
+qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "upmmask ",
+ env->upmmask);
+break;
+case PRV_S:
+qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "spmbase ",
+ env->spmbase);
+qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "spmmask ",
+ env->spmmask);
+break;
+case PRV_M:
+qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "mpmbase ",
+ env->mpmbase);
+qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "mpmmask ",
+ env->mpmmask);
+break;
+default:
+g_assert_not_reached();
+}
+}
 #endif
 
 for (i = 0; i < 32; i++) {
-- 
2.20.1




[PATCH v8 6/6] [RISCV_PM] Allow experimental J-ext to be turned on

2021-04-27 Thread Alexey Baturo
Signed-off-by: Alexey Baturo 
Reviewed-by: Alistair Francis 
---
 target/riscv/cpu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 0682410f5d..fecc64d7ba 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -502,6 +502,7 @@ static void riscv_cpu_realize(DeviceState *dev, Error **errp)
 #ifndef CONFIG_USER_ONLY
 env->mmte |= PM_EXT_INITIAL;
 #endif
+target_misa |= RVJ;
 }
 if (cpu->cfg.ext_v) {
 target_misa |= RVV;
@@ -574,6 +575,7 @@ static Property riscv_cpu_properties[] = {
 DEFINE_PROP_BOOL("u", RISCVCPU, cfg.ext_u, true),
 /* This is experimental so mark with 'x-' */
 DEFINE_PROP_BOOL("x-h", RISCVCPU, cfg.ext_h, false),
+DEFINE_PROP_BOOL("x-j", RISCVCPU, cfg.ext_j, false),
 DEFINE_PROP_BOOL("x-v", RISCVCPU, cfg.ext_v, false),
 DEFINE_PROP_BOOL("Counters", RISCVCPU, cfg.ext_counters, true),
 DEFINE_PROP_BOOL("Zifencei", RISCVCPU, cfg.ext_ifencei, true),
-- 
2.20.1




[PATCH v8 1/6] [RISCV_PM] Add J-extension into RISC-V

2021-04-27 Thread Alexey Baturo
Signed-off-by: Alexey Baturo 
Reviewed-by: Richard Henderson 
Reviewed-by: Alistair Francis 
---
 target/riscv/cpu.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 0a33d387ba..0ea9fc65c8 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -72,6 +72,7 @@
 #define RVS RV('S')
 #define RVU RV('U')
 #define RVH RV('H')
+#define RVJ RV('J')
 
 /* S extension denotes that Supervisor mode exists, however it is possible
to have a core that support S mode but does not have an MMU and there
@@ -291,6 +292,7 @@ struct RISCVCPU {
 bool ext_s;
 bool ext_u;
 bool ext_h;
+bool ext_j;
 bool ext_v;
 bool ext_counters;
 bool ext_ifencei;
-- 
2.20.1




[PATCH v8 4/6] [RISCV_PM] Support pointer masking for RISC-V for i/c/f/d/a types of instructions

2021-04-27 Thread Alexey Baturo
Signed-off-by: Alexey Baturo 
Reviewed-by: Richard Henderson 
Reviewed-by: Alistair Francis 
---
 target/riscv/insn_trans/trans_rva.c.inc |  3 +++
 target/riscv/insn_trans/trans_rvd.c.inc |  2 ++
 target/riscv/insn_trans/trans_rvf.c.inc |  2 ++
 target/riscv/insn_trans/trans_rvi.c.inc |  2 ++
 target/riscv/translate.c| 10 ++
 5 files changed, 19 insertions(+)

diff --git a/target/riscv/insn_trans/trans_rva.c.inc b/target/riscv/insn_trans/trans_rva.c.inc
index be8a9f06dd..5559e347ba 100644
--- a/target/riscv/insn_trans/trans_rva.c.inc
+++ b/target/riscv/insn_trans/trans_rva.c.inc
@@ -26,6 +26,7 @@ static inline bool gen_lr(DisasContext *ctx, arg_atomic *a, MemOp mop)
 if (a->rl) {
 tcg_gen_mb(TCG_MO_ALL | TCG_BAR_STRL);
 }
+gen_pm_adjust_address(ctx, src1, src1);
 tcg_gen_qemu_ld_tl(load_val, src1, ctx->mem_idx, mop);
 if (a->aq) {
 tcg_gen_mb(TCG_MO_ALL | TCG_BAR_LDAQ);
@@ -46,6 +47,7 @@ static inline bool gen_sc(DisasContext *ctx, arg_atomic *a, MemOp mop)
 TCGLabel *l2 = gen_new_label();
 
 gen_get_gpr(src1, a->rs1);
+gen_pm_adjust_address(ctx, src1, src1);
 tcg_gen_brcond_tl(TCG_COND_NE, load_res, src1, l1);
 
 gen_get_gpr(src2, a->rs2);
@@ -91,6 +93,7 @@ static bool gen_amo(DisasContext *ctx, arg_atomic *a,
 gen_get_gpr(src1, a->rs1);
 gen_get_gpr(src2, a->rs2);
 
+gen_pm_adjust_address(ctx, src1, src1);
 (*func)(src2, src1, src2, ctx->mem_idx, mop);
 
 gen_set_gpr(a->rd, src2);
diff --git a/target/riscv/insn_trans/trans_rvd.c.inc b/target/riscv/insn_trans/trans_rvd.c.inc
index 4f832637fa..935342f66d 100644
--- a/target/riscv/insn_trans/trans_rvd.c.inc
+++ b/target/riscv/insn_trans/trans_rvd.c.inc
@@ -25,6 +25,7 @@ static bool trans_fld(DisasContext *ctx, arg_fld *a)
 TCGv t0 = tcg_temp_new();
 gen_get_gpr(t0, a->rs1);
 tcg_gen_addi_tl(t0, t0, a->imm);
+gen_pm_adjust_address(ctx, t0, t0);
 
 tcg_gen_qemu_ld_i64(cpu_fpr[a->rd], t0, ctx->mem_idx, MO_TEQ);
 
@@ -40,6 +41,7 @@ static bool trans_fsd(DisasContext *ctx, arg_fsd *a)
 TCGv t0 = tcg_temp_new();
 gen_get_gpr(t0, a->rs1);
 tcg_gen_addi_tl(t0, t0, a->imm);
+gen_pm_adjust_address(ctx, t0, t0);
 
 tcg_gen_qemu_st_i64(cpu_fpr[a->rs2], t0, ctx->mem_idx, MO_TEQ);
 
diff --git a/target/riscv/insn_trans/trans_rvf.c.inc b/target/riscv/insn_trans/trans_rvf.c.inc
index 3dfec8211d..04b3c3eb3d 100644
--- a/target/riscv/insn_trans/trans_rvf.c.inc
+++ b/target/riscv/insn_trans/trans_rvf.c.inc
@@ -30,6 +30,7 @@ static bool trans_flw(DisasContext *ctx, arg_flw *a)
 TCGv t0 = tcg_temp_new();
 gen_get_gpr(t0, a->rs1);
 tcg_gen_addi_tl(t0, t0, a->imm);
+gen_pm_adjust_address(ctx, t0, t0);
 
 tcg_gen_qemu_ld_i64(cpu_fpr[a->rd], t0, ctx->mem_idx, MO_TEUL);
 gen_nanbox_s(cpu_fpr[a->rd], cpu_fpr[a->rd]);
@@ -47,6 +48,7 @@ static bool trans_fsw(DisasContext *ctx, arg_fsw *a)
 gen_get_gpr(t0, a->rs1);
 
 tcg_gen_addi_tl(t0, t0, a->imm);
+gen_pm_adjust_address(ctx, t0, t0);
 
 tcg_gen_qemu_st_i64(cpu_fpr[a->rs2], t0, ctx->mem_idx, MO_TEUL);
 
diff --git a/target/riscv/insn_trans/trans_rvi.c.inc b/target/riscv/insn_trans/trans_rvi.c.inc
index d04ca0394c..bee7f6be46 100644
--- a/target/riscv/insn_trans/trans_rvi.c.inc
+++ b/target/riscv/insn_trans/trans_rvi.c.inc
@@ -141,6 +141,7 @@ static bool gen_load(DisasContext *ctx, arg_lb *a, MemOp memop)
 TCGv t1 = tcg_temp_new();
 gen_get_gpr(t0, a->rs1);
 tcg_gen_addi_tl(t0, t0, a->imm);
+gen_pm_adjust_address(ctx, t0, t0);
 
 tcg_gen_qemu_ld_tl(t1, t0, ctx->mem_idx, memop);
 gen_set_gpr(a->rd, t1);
@@ -180,6 +181,7 @@ static bool gen_store(DisasContext *ctx, arg_sb *a, MemOp memop)
 TCGv dat = tcg_temp_new();
 gen_get_gpr(t0, a->rs1);
 tcg_gen_addi_tl(t0, t0, a->imm);
+gen_pm_adjust_address(ctx, t0, t0);
 gen_get_gpr(dat, a->rs2);
 
 tcg_gen_qemu_st_tl(dat, t0, ctx->mem_idx, memop);
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 2f9f5ccc62..2e815a5912 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -89,6 +89,16 @@ static void gen_nanbox_s(TCGv_i64 out, TCGv_i64 in)
 tcg_gen_ori_i64(out, in, MAKE_64BIT_MASK(32, 32));
 }
 
+/*
+ * Temp stub: generates address adjustment for PointerMasking
+ */
+static void gen_pm_adjust_address(DisasContext *s,
+  TCGv_i64  dst,
+  TCGv_i64  src)
+{
+tcg_gen_mov_i64(dst, src);
+}
+
 /*
  * A narrow n-bit operation, where n < FLEN, checks that input operands
  * are correctly Nan-boxed, i.e., all upper FLEN - n bits are 1.
-- 
2.20.1




[PATCH RESEND v8 2/6] [RISCV_PM] Support CSRs required for RISC-V PM extension except for the h-mode

2021-04-27 Thread Alexey Baturo
Signed-off-by: Alexey Baturo 
---
resend:
  minor codestyle fix

 target/riscv/cpu.c  |   5 +
 target/riscv/cpu.h  |  12 ++
 target/riscv/cpu_bits.h |  66 +++
 target/riscv/csr.c  | 240 
 4 files changed, 323 insertions(+)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 7d6ed80f6b..c04911ec05 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -473,6 +473,11 @@ static void riscv_cpu_realize(DeviceState *dev, Error **errp)
 if (cpu->cfg.ext_h) {
 target_misa |= RVH;
 }
+if (cpu->cfg.ext_j) {
+#ifndef CONFIG_USER_ONLY
+env->mmte |= PM_EXT_INITIAL;
+#endif
+}
 if (cpu->cfg.ext_v) {
 target_misa |= RVV;
 if (!is_power_of_2(cpu->cfg.vlen)) {
diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 0ea9fc65c8..19aa3b4769 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -238,6 +238,18 @@ struct CPURISCVState {
 
 /* True if in debugger mode.  */
 bool debugger;
+
+/*
+ * CSRs for PM
+ * TODO: move these csr to appropriate groups
+ */
+target_ulong mmte;
+target_ulong mpmmask;
+target_ulong mpmbase;
+target_ulong spmmask;
+target_ulong spmbase;
+target_ulong upmmask;
+target_ulong upmbase;
 #endif
 
 float_status fp_status;
diff --git a/target/riscv/cpu_bits.h b/target/riscv/cpu_bits.h
index caf4599207..f8e7cdb99b 100644
--- a/target/riscv/cpu_bits.h
+++ b/target/riscv/cpu_bits.h
@@ -354,6 +354,21 @@
 #define CSR_MHPMCOUNTER30H  0xb9e
 #define CSR_MHPMCOUNTER31H  0xb9f
 
+/* Custom user register */
+#define CSR_UMTE0x8c0
+#define CSR_UPMMASK 0x8c1
+#define CSR_UPMBASE 0x8c2
+
+/* Custom machine register */
+#define CSR_MMTE0x7c0
+#define CSR_MPMMASK 0x7c1
+#define CSR_MPMBASE 0x7c2
+
+/* Custom supervisor register */
+#define CSR_SMTE0x9c0
+#define CSR_SPMMASK 0x9c1
+#define CSR_SPMBASE 0x9c2
+
 /* Legacy Machine Protection and Translation (priv v1.9.1) */
 #define CSR_MBASE   0x380
 #define CSR_MBOUND  0x381
@@ -592,4 +607,55 @@
 #define MIE_UTIE   (1 << IRQ_U_TIMER)
 #define MIE_SSIE   (1 << IRQ_S_SOFT)
 #define MIE_USIE   (1 << IRQ_U_SOFT)
+
+/* general mte CSR bits*/
+#define PM_ENABLE   0x0001ULL
+#define PM_CURRENT  0x0002ULL
+#define PM_XS_MASK  0x0003ULL
+
+/* PM XS bits values */
+#define PM_EXT_DISABLE  0xULL
+#define PM_EXT_INITIAL  0x0001ULL
+#define PM_EXT_CLEAN0x0002ULL
+#define PM_EXT_DIRTY0x0003ULL
+
+/* offsets for every pair of control bits per each priv level */
+#define XS_OFFSET0ULL
+#define U_OFFSET 2ULL
+#define S_OFFSET 4ULL
+#define M_OFFSET 6ULL
+
+#define PM_XS_BITS   (PM_XS_MASK << XS_OFFSET)
+#define U_PM_ENABLE  (PM_ENABLE  << U_OFFSET)
+#define U_PM_CURRENT (PM_CURRENT << U_OFFSET)
+#define S_PM_ENABLE  (PM_ENABLE  << S_OFFSET)
+#define S_PM_CURRENT (PM_CURRENT << S_OFFSET)
+#define M_PM_ENABLE  (PM_ENABLE  << M_OFFSET)
+
+/* mmte CSR bits */
+#define MMTE_PM_XS_BITS PM_XS_BITS
+#define MMTE_U_PM_ENABLEU_PM_ENABLE
+#define MMTE_U_PM_CURRENT   U_PM_CURRENT
+#define MMTE_S_PM_ENABLES_PM_ENABLE
+#define MMTE_S_PM_CURRENT   S_PM_CURRENT
+#define MMTE_M_PM_ENABLEM_PM_ENABLE
+#define MMTE_MASK   (MMTE_U_PM_ENABLE | MMTE_U_PM_CURRENT | \
+ MMTE_S_PM_ENABLE | MMTE_S_PM_CURRENT | \
+ MMTE_M_PM_ENABLE | MMTE_PM_XS_BITS)
+
+/* smte CSR bits */
+#define SMTE_PM_XS_BITS PM_XS_BITS
+#define SMTE_U_PM_ENABLEU_PM_ENABLE
+#define SMTE_U_PM_CURRENT   U_PM_CURRENT
+#define SMTE_S_PM_ENABLES_PM_ENABLE
+#define SMTE_S_PM_CURRENT   S_PM_CURRENT
+#define SMTE_MASK   (SMTE_U_PM_ENABLE | SMTE_U_PM_CURRENT | \
+ SMTE_S_PM_ENABLE | SMTE_S_PM_CURRENT | \
+ SMTE_PM_XS_BITS)
+
+/* umte CSR bits */
+#define UMTE_U_PM_ENABLEU_PM_ENABLE
+#define UMTE_U_PM_CURRENT   U_PM_CURRENT
+#define UMTE_MASK   (UMTE_U_PM_ENABLE | UMTE_U_PM_CURRENT)
+
 #endif
diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index d2585395bf..bef65c5ae1 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -184,6 +184,39 @@ static int hmode32(CPURISCVState *env, int csrno)
 
 }
 
+static int umode(CPURISCVState *env, int csrno)
+{
+return -!riscv_has_ext(env, RVU);
+}
+
+/* Checks if PointerMasking registers could be accessed */
+static int pointer_masking(CPURISCVState *env, int csrno)
+{
+/* Check if j-ext is present */
+int j_check = -!riscv_has_ext(env, RVJ);
+int mode_check = 0;
+int csr_priv = get_field(csrno, 0x300);
+/* check if particular mode is present */
+switch (csr_priv) {
+case PRV_M:
+mode_check = any(env, csrno);
+break;
+case PRV_S:
+

[PATCH RESEND v8 0/6] RISC-V Pointer Masking implementation

2021-04-27 Thread Alexey Baturo
v8-resend:
Resending to trigger recheck due to minor codestyle issues.

v8:
Hi folks,

Finally, we were able to assign v0.1 draft status to the Pointer Masking extension for
RISC-V:
https://github.com/riscv/riscv-j-extension/blob/master/pointer-masking-proposal.adoc
This is meant to be the first series of patches with initial support for PM.
It still lacks support for hypervisor mode, vector loads/stores and some
other features, and it uses temporary CSR numbers (they are to be assigned by the
committee a bit later).
With this patch series we were able to run a bunch of tests with HWASAN checks
enabled.

I hope I've managed to address @Alistair's previous comments in this version.

Thanks!

v7:
Hi folks,

Sorry it took me almost 3 months to provide the reply and fixes: it was a really
busy EOY.
This series incorporates @Alistair's suggestion on enabling J-ext.

As for @Richard's comments:
- Indeed, I missed appending Reviewed-by to the approved commits. I've now
restored them except for the fourth commit. @Richard, could you please tell me if
you think it's still OK to commit it as is, or should I support masking mem ops
for RVV first?
- These patches don't have any support for load/store masking for the RVV and RVH
extensions, so in particular no support for the special Hypervisor load/store.

If this patch series is accepted, I think my next steps would be
to:
- Support PM for memory operations for RVV
- Add proper CSRs and support PM for memory operations for Hypervisor mode
- Support address wrapping on unaligned accesses, as @Richard mentioned
previously

Thanks!

Alexey Baturo (5):
  [RISCV_PM] Add J-extension into RISC-V
  [RISCV_PM] Support CSRs required for RISC-V PM extension except for
the h-mode
  [RISCV_PM] Print new PM CSRs in QEMU logs
  [RISCV_PM] Support pointer masking for RISC-V for i/c/f/d/a types of
instructions
  [RISCV_PM] Allow experimental J-ext to be turned on

Anatoly Parshintsev (1):
  [RISCV_PM] Implement address masking functions required for RISC-V
Pointer Masking extension

 target/riscv/cpu.c  |  32 
 target/riscv/cpu.h  |  34 
 target/riscv/cpu_bits.h |  66 +++
 target/riscv/csr.c  | 236 
 target/riscv/insn_trans/trans_rva.c.inc |   3 +
 target/riscv/insn_trans/trans_rvd.c.inc |   2 +
 target/riscv/insn_trans/trans_rvf.c.inc |   2 +
 target/riscv/insn_trans/trans_rvi.c.inc |   2 +
 target/riscv/translate.c|  42 +
 9 files changed, 419 insertions(+)

-- 
2.20.1




[PATCH RESEND v8 2/6] [RISCV_PM] Support CSRs required for RISC-V PM extension except for the h-mode

2021-04-27 Thread Alexey Baturo
Signed-off-by: Alexey Baturo 
---
resend:
  minor codestyle fix

 target/riscv/cpu.c  |   5 +
 target/riscv/cpu.h  |  12 ++
 target/riscv/cpu_bits.h |  66 +++
 target/riscv/csr.c  | 240 
 4 files changed, 323 insertions(+)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 7d6ed80f6b..c04911ec05 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -473,6 +473,11 @@ static void riscv_cpu_realize(DeviceState *dev, Error 
**errp)
 if (cpu->cfg.ext_h) {
 target_misa |= RVH;
 }
+if (cpu->cfg.ext_j) {
+#ifndef CONFIG_USER_ONLY
+env->mmte |= PM_EXT_INITIAL;
+#endif
+}
 if (cpu->cfg.ext_v) {
 target_misa |= RVV;
 if (!is_power_of_2(cpu->cfg.vlen)) {
diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 0ea9fc65c8..19aa3b4769 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -238,6 +238,18 @@ struct CPURISCVState {
 
 /* True if in debugger mode.  */
 bool debugger;
+
+/*
+ * CSRs for PM
+ * TODO: move these csr to appropriate groups
+ */
+target_ulong mmte;
+target_ulong mpmmask;
+target_ulong mpmbase;
+target_ulong spmmask;
+target_ulong spmbase;
+target_ulong upmmask;
+target_ulong upmbase;
 #endif
 
 float_status fp_status;
diff --git a/target/riscv/cpu_bits.h b/target/riscv/cpu_bits.h
index caf4599207..f8e7cdb99b 100644
--- a/target/riscv/cpu_bits.h
+++ b/target/riscv/cpu_bits.h
@@ -354,6 +354,21 @@
 #define CSR_MHPMCOUNTER30H  0xb9e
 #define CSR_MHPMCOUNTER31H  0xb9f
 
+/* Custom user register */
+#define CSR_UMTE            0x8c0
+#define CSR_UPMMASK         0x8c1
+#define CSR_UPMBASE         0x8c2
+
+/* Custom machine register */
+#define CSR_MMTE            0x7c0
+#define CSR_MPMMASK         0x7c1
+#define CSR_MPMBASE         0x7c2
+
+/* Custom supervisor register */
+#define CSR_SMTE            0x9c0
+#define CSR_SPMMASK         0x9c1
+#define CSR_SPMBASE         0x9c2
+
 /* Legacy Machine Protection and Translation (priv v1.9.1) */
 #define CSR_MBASE   0x380
 #define CSR_MBOUND  0x381
@@ -592,4 +607,55 @@
 #define MIE_UTIE   (1 << IRQ_U_TIMER)
 #define MIE_SSIE   (1 << IRQ_S_SOFT)
 #define MIE_USIE   (1 << IRQ_U_SOFT)
+
+/* general mte CSR bits*/
+#define PM_ENABLE   0x0001ULL
+#define PM_CURRENT  0x0002ULL
+#define PM_XS_MASK  0x0003ULL
+
+/* PM XS bits values */
+#define PM_EXT_DISABLE  0x0000ULL
+#define PM_EXT_INITIAL  0x0001ULL
+#define PM_EXT_CLEAN    0x0002ULL
+#define PM_EXT_DIRTY    0x0003ULL
+
+/* offsets for every pair of control bits per each priv level */
+#define XS_OFFSET    0ULL
+#define U_OFFSET     2ULL
+#define S_OFFSET     4ULL
+#define M_OFFSET     6ULL
+
+#define PM_XS_BITS   (PM_XS_MASK << XS_OFFSET)
+#define U_PM_ENABLE  (PM_ENABLE  << U_OFFSET)
+#define U_PM_CURRENT (PM_CURRENT << U_OFFSET)
+#define S_PM_ENABLE  (PM_ENABLE  << S_OFFSET)
+#define S_PM_CURRENT (PM_CURRENT << S_OFFSET)
+#define M_PM_ENABLE  (PM_ENABLE  << M_OFFSET)
+
+/* mmte CSR bits */
+#define MMTE_PM_XS_BITS     PM_XS_BITS
+#define MMTE_U_PM_ENABLE    U_PM_ENABLE
+#define MMTE_U_PM_CURRENT   U_PM_CURRENT
+#define MMTE_S_PM_ENABLE    S_PM_ENABLE
+#define MMTE_S_PM_CURRENT   S_PM_CURRENT
+#define MMTE_M_PM_ENABLE    M_PM_ENABLE
+#define MMTE_MASK   (MMTE_U_PM_ENABLE | MMTE_U_PM_CURRENT | \
+                     MMTE_S_PM_ENABLE | MMTE_S_PM_CURRENT | \
+                     MMTE_M_PM_ENABLE | MMTE_PM_XS_BITS)
+
+/* smte CSR bits */
+#define SMTE_PM_XS_BITS     PM_XS_BITS
+#define SMTE_U_PM_ENABLE    U_PM_ENABLE
+#define SMTE_U_PM_CURRENT   U_PM_CURRENT
+#define SMTE_S_PM_ENABLE    S_PM_ENABLE
+#define SMTE_S_PM_CURRENT   S_PM_CURRENT
+#define SMTE_MASK   (SMTE_U_PM_ENABLE | SMTE_U_PM_CURRENT | \
+                     SMTE_S_PM_ENABLE | SMTE_S_PM_CURRENT | \
+                     SMTE_PM_XS_BITS)
+
+/* umte CSR bits */
+#define UMTE_U_PM_ENABLE    U_PM_ENABLE
+#define UMTE_U_PM_CURRENT   U_PM_CURRENT
+#define UMTE_MASK   (UMTE_U_PM_ENABLE | MMTE_U_PM_CURRENT)
+
 #endif
diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index d2585395bf..bef65c5ae1 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -184,6 +184,39 @@ static int hmode32(CPURISCVState *env, int csrno)
 
 }
 
+static int umode(CPURISCVState *env, int csrno)
+{
+return -!riscv_has_ext(env, RVU);
+}
+
+/* Checks if PointerMasking registers could be accessed */
+static int pointer_masking(CPURISCVState *env, int csrno)
+{
+/* Check if j-ext is present */
+int j_check = -!riscv_has_ext(env, RVJ);
+int mode_check = 0;
+int csr_priv = get_field(csrno, 0x300);
+/* check if particular mode is present */
+switch (csr_priv) {
+case PRV_M:
+mode_check = any(env, csrno);
+break;
+case PRV_S:
+

[PATCH 2/3] linux-user/s390x: Clean up signal.c

2021-04-27 Thread Richard Henderson
The "save" routines from the kernel, which are currently
commented out, are unnecessary in qemu.  We can copy from
env where the kernel needs special instructions.

Drop the return value from restore_sigregs, as it cannot fail.
Use __get_user return as input to trace, so that we properly
bswap the value for the host.

Reorder the function bodies to correspond to the kernel source.
Fix the use of host addresses where guest addresses are needed.
Drop the use of PSW_ADDR_AMODE, since we only support 64-bit.
Set psw.mask properly for the signal handler.
Use tswap_sigset in setup_rt_frame.

Signed-off-by: Richard Henderson 
---
 linux-user/s390x/signal.c | 184 ++
 1 file changed, 89 insertions(+), 95 deletions(-)

diff --git a/linux-user/s390x/signal.c b/linux-user/s390x/signal.c
index e5bc4f0358..fb7065f243 100644
--- a/linux-user/s390x/signal.c
+++ b/linux-user/s390x/signal.c
@@ -32,7 +32,6 @@
 #define _SIGCONTEXT_NSIG_BPW    64 /* FIXME: 31-bit mode -> 32 */
 #define _SIGCONTEXT_NSIG_WORDS  (_SIGCONTEXT_NSIG / _SIGCONTEXT_NSIG_BPW)
 #define _SIGMASK_COPY_SIZE    (sizeof(unsigned long)*_SIGCONTEXT_NSIG_WORDS)
-#define PSW_ADDR_AMODE0xUL /* 0x8000UL for 
31-bit */
 #define S390_SYSCALL_OPCODE ((uint16_t)0x0a00)
 
 typedef struct {
@@ -106,23 +105,25 @@ get_sigframe(struct target_sigaction *ka, CPUS390XState 
*env, size_t frame_size)
 static void save_sigregs(CPUS390XState *env, target_sigregs *sregs)
 {
 int i;
-//save_access_regs(current->thread.acrs); FIXME
 
-/* Copy a 'clean' PSW mask to the user to avoid leaking
-   information about whether PER is currently on.  */
+/*
+ * Copy a 'clean' PSW mask to the user to avoid leaking
+ * information about whether PER is currently on.
+ */
 __put_user(env->psw.mask, &sregs->regs.psw.mask);
 __put_user(env->psw.addr, &sregs->regs.psw.addr);
+
 for (i = 0; i < 16; i++) {
 __put_user(env->regs[i], &sregs->regs.gprs[i]);
 }
 for (i = 0; i < 16; i++) {
 __put_user(env->aregs[i], &sregs->regs.acrs[i]);
 }
+
 /*
  * We have to store the fp registers to current->thread.fp_regs
  * to merge them with the emulated registers.
  */
-//save_fp_regs(&current->thread.fp_regs); FIXME
 for (i = 0; i < 16; i++) {
 __put_user(*get_freg(env, i), &sregs->fpregs.fprs[i]);
 }
@@ -137,120 +138,124 @@ void setup_frame(int sig, struct target_sigaction *ka,
 frame_addr = get_sigframe(ka, env, sizeof(*frame));
 trace_user_setup_frame(env, frame_addr);
 if (!lock_user_struct(VERIFY_WRITE, frame, frame_addr, 0)) {
-goto give_sigsegv;
-}
-
-__put_user(set->sig[0], &frame->sc.oldmask[0]);
-
-save_sigregs(env, &frame->sregs);
-
-__put_user((abi_ulong)(unsigned long)&frame->sregs,
-   (abi_ulong *)&frame->sc.sregs);
-
-/* Set up to return from userspace.  If provided, use a stub
-   already in userspace.  */
-if (ka->sa_flags & TARGET_SA_RESTORER) {
-env->regs[14] = (unsigned long)
-ka->sa_restorer | PSW_ADDR_AMODE;
-} else {
-env->regs[14] = (frame_addr + offsetof(sigframe, retcode))
-| PSW_ADDR_AMODE;
-__put_user(S390_SYSCALL_OPCODE | TARGET_NR_sigreturn,
-   (uint16_t *)(frame->retcode));
+force_sigsegv(sig);
+return;
 }
 
 /* Set up backchain. */
 __put_user(env->regs[15], (abi_ulong *) frame);
 
+/* Create struct sigcontext on the signal stack. */
+for (int i = 0; i < ARRAY_SIZE(frame->sc.oldmask); ++i) {
+__put_user(set->sig[i], &frame->sc.oldmask[i]);
+}
+
+save_sigregs(env, &frame->sregs);
+__put_user(frame_addr + offsetof(sigframe, sregs), &frame->sc.sregs);
+
+/* Place signal number on stack to allow backtrace from handler.  */
+__put_user(sig, &frame->signo);
+
+/*
+ * Set up to return from userspace.
+ * If provided, use a stub already in userspace.
+ */
+if (ka->sa_flags & TARGET_SA_RESTORER) {
+env->regs[14] = ka->sa_restorer;
+} else {
+env->regs[14] = frame_addr + offsetof(sigframe, retcode);
+__put_user(S390_SYSCALL_OPCODE | TARGET_NR_sigreturn,
+   (uint16_t *)(frame->retcode));
+}
+
 /* Set up registers for signal handler */
 env->regs[15] = frame_addr;
-env->psw.addr = (target_ulong) ka->_sa_handler | PSW_ADDR_AMODE;
+
+/* Force default amode and default user address space control. */
+env->psw.mask = PSW_MASK_64 | PSW_MASK_32 | PSW_ASC_PRIMARY
+  | (env->psw.mask & ~PSW_MASK_ASC);
+env->psw.addr = ka->_sa_handler;
 
 env->regs[2] = sig; //map_signal(sig);
 env->regs[3] = frame_addr += offsetof(typeof(*frame), sc);
 
-/* We forgot to include these in the sigcontext.
-   To avoid breaking binary compatibility, they are passed as args. */
-env->regs[4] = 0; // FIXME: no clue... curre

[PATCH 3/3] linux-user/s390x: Handle vector regs in signal stack

2021-04-27 Thread Richard Henderson
Signed-off-by: Richard Henderson 
---
 linux-user/s390x/signal.c | 62 +--
 1 file changed, 60 insertions(+), 2 deletions(-)

diff --git a/linux-user/s390x/signal.c b/linux-user/s390x/signal.c
index fb7065f243..57752a2a96 100644
--- a/linux-user/s390x/signal.c
+++ b/linux-user/s390x/signal.c
@@ -51,6 +51,12 @@ typedef struct {
 target_s390_fp_regs fpregs;
 } target_sigregs;
 
+typedef struct {
+uint64_t vxrs_low[16];
+uint64_t vxrs_high[16][2];
+uint8_t reserved[128];
+} target_sigregs_ext;
+
 typedef struct {
 abi_ulong oldmask[_SIGCONTEXT_NSIG_WORDS];
 abi_ulong sregs;
@@ -61,15 +67,20 @@ typedef struct {
 target_sigcontext sc;
 target_sigregs sregs;
 int signo;
+target_sigregs_ext sregs_ext;
 uint8_t retcode[S390_SYSCALL_SIZE];
 } sigframe;
 
+#define TARGET_UC_VXRS 2
+
 struct target_ucontext {
 abi_ulong tuc_flags;
 abi_ulong tuc_link;
 target_stack_t tuc_stack;
 target_sigregs tuc_mcontext;
-target_sigset_t tuc_sigmask;   /* mask last for extensibility */
+target_sigset_t tuc_sigmask;
+uint8_t reserved[128 - sizeof(target_sigset_t)];
+target_sigregs_ext tuc_mcontext_ext;
 };
 
 typedef struct {
@@ -129,6 +140,24 @@ static void save_sigregs(CPUS390XState *env, 
target_sigregs *sregs)
 }
 }
 
+static void save_sigregs_ext(CPUS390XState *env, target_sigregs_ext *ext)
+{
+int i;
+
+/*
+ * if (MACHINE_HAS_VX) ...
+ * That said, we always allocate the stack storage and the
+ * space is always available in env.
+ */
+for (i = 0; i < 16; ++i) {
+   __put_user(env->vregs[i][1], &ext->vxrs_low[i]);
+}
+for (i = 0; i < 16; ++i) {
+   __put_user(env->vregs[i + 16][0], &ext->vxrs_high[i][0]);
+   __put_user(env->vregs[i + 16][1], &ext->vxrs_high[i][1]);
+}
+}
+
 void setup_frame(int sig, struct target_sigaction *ka,
  target_sigset_t *set, CPUS390XState *env)
 {
@@ -156,6 +185,9 @@ void setup_frame(int sig, struct target_sigaction *ka,
 /* Place signal number on stack to allow backtrace from handler.  */
 __put_user(sig, &frame->signo);
 
+/* Create sigregs_ext on the signal stack. */
+save_sigregs_ext(env, &frame->sregs_ext);
+
 /*
  * Set up to return from userspace.
  * If provided, use a stub already in userspace.
@@ -196,6 +228,7 @@ void setup_rt_frame(int sig, struct target_sigaction *ka,
 {
 rt_sigframe *frame;
 abi_ulong frame_addr;
+abi_ulong uc_flags;
 
 frame_addr = get_sigframe(ka, env, sizeof *frame);
 trace_user_setup_rt_frame(env, frame_addr);
@@ -223,10 +256,15 @@ void setup_rt_frame(int sig, struct target_sigaction *ka,
 tswap_siginfo(&frame->info, info);
 
 /* Create the ucontext.  */
-__put_user(0, &frame->uc.tuc_flags);
+uc_flags = 0;
+if (s390_has_feat(S390_FEAT_VECTOR)) {
+uc_flags |= TARGET_UC_VXRS;
+}
+__put_user(uc_flags, &frame->uc.tuc_flags);
 __put_user(0, &frame->uc.tuc_link);
 target_save_altstack(&frame->uc.tuc_stack, env);
 save_sigregs(env, &frame->uc.tuc_mcontext);
+save_sigregs_ext(env, &frame->uc.tuc_mcontext_ext);
 tswap_sigset(&frame->uc.tuc_sigmask, set);
 
 /* Set up registers for signal handler */
@@ -265,6 +303,24 @@ static void restore_sigregs(CPUS390XState *env, 
target_sigregs *sc)
 }
 }
 
+static void restore_sigregs_ext(CPUS390XState *env, target_sigregs_ext *ext)
+{
+int i;
+
+/*
+ * if (MACHINE_HAS_VX) ...
+ * That said, we always allocate the stack storage and the
+ * space is always available in env.
+ */
+for (i = 0; i < 16; ++i) {
+   __get_user(env->vregs[i][1], &ext->vxrs_low[i]);
+}
+for (i = 0; i < 16; ++i) {
+   __get_user(env->vregs[i + 16][0], &ext->vxrs_high[i][0]);
+   __get_user(env->vregs[i + 16][1], &ext->vxrs_high[i][1]);
+}
+}
+
 long do_sigreturn(CPUS390XState *env)
 {
 sigframe *frame;
@@ -286,6 +342,7 @@ long do_sigreturn(CPUS390XState *env)
 set_sigmask(&set); /* ~_BLOCKABLE? */
 
 restore_sigregs(env, &frame->sregs);
+restore_sigregs_ext(env, &frame->sregs_ext);
 
 unlock_user_struct(frame, frame_addr, 0);
 return -TARGET_QEMU_ESIGRETURN;
@@ -308,6 +365,7 @@ long do_rt_sigreturn(CPUS390XState *env)
 
 target_restore_altstack(&frame->uc.tuc_stack, env);
 restore_sigregs(env, &frame->uc.tuc_mcontext);
+restore_sigregs_ext(env, &frame->uc.tuc_mcontext_ext);
 
 unlock_user_struct(frame, frame_addr, 0);
 return -TARGET_QEMU_ESIGRETURN;
-- 
2.25.1




[PATCH 1/3] linux-user/s390x: Fix sigframe types

2021-04-27 Thread Richard Henderson
Noticed via gitlab clang-user job:

  TEST    signals on s390x
../linux-user/s390x/signal.c:258:9: runtime error: \
  1.84467e+19 is outside the range of representable values of \
  type 'unsigned long'

Which points to the fact that we were performing a double-to-uint64_t
conversion while storing the fp registers, instead of just copying
the data across.
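
The difference between the two behaviours can be sketched in isolation. This is not code from signal.c; the helper names are hypothetical and only illustrate why a field typed as double turns a register copy into a numeric conversion, while a field typed as uint64_t (or a memcpy) keeps the bit pattern.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: what happens when a raw 64-bit register image is
 * assigned to a 'double' field and later read back as an integer.
 * The value is converted numerically, so low bits may be rounded away
 * (and the reverse conversion is undefined for values >= 2^64).
 */
static inline uint64_t roundtrip_by_value(uint64_t raw)
{
    double d = (double)raw;
    return (uint64_t)d;
}

/*
 * Hypothetical helper: copy the register image byte for byte, even
 * through storage that happens to be typed as 'double'.
 */
static inline uint64_t roundtrip_by_bits(uint64_t raw)
{
    double d;
    uint64_t out;

    memcpy(&d, &raw, sizeof(d));
    memcpy(&out, &d, sizeof(out));
    return out;
}
```

For example, the image 0x3FF0000000000001 (the double just above 1.0) needs 62 significant bits as an integer, so the by-value path cannot preserve it, whereas the by-bits path returns it unchanged.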

Turns out there are several errors:

target_ulong is the size of the target register, whereas abi_ulong
is the target 'unsigned long' type.  Not a big deal here, since we
only support 64-bit s390x, but not correct either.

In target_sigcontext and target ucontext, we used a host pointer
instead of a target pointer, aka abi_ulong.

Signed-off-by: Richard Henderson 
---
 linux-user/s390x/signal.c | 23 ---
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/linux-user/s390x/signal.c b/linux-user/s390x/signal.c
index b68b44ae7e..e5bc4f0358 100644
--- a/linux-user/s390x/signal.c
+++ b/linux-user/s390x/signal.c
@@ -37,13 +37,14 @@
 
 typedef struct {
 target_psw_t psw;
-target_ulong gprs[__NUM_GPRS];
-unsigned int acrs[__NUM_ACRS];
+abi_ulong gprs[__NUM_GPRS];
+abi_uint acrs[__NUM_ACRS];
 } target_s390_regs_common;
 
 typedef struct {
-unsigned int fpc;
-double   fprs[__NUM_FPRS];
+uint32_t fpc;
+uint32_t pad;
+uint64_t fprs[__NUM_FPRS];
 } target_s390_fp_regs;
 
 typedef struct {
@@ -51,22 +52,22 @@ typedef struct {
 target_s390_fp_regs fpregs;
 } target_sigregs;
 
-struct target_sigcontext {
-target_ulong   oldmask[_SIGCONTEXT_NSIG_WORDS];
-target_sigregs *sregs;
-};
+typedef struct {
+abi_ulong oldmask[_SIGCONTEXT_NSIG_WORDS];
+abi_ulong sregs;
+} target_sigcontext;
 
 typedef struct {
 uint8_t callee_used_stack[__SIGNAL_FRAMESIZE];
-struct target_sigcontext sc;
+target_sigcontext sc;
 target_sigregs sregs;
 int signo;
 uint8_t retcode[S390_SYSCALL_SIZE];
 } sigframe;
 
 struct target_ucontext {
-target_ulong tuc_flags;
-struct target_ucontext *tuc_link;
+abi_ulong tuc_flags;
+abi_ulong tuc_link;
 target_stack_t tuc_stack;
 target_sigregs tuc_mcontext;
 target_sigset_t tuc_sigmask;   /* mask last for extensibility */
-- 
2.25.1




[PATCH 0/3] linux-user/s390x: some signal fixes

2021-04-27 Thread Richard Henderson
The first patch fixes a clang sanitize=undefined abort.

The second patch probably does too much and should be split,
but I'm lazy tonight.  I'll wait for comment before changes.

The third patch is new functionality, which should have gone
in with the s390x vector support.


r~


Richard Henderson (3):
  linux-user/s390x: Fix sigframe types
  linux-user/s390x: Clean up signal.c
  linux-user/s390x: Handle vector regs in signal stack

 linux-user/s390x/signal.c | 265 +++---
 1 file changed, 159 insertions(+), 106 deletions(-)

-- 
2.25.1




RE: [PATCH V6 0/6] Passthrough specific network traffic in COLO

2021-04-27 Thread Zhang, Chen
Ping. Please give me comments on this series.

Thanks
Chen

> -Original Message-
> From: Zhang, Chen 
> Sent: Tuesday, April 20, 2021 11:16 PM
> To: Jason Wang ; qemu-dev  de...@nongnu.org>; Eric Blake ; Dr. David Alan
> Gilbert ; Markus Armbruster ;
> Daniel P. Berrangé ; Gerd Hoffmann
> ; Li Zhijian 
> Cc: Zhang Chen ; Zhang, Chen
> ; Lukas Straub 
> Subject: [PATCH V6 0/6] Passthrough specific network traffic in COLO
> 
> Some real user scenarios don't need to monitor all traffic, and the
> qemu net-filter also needs a way to do more detailed flow control.
> This series gives the user the ability to pass through selected kinds of
> COLO network streams.
> 
> For example, a Windows guest user wants to enable Windows remote desktop
> to reach the guest (UDP/TCP 3389). This case mixes UDP and TCP, and the TCP
> payload always differs because of real desktop display data (for guest
> time/mouse display).
> 
> Another case is a real user application that actively transmits information
> including the guest time: the primary guest sends data with time 10:01.000
> while the secondary guest sends data with time 10:01.001, so it will always
> trigger a COLO checkpoint (live migration) and degrade guest performance.
> 
>   V6:
> - Change QAPI IPFlowSpec protocol from enum to str.
> - Use getprotobyname to handle the protocols.
> - Optimize code in net.
> 
>   V5:
> - Squash original 1-3 QAPI patches together.
> - Rename some data structures to avoid misunderstanding.
> - Reuse InetSocketAddressBase in IPFlowSpec.
> - Add new function in util/qemu-sockets.c to parse
>   InetSocketAddressBase.
> - Update HMP command define to reuse current code.
> - Add more comments.
> 
>   V4:
> - Fix QAPI code conflict for V6.0 merged patches.
> - Note this feature for V6.1.
> 
>   V3:
> - Add COLO passthrough list lock.
> - Add usage demo and more comments.
> 
>   V2:
> - Add the n-tuple support.
> - Add some qapi definitions.
> - Support multi colo-compare objects.
> - Support setup each rules for each objects individually.
> - Clean up COLO compare definition to .h file.
> - Rebase HMP command for stable tree.
> - Add redundant rules check.
> 
> 
> Zhang Chen (6):
>   qapi/net: Add IPFlowSpec and QMP command for COLO passthrough
>   util/qemu-sockets.c: Add inet_parse_base to handle
> InetSocketAddressBase
>   hmp-commands: Add new HMP command for COLO passthrough
>   net/colo-compare: Move data structure and define to .h file.
>   net/colo-compare: Add passthrough list to CompareState
>   net/net.c: Add handler for COLO passthrough connection
> 
>  hmp-commands.hx|  26 +++
>  include/monitor/hmp.h  |   2 +
>  include/qemu/sockets.h |   1 +
>  monitor/hmp-cmds.c |  82 
>  net/colo-compare.c | 162 +++-
>  net/colo-compare.h | 118 +
>  net/net.c  | 166 +
>  qapi/net.json  |  68 +
>  util/qemu-sockets.c|  14 
>  9 files changed, 519 insertions(+), 120 deletions(-)
> 
> --
> 2.25.1




Re: [PATCH] Set the correct env->fpip for x86 float instructions [cleaned]

2021-04-27 Thread Ziqiao Kong
Thanks for your review! I did a full re-read of the Intel Manual about
x87 programming just now and would send another patch to handle
FCS:FIP and FDS:FDP.

Ziqiao

On Wed, Apr 28, 2021 at 1:49 AM Richard Henderson
 wrote:
>
> On 4/16/21 8:34 AM, Ziqiao Kong wrote:
> > +++ b/target/i386/tcg/translate.c
> > @@ -6337,7 +6337,10 @@ static target_ulong disas_insn(DisasContext *s, 
> > CPUState *cpu)
> >   goto unknown_op;
> >   }
> >   }
> > +tcg_gen_movi_tl(s->tmp0, pc_start - s->cs_base);
> > +tcg_gen_st_tl(s->tmp0, cpu_env, offsetof(CPUX86State, fpip));
>
> This placement is wrong because it catches instructions that should not modify
> FIP, like FINIT.
>
> It might be best to set a flag around this case like
>
>bool update_fip;
>
>case 0xd8 .. 0xdf:
>  ...
>  update_fip = true;
>  if (mod != 3) {
>  ...
>  } else {
>  ...
>  }
>  if (update_fip) {
>  ...
>  }
>  break;
>
> and set update_fip to false for the set of insns that either do not update FIP
> or clear it (8.1.8 x87 fpu instruction and data (operand) pointers).
>
> I notice you're not saving FCS to go along with this, at least for
> CPUID.(EAX=07H,ECX=0H):EBX[bit 13] = 0.
>
> And if you're going to this trouble, you might want to think about FDP+FDS as
> well.  It should be about the same amount of effort.
>
>
> r~
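
The flag approach suggested above could look roughly like the sketch below. It is a stand-alone model rather than translate.c code: FpuModel and x87_translate_one are hypothetical names, and only two control instructions (FNCLEX = DB E2, FNINIT = DB E3) are excluded here, so the real exclusion set would have to be taken from section 8.1.8 of the manual.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the part of CPUX86State touched here. */
typedef struct {
    uint64_t fpip;
} FpuModel;

/*
 * Hypothetical model of the 0xd8..0xdf case: assume FIP is updated,
 * then clear the flag for instructions that must not set it.
 * The exclusion list below is deliberately incomplete.
 */
static inline void x87_translate_one(FpuModel *fpu, uint8_t opcode,
                                     uint8_t modrm, uint64_t pc_start,
                                     uint64_t cs_base)
{
    bool update_fip = true;

    if (opcode < 0xd8 || opcode > 0xdf) {
        return; /* not an x87 escape opcode */
    }
    if (opcode == 0xdb && (modrm == 0xe2 || modrm == 0xe3)) {
        update_fip = false; /* FNCLEX / FNINIT are control instructions */
    }
    if (update_fip) {
        fpu->fpip = pc_start - cs_base;
    }
}
```

Keeping the store in one place after the decode makes it harder to miss an instruction than sprinkling stores through every branch, at the cost of maintaining the exclusion list.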



Re: [PATCH 1/5] hw/ppc/spapr_iommu: Register machine reset handler

2021-04-27 Thread David Gibson
On Tue, Apr 27, 2021 at 11:20:07AM +0200, Philippe Mathieu-Daudé wrote:
> On 4/27/21 3:45 AM, David Gibson wrote:
> > On Sat, Apr 24, 2021 at 06:22:25PM +0200, Philippe Mathieu-Daudé wrote:
> >> The TYPE_SPAPR_TCE_TABLE device is bus-less, thus isn't reset
> >> automatically.  Register a reset handler to get reset with the
> >> machine.
> >>
> >> It doesn't seem to be an issue because it is that way since the
> >> device QDev'ifycation 8 years ago, in commit a83000f5e3f
> >> ("spapr-tce: make sPAPRTCETable a proper device").
> >> Still, correct to have a proper API usage.
> > 
> > So, the reason this works now is that we explicitly call
> > device_reset() on the TCE table from the TCE tables "owner", either a
> > PHB (spapr_phb_reset()) or a VIO device (spapr_vio_quiesce_one()).
> > 
> > I think we want either that, or the register_reset(), not both.
> 
> rtas_quiesce() seems to call a DeviceClass::reset() on the
> children of TYPE_SPAPR_VIO_BUS:
> 
> Abstract TYPE_VIO_SPAPR_DEVICE has the TYPE_SPAPR_VIO_BUS bus_type,
> and registers the spapr_vio_busdev_reset() handler, which calls
> spapr_vio_quiesce_one()...
> 
> So either we already have 2 resets, or the bus is never reset?

There are 2 resets, and this is intentional.  We reset once at machine
reset time, via the bus.  Once a booting OS is done with the firmware
it calls "quiesce" to put all the devices back into a safe state.  The
easiest way to do that is just to invoke their reset callbacks, so
that's what we do.
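
As a rough illustration of those two paths sharing one callback, here is a toy model. All names in it (TceTableModel, tce_table_reset, machine_reset_path, quiesce_path) are made up for the sketch and do not correspond to the real spapr functions.

```c
#include <stdint.h>

/* Hypothetical device model: some state plus a counter of resets seen. */
typedef struct {
    uint32_t entries_in_use;
    unsigned reset_count;
} TceTableModel;

/* The single reset callback that both paths end up invoking. */
static inline void tce_table_reset(TceTableModel *t)
{
    t->entries_in_use = 0;
    t->reset_count++;
}

/* Path 1: machine reset walks the bus and resets each device on it. */
static inline void machine_reset_path(TceTableModel *t)
{
    tce_table_reset(t);
}

/* Path 2: the guest OS calls "quiesce" once it is done with firmware. */
static inline void quiesce_path(TceTableModel *t)
{
    tce_table_reset(t);
}
```

Since the callback only puts the device back into its initial state, running it a second time from the quiesce path is harmless.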

> The bus is created in spapr_machine_init():
> 
> /* Set up VIO bus */
> spapr->vio_bus = spapr_vio_bus_init();
> 
> TYPE_SPAPR_MACHINE class registers spapr_machine_reset(), which
> manually calls qemu_devices_reset() and spapr_drc_reset_all(),
> but I can't understand if a callee resets vio_bus...
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson




[PATCH RESEND v2] i386/cpu: Remove the deprecated cpu model 'Icelake-Client'

2021-04-27 Thread Robert Hoo
As it's been marked deprecated since v5.2, I think it's time to remove it
from the code.

Signed-off-by: Robert Hoo 
---
(Sorry, forgot to append changelog in last send.)
Changelog:
v2:
Update removed-features.rst.
Since its deprecation was not previously recorded in
docs/system/deprecated.rst, there is nothing to update there.
---
 docs/system/removed-features.rst |   5 ++
 target/i386/cpu.c| 118 ---
 2 files changed, 5 insertions(+), 118 deletions(-)

diff --git a/docs/system/removed-features.rst b/docs/system/removed-features.rst
index 29e9060..f1b5a16 100644
--- a/docs/system/removed-features.rst
+++ b/docs/system/removed-features.rst
@@ -285,6 +285,11 @@ The RISC-V no MMU cpus have been removed. The two CPUs: 
``rv32imacu-nommu`` and
 ``rv64imacu-nommu`` can no longer be used. Instead the MMU status can be 
specified
 via the CPU ``mmu`` option when using the ``rv32`` or ``rv64`` CPUs.
 
+x86 Icelake-Client CPU (removed in 6.1)
+'''
+
+``Icelake-Client`` cpu can no longer be used. Use ``Icelake-Server`` instead.
+
 System emulator machines
 
 
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index ad99cad..75f2ad1 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -3338,124 +3338,6 @@ static X86CPUDefinition builtin_x86_defs[] = {
 .model_id = "Intel Xeon Processor (Cooperlake)",
 },
 {
-.name = "Icelake-Client",
-.level = 0xd,
-.vendor = CPUID_VENDOR_INTEL,
-.family = 6,
-.model = 126,
-.stepping = 0,
-.features[FEAT_1_EDX] =
-CPUID_VME | CPUID_SSE2 | CPUID_SSE | CPUID_FXSR | CPUID_MMX |
-CPUID_CLFLUSH | CPUID_PSE36 | CPUID_PAT | CPUID_CMOV | CPUID_MCA |
-CPUID_PGE | CPUID_MTRR | CPUID_SEP | CPUID_APIC | CPUID_CX8 |
-CPUID_MCE | CPUID_PAE | CPUID_MSR | CPUID_TSC | CPUID_PSE |
-CPUID_DE | CPUID_FP87,
-.features[FEAT_1_ECX] =
-CPUID_EXT_AVX | CPUID_EXT_XSAVE | CPUID_EXT_AES |
-CPUID_EXT_POPCNT | CPUID_EXT_X2APIC | CPUID_EXT_SSE42 |
-CPUID_EXT_SSE41 | CPUID_EXT_CX16 | CPUID_EXT_SSSE3 |
-CPUID_EXT_PCLMULQDQ | CPUID_EXT_SSE3 |
-CPUID_EXT_TSC_DEADLINE_TIMER | CPUID_EXT_FMA | CPUID_EXT_MOVBE |
-CPUID_EXT_PCID | CPUID_EXT_F16C | CPUID_EXT_RDRAND,
-.features[FEAT_8000_0001_EDX] =
-CPUID_EXT2_LM | CPUID_EXT2_RDTSCP | CPUID_EXT2_NX |
-CPUID_EXT2_SYSCALL,
-.features[FEAT_8000_0001_ECX] =
-CPUID_EXT3_ABM | CPUID_EXT3_LAHF_LM | CPUID_EXT3_3DNOWPREFETCH,
-.features[FEAT_8000_0008_EBX] =
-CPUID_8000_0008_EBX_WBNOINVD,
-.features[FEAT_7_0_EBX] =
-CPUID_7_0_EBX_FSGSBASE | CPUID_7_0_EBX_BMI1 |
-CPUID_7_0_EBX_HLE | CPUID_7_0_EBX_AVX2 | CPUID_7_0_EBX_SMEP |
-CPUID_7_0_EBX_BMI2 | CPUID_7_0_EBX_ERMS | CPUID_7_0_EBX_INVPCID |
-CPUID_7_0_EBX_RTM | CPUID_7_0_EBX_RDSEED | CPUID_7_0_EBX_ADX |
-CPUID_7_0_EBX_SMAP,
-.features[FEAT_7_0_ECX] =
-CPUID_7_0_ECX_AVX512_VBMI | CPUID_7_0_ECX_UMIP | CPUID_7_0_ECX_PKU 
|
-CPUID_7_0_ECX_AVX512_VBMI2 | CPUID_7_0_ECX_GFNI |
-CPUID_7_0_ECX_VAES | CPUID_7_0_ECX_VPCLMULQDQ |
-CPUID_7_0_ECX_AVX512VNNI | CPUID_7_0_ECX_AVX512BITALG |
-CPUID_7_0_ECX_AVX512_VPOPCNTDQ,
-.features[FEAT_7_0_EDX] =
-CPUID_7_0_EDX_SPEC_CTRL | CPUID_7_0_EDX_SPEC_CTRL_SSBD,
-/* Missing: XSAVES (not supported by some Linux versions,
-* including v4.1 to v4.12).
-* KVM doesn't yet expose any XSAVES state save component,
-* and the only one defined in Skylake (processor tracing)
-* probably will block migration anyway.
-*/
-.features[FEAT_XSAVE] =
-CPUID_XSAVE_XSAVEOPT | CPUID_XSAVE_XSAVEC |
-CPUID_XSAVE_XGETBV1,
-.features[FEAT_6_EAX] =
-CPUID_6_EAX_ARAT,
-/* Missing: Mode-based execute control (XS/XU), processor tracing, TSC 
scaling */
-.features[FEAT_VMX_BASIC] = MSR_VMX_BASIC_INS_OUTS |
- MSR_VMX_BASIC_TRUE_CTLS,
-.features[FEAT_VMX_ENTRY_CTLS] = VMX_VM_ENTRY_IA32E_MODE |
- VMX_VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL | 
VMX_VM_ENTRY_LOAD_IA32_PAT |
- VMX_VM_ENTRY_LOAD_DEBUG_CONTROLS | VMX_VM_ENTRY_LOAD_IA32_EFER,
-.features[FEAT_VMX_EPT_VPID_CAPS] = MSR_VMX_EPT_EXECONLY |
- MSR_VMX_EPT_PAGE_WALK_LENGTH_4 | MSR_VMX_EPT_WB | MSR_VMX_EPT_2MB 
|
- MSR_VMX_EPT_1GB | MSR_VMX_EPT_INVEPT |
- MSR_VMX_EPT_INVEPT_SINGLE_CONTEXT | 
MSR_VMX_EPT_INVEPT_ALL_CONTEXT |
- MSR_VMX_EPT_INVVPID | MSR_VMX_EPT_INVVPID_SINGLE_ADDR |
- MSR_VMX_EPT_INVVPID_SINGLE_CONTEXT | 
MSR_VMX_EPT_INVVPID_ALL_CONTEXT |
- MSR_VMX_EPT_INVVP

[PATCH v2] i386/cpu: Remove the deprecated cpu model 'Icelake-Client'

2021-04-27 Thread Robert Hoo
As it's been marked deprecated since v5.2, I think it's time to remove it
from the code.

Signed-off-by: Robert Hoo 
---
P.S.
Since its deprecation was not previously recorded in
docs/system/deprecated.rst, there is nothing to update there.
---
 docs/system/removed-features.rst |   5 ++
 target/i386/cpu.c| 118 ---
 2 files changed, 5 insertions(+), 118 deletions(-)

diff --git a/docs/system/removed-features.rst b/docs/system/removed-features.rst
index 29e9060..f1b5a16 100644
--- a/docs/system/removed-features.rst
+++ b/docs/system/removed-features.rst
@@ -285,6 +285,11 @@ The RISC-V no MMU cpus have been removed. The two CPUs: 
``rv32imacu-nommu`` and
 ``rv64imacu-nommu`` can no longer be used. Instead the MMU status can be 
specified
 via the CPU ``mmu`` option when using the ``rv32`` or ``rv64`` CPUs.
 
+x86 Icelake-Client CPU (removed in 6.1)
+'''
+
+``Icelake-Client`` cpu can no longer be used. Use ``Icelake-Server`` instead.
+
 System emulator machines
 
 
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index ad99cad..75f2ad1 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -3338,124 +3338,6 @@ static X86CPUDefinition builtin_x86_defs[] = {
 .model_id = "Intel Xeon Processor (Cooperlake)",
 },
 {
-.name = "Icelake-Client",
-.level = 0xd,
-.vendor = CPUID_VENDOR_INTEL,
-.family = 6,
-.model = 126,
-.stepping = 0,
-.features[FEAT_1_EDX] =
-CPUID_VME | CPUID_SSE2 | CPUID_SSE | CPUID_FXSR | CPUID_MMX |
-CPUID_CLFLUSH | CPUID_PSE36 | CPUID_PAT | CPUID_CMOV | CPUID_MCA |
-CPUID_PGE | CPUID_MTRR | CPUID_SEP | CPUID_APIC | CPUID_CX8 |
-CPUID_MCE | CPUID_PAE | CPUID_MSR | CPUID_TSC | CPUID_PSE |
-CPUID_DE | CPUID_FP87,
-.features[FEAT_1_ECX] =
-CPUID_EXT_AVX | CPUID_EXT_XSAVE | CPUID_EXT_AES |
-CPUID_EXT_POPCNT | CPUID_EXT_X2APIC | CPUID_EXT_SSE42 |
-CPUID_EXT_SSE41 | CPUID_EXT_CX16 | CPUID_EXT_SSSE3 |
-CPUID_EXT_PCLMULQDQ | CPUID_EXT_SSE3 |
-CPUID_EXT_TSC_DEADLINE_TIMER | CPUID_EXT_FMA | CPUID_EXT_MOVBE |
-CPUID_EXT_PCID | CPUID_EXT_F16C | CPUID_EXT_RDRAND,
-.features[FEAT_8000_0001_EDX] =
-CPUID_EXT2_LM | CPUID_EXT2_RDTSCP | CPUID_EXT2_NX |
-CPUID_EXT2_SYSCALL,
-.features[FEAT_8000_0001_ECX] =
-CPUID_EXT3_ABM | CPUID_EXT3_LAHF_LM | CPUID_EXT3_3DNOWPREFETCH,
-.features[FEAT_8000_0008_EBX] =
-CPUID_8000_0008_EBX_WBNOINVD,
-.features[FEAT_7_0_EBX] =
-CPUID_7_0_EBX_FSGSBASE | CPUID_7_0_EBX_BMI1 |
-CPUID_7_0_EBX_HLE | CPUID_7_0_EBX_AVX2 | CPUID_7_0_EBX_SMEP |
-CPUID_7_0_EBX_BMI2 | CPUID_7_0_EBX_ERMS | CPUID_7_0_EBX_INVPCID |
-CPUID_7_0_EBX_RTM | CPUID_7_0_EBX_RDSEED | CPUID_7_0_EBX_ADX |
-CPUID_7_0_EBX_SMAP,
-.features[FEAT_7_0_ECX] =
-CPUID_7_0_ECX_AVX512_VBMI | CPUID_7_0_ECX_UMIP | CPUID_7_0_ECX_PKU 
|
-CPUID_7_0_ECX_AVX512_VBMI2 | CPUID_7_0_ECX_GFNI |
-CPUID_7_0_ECX_VAES | CPUID_7_0_ECX_VPCLMULQDQ |
-CPUID_7_0_ECX_AVX512VNNI | CPUID_7_0_ECX_AVX512BITALG |
-CPUID_7_0_ECX_AVX512_VPOPCNTDQ,
-.features[FEAT_7_0_EDX] =
-CPUID_7_0_EDX_SPEC_CTRL | CPUID_7_0_EDX_SPEC_CTRL_SSBD,
-/* Missing: XSAVES (not supported by some Linux versions,
-* including v4.1 to v4.12).
-* KVM doesn't yet expose any XSAVES state save component,
-* and the only one defined in Skylake (processor tracing)
-* probably will block migration anyway.
-*/
-.features[FEAT_XSAVE] =
-CPUID_XSAVE_XSAVEOPT | CPUID_XSAVE_XSAVEC |
-CPUID_XSAVE_XGETBV1,
-.features[FEAT_6_EAX] =
-CPUID_6_EAX_ARAT,
-/* Missing: Mode-based execute control (XS/XU), processor tracing, TSC 
scaling */
-.features[FEAT_VMX_BASIC] = MSR_VMX_BASIC_INS_OUTS |
- MSR_VMX_BASIC_TRUE_CTLS,
-.features[FEAT_VMX_ENTRY_CTLS] = VMX_VM_ENTRY_IA32E_MODE |
- VMX_VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL | 
VMX_VM_ENTRY_LOAD_IA32_PAT |
- VMX_VM_ENTRY_LOAD_DEBUG_CONTROLS | VMX_VM_ENTRY_LOAD_IA32_EFER,
-.features[FEAT_VMX_EPT_VPID_CAPS] = MSR_VMX_EPT_EXECONLY |
- MSR_VMX_EPT_PAGE_WALK_LENGTH_4 | MSR_VMX_EPT_WB | MSR_VMX_EPT_2MB 
|
- MSR_VMX_EPT_1GB | MSR_VMX_EPT_INVEPT |
- MSR_VMX_EPT_INVEPT_SINGLE_CONTEXT | 
MSR_VMX_EPT_INVEPT_ALL_CONTEXT |
- MSR_VMX_EPT_INVVPID | MSR_VMX_EPT_INVVPID_SINGLE_ADDR |
- MSR_VMX_EPT_INVVPID_SINGLE_CONTEXT | 
MSR_VMX_EPT_INVVPID_ALL_CONTEXT |
- MSR_VMX_EPT_INVVPID_SINGLE_CONTEXT_NOGLOBALS | 
MSR_VMX_EPT_AD_BITS,
-.features[FEAT_VMX_EXIT_CTLS] =
-  

Re: [PATCH RFC 0/1] To add HMP interface to dump PCI MSI-X table/PBA

2021-04-27 Thread Jason Wang



On 2021/4/27 4:53 PM, Dr. David Alan Gilbert wrote:

* Dongli Zhang (dongli.zh...@oracle.com) wrote:


On 4/22/21 11:01 PM, Jason Wang wrote:

On 2021/4/23 12:47 PM, Dongli Zhang wrote:

This is inspired by the discussion with Jason on below patchset.

https://urldefense.com/v3/__https://lists.gnu.org/archive/html/qemu-devel/2021-03/msg09020.html__;!!GqivPVa7Brio!KbGQZW5lq3JZ60k12NuWZ6Th1lT6AwmBTF0pBgoWUKKQ4-2UhdW57PtvXUN5XQnZ2NU$

The new HMP command is introduced to dump the MSI-X table and PBA.

Initially, I was going to add new option to "info pci". However, as the
number of entries is not determined and the output of MSI-X table is much
more similar to the output of hmp_info_tlb()/hmp_info_mem(), this patch
adds interface for only HMP.
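
For readers who have not looked at the raw structures such a command would dump, here is a minimal sketch of decoding them. It relies only on the standard MSI-X layout (16-byte little-endian table entries holding address low, address high, data and vector control, plus one pending bit per vector in the PBA). The names MsixEntryView, msix_decode_entry() and msix_pba_pending() are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MSIX_ENTRY_SIZE 16u          /* bytes per MSI-X table entry */
#define MSIX_VECTOR_CTRL_MASKBIT 1u  /* bit 0 of the Vector Control dword */

/* Hypothetical decoded view of one MSI-X table entry. */
typedef struct MsixEntryView {
    uint64_t addr;   /* message address (upper dword : lower dword) */
    uint32_t data;   /* message data */
    bool masked;     /* per-vector mask bit */
} MsixEntryView;

/* Read a little-endian 32-bit value from a byte buffer. */
static uint32_t ld_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode entry number 'vector' from a raw copy of the MSI-X table. */
static MsixEntryView msix_decode_entry(const uint8_t *table, unsigned vector)
{
    assert(table != NULL);
    const uint8_t *e = table + (size_t)vector * MSIX_ENTRY_SIZE;
    MsixEntryView v;

    v.addr = ((uint64_t)ld_le32(e + 4) << 32) | ld_le32(e);
    v.data = ld_le32(e + 8);
    v.masked = (ld_le32(e + 12) & MSIX_VECTOR_CTRL_MASKBIT) != 0;
    return v;
}

/* Test the pending bit of 'vector' in a raw copy of the PBA. */
static bool msix_pba_pending(const uint8_t *pba, unsigned vector)
{
    assert(pba != NULL);
    return ((pba[vector / 8] >> (vector % 8)) & 1u) != 0;
}
```

A dump routine would loop over all vectors and print one line per entry, which is why the output length grows with the number of entries.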

The patch is tagged with RFC because I am looking for suggestions on:

1. Is it fine to add new "info msix " command?


I wonder what the reason is for not simply reusing "info pci"?

The "info pci" will show PCI data for all devices and it does not accept any
argument to print for a specific device.

In addition, the "info pci" relies on qmp_query_pci(), where this patch will not
implement the interface for QMP considering the number of MSI-X entries is not
determined.

Suppose we have 10 NVMe (emulated by QEMU with default number of queues), we
will have about 600+ lines of output.

 From an HMP perspective I'm happy, so:

Acked-by: Dr. David Alan Gilbert 

but since I don't know much about MSI I'd like to see Jason's reply.



I think we'd better have more information, e.g. the device can optionally 
report how the MSI-X vector is used.


Virtio-pci could be the first user for this.




Adding an optional option to 'info pci' to limit to one device would be easy
though; that bit is probably easier than adding a new command.



One interesting point is that MSI could be extended to other buses (e.g. 
MMIO). So "info msi" would be better, I guess.




Figuring out the QMP representation of your entries might be harder -
and if this is strictly for debug, probably not worth it?



I think so.

Thanks




Dave



Dongli Zhang




2. Is there any issue with output format?


If it's not for QMP, I guess it's not a part of ABI so it should be fine.



3. Is it fine to add only for HMP, but not QMP?


I think so.

Thanks



Thank you very much!

Dongli Zhang








Re: [PATCH] virtio-net: failover: add missing remove_migration_state_change_notifier()

2021-04-27 Thread Jason Wang



On 2021/4/27 9:51 PM, Laurent Vivier wrote:

In the failover case configuration, virtio_net_device_realize() uses an
add_migration_state_change_notifier() to add a state notifier, but this
notifier is not removed by the unrealize function when the virtio-net
card is unplugged.

If the card is unplugged and a migration is started, the notifier is
called and as it is not valid anymore QEMU crashes.

This patch fixes the problem by adding the
remove_migration_state_change_notifier() in virtio_net_device_unrealize().
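
The failure mode can be sketched in isolation. This is a toy model, not QEMU code: NotifierList, nlist_add(), nlist_remove(), nlist_notify() and Device are hypothetical stand-ins. It only shows why a notifier embedded in a device must be unlinked before the device goes away.

```c
#include <assert.h>
#include <stddef.h>

/* Toy notifier list; all names here are illustrative only. */
typedef struct Notifier Notifier;
struct Notifier {
    void (*notify)(Notifier *self, void *data);
    Notifier *next;
};

typedef struct NotifierList {
    Notifier *first;
} NotifierList;

static void nlist_add(NotifierList *list, Notifier *n)
{
    assert(list && n && n->notify);
    n->next = list->first;
    list->first = n;
}

static void nlist_remove(NotifierList *list, Notifier *n)
{
    for (Notifier **p = &list->first; *p; p = &(*p)->next) {
        if (*p == n) {
            *p = n->next;
            n->next = NULL;
            return;
        }
    }
}

static void nlist_notify(NotifierList *list, void *data)
{
    Notifier *n = list->first;

    while (n) {
        Notifier *next = n->next;   /* the callback may unlink n */
        n->notify(n, data);
        n = next;
    }
}

/* Hypothetical device that embeds its notifier, as virtio-net does. */
typedef struct Device {
    Notifier migration_state;
    int events_seen;
} Device;

static void device_on_migration(Notifier *self, void *data)
{
    Device *d = (Device *)((char *)self - offsetof(Device, migration_state));

    (void)data;
    d->events_seen++;
}

static void device_realize(Device *d, NotifierList *list)
{
    d->events_seen = 0;
    d->migration_state.notify = device_on_migration;
    nlist_add(list, &d->migration_state);
}

/* Without this removal the list keeps a pointer into a dead device. */
static void device_unrealize(Device *d, NotifierList *list)
{
    nlist_remove(list, &d->migration_state);
}
```

If device_unrealize() skipped the removal, the next nlist_notify() would call through a stale pointer, which matches the shape of the backtrace below.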

The problem can be reproduced with:

   $ qemu-system-x86_64 -enable-kvm -m 1g -M q35 \
 -device pcie-root-port,slot=4,id=root1 \
 -device pcie-root-port,slot=5,id=root2 \
 -device virtio-net-pci,id=net1,mac=52:54:00:6f:55:cc,failover=on,bus=root1 \
 -monitor stdio disk.qcow2
   (qemu) device_del net1
   (qemu) migrate "exec:gzip -c > STATEFILE.gz"

   Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
   0x in ?? ()
   (gdb) bt
   #0  0x in  ()
   #1  0x55d726d7 in notifier_list_notify (...)
   at .../util/notify.c:39
   #2  0x55842c1a in migrate_fd_connect (...)
   at .../migration/migration.c:3975
   #3  0x55950f7d in migration_channel_connect (...)
   error@entry=0x0) at .../migration/channel.c:107
   #4  0x55910922 in exec_start_outgoing_migration (...)
   at .../migration/exec.c:42

Reported-by: Igor Mammedov 
Signed-off-by: Laurent Vivier 



Acked-by: Jason Wang 

This should be added to stable I guess.

Thanks



---
  hw/net/virtio-net.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 66b9ff451185..914051feb75b 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -3373,6 +3373,7 @@ static void virtio_net_device_unrealize(DeviceState *dev)
  
  if (n->failover) {

  device_listener_unregister(&n->primary_listener);
+remove_migration_state_change_notifier(&n->migration_state);
  }
  
  max_queues = n->multiqueue ? n->max_queues : 1;





[PATCH v2] floppy: remove dead code related to formatting

2021-04-27 Thread Alexander Bulekov
fdctrl_format_sector was added in
baca51faff ("updated floppy driver: formatting code, disk geometry auto detect (Jocelyn Mayer)")

The single callsite is guarded by a check:
fdctrl->data_state & FD_STATE_FORMAT

However, the only place where the FD_STATE_FORMAT flag is set (in
fdctrl_handle_format_track) is closely followed by the same flag being
unset, with no possibility to call fdctrl_format_sector in between.

This removes fdctrl_format_sector, the unnecessary setting/unsetting
of the FD_STATE_FORMAT flag, and the fdctrl_handle_format_track function
(which is just a stub).
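
The dead-code argument can be reduced to a small model. This is only a sketch with hypothetical names (Ctrl, handle_format_track(), write_data()), not the fdc.c code: the flag is set and cleared inside one function with nothing in between that could observe it, so the guarded call can never run.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    STATE_MULTI  = 0x01,
    STATE_FORMAT = 0x02,
};

/* Hypothetical controller state; format_calls counts guarded calls. */
typedef struct Ctrl {
    uint8_t data_state;
    int format_calls;
} Ctrl;

static void format_sector(Ctrl *c)
{
    c->format_calls++;
}

/* Sets the format flag and clears it again before returning. */
static void handle_format_track(Ctrl *c, uint8_t cmd)
{
    assert(c != NULL);
    c->data_state |= STATE_FORMAT;
    if (cmd & 0x80) {
        c->data_state |= STATE_MULTI;
    } else {
        c->data_state &= (uint8_t)~STATE_MULTI;
    }
    /* Nothing above can reach write_data(), so the flag is never seen set. */
    c->data_state &= (uint8_t)~STATE_FORMAT;
}

/* The single call site: guarded by a flag that is always clear here. */
static void write_data(Ctrl *c)
{
    if (c->data_state & STATE_FORMAT) {
        format_sector(c);
    }
}
```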

Suggested-by: Hervé Poussineau 
Signed-off-by: Alexander Bulekov 
---

I ran through tests/qtest/fdc-test, and ran fdformat on a dummy disk -
nothing exploded, but since I don't use floppies very often, more eyes
definitely won't hurt. In particular, I'm not sure about the
fdctrl_handle_format_track delete - that function has side-effects on
both FDrive and FDCtrl, and it is certainly reachable. If deleting the
whole thing seems wrong, I'll roll back that change, and we can just
remove the unreachable code.

 hw/block/fdc.c | 97 --
 1 file changed, 97 deletions(-)

diff --git a/hw/block/fdc.c b/hw/block/fdc.c
index a825c2acba..d851d23cc0 100644
--- a/hw/block/fdc.c
+++ b/hw/block/fdc.c
@@ -657,7 +657,6 @@ enum {
 
 enum {
 FD_STATE_MULTI  = 0x01,/* multi track flag */
-FD_STATE_FORMAT = 0x02,/* format flag */
 };
 
 enum {
@@ -826,7 +825,6 @@ enum {
 };
 
 #define FD_MULTI_TRACK(state) ((state) & FD_STATE_MULTI)
-#define FD_FORMAT_CMD(state) ((state) & FD_STATE_FORMAT)
 
 struct FDCtrl {
 MemoryRegion iomem;
@@ -1942,67 +1940,6 @@ static uint32_t fdctrl_read_data(FDCtrl *fdctrl)
 return retval;
 }
 
-static void fdctrl_format_sector(FDCtrl *fdctrl)
-{
-FDrive *cur_drv;
-uint8_t kh, kt, ks;
-
-SET_CUR_DRV(fdctrl, fdctrl->fifo[1] & FD_DOR_SELMASK);
-cur_drv = get_cur_drv(fdctrl);
-kt = fdctrl->fifo[6];
-kh = fdctrl->fifo[7];
-ks = fdctrl->fifo[8];
-FLOPPY_DPRINTF("format sector at %d %d %02x %02x (%d)\n",
-   GET_CUR_DRV(fdctrl), kh, kt, ks,
-   fd_sector_calc(kh, kt, ks, cur_drv->last_sect,
-  NUM_SIDES(cur_drv)));
-switch (fd_seek(cur_drv, kh, kt, ks, fdctrl->config & FD_CONFIG_EIS)) {
-case 2:
-/* sect too big */
-fdctrl_stop_transfer(fdctrl, FD_SR0_ABNTERM, 0x00, 0x00);
-fdctrl->fifo[3] = kt;
-fdctrl->fifo[4] = kh;
-fdctrl->fifo[5] = ks;
-return;
-case 3:
-/* track too big */
-fdctrl_stop_transfer(fdctrl, FD_SR0_ABNTERM, FD_SR1_EC, 0x00);
-fdctrl->fifo[3] = kt;
-fdctrl->fifo[4] = kh;
-fdctrl->fifo[5] = ks;
-return;
-case 4:
-/* No seek enabled */
-fdctrl_stop_transfer(fdctrl, FD_SR0_ABNTERM, 0x00, 0x00);
-fdctrl->fifo[3] = kt;
-fdctrl->fifo[4] = kh;
-fdctrl->fifo[5] = ks;
-return;
-case 1:
-fdctrl->status0 |= FD_SR0_SEEK;
-break;
-default:
-break;
-}
-memset(fdctrl->fifo, 0, FD_SECTOR_LEN);
-if (cur_drv->blk == NULL ||
-blk_pwrite(cur_drv->blk, fd_offset(cur_drv), fdctrl->fifo,
-   BDRV_SECTOR_SIZE, 0) < 0) {
-FLOPPY_DPRINTF("error formatting sector %d\n", fd_sector(cur_drv));
-fdctrl_stop_transfer(fdctrl, FD_SR0_ABNTERM | FD_SR0_SEEK, 0x00, 0x00);
-} else {
-if (cur_drv->sect == cur_drv->last_sect) {
-fdctrl->data_state &= ~FD_STATE_FORMAT;
-/* Last sector done */
-fdctrl_stop_transfer(fdctrl, 0x00, 0x00, 0x00);
-} else {
-/* More to do */
-fdctrl->data_pos = 0;
-fdctrl->data_len = 4;
-}
-}
-}
-
 static void fdctrl_handle_lock(FDCtrl *fdctrl, int direction)
 {
 fdctrl->lock = (fdctrl->fifo[0] & 0x80) ? 1 : 0;
@@ -2110,34 +2047,6 @@ static void fdctrl_handle_readid(FDCtrl *fdctrl, int direction)
  (NANOSECONDS_PER_SECOND / 50));
 }
 
-static void fdctrl_handle_format_track(FDCtrl *fdctrl, int direction)
-{
-FDrive *cur_drv;
-
-SET_CUR_DRV(fdctrl, fdctrl->fifo[1] & FD_DOR_SELMASK);
-cur_drv = get_cur_drv(fdctrl);
-fdctrl->data_state |= FD_STATE_FORMAT;
-if (fdctrl->fifo[0] & 0x80)
-fdctrl->data_state |= FD_STATE_MULTI;
-else
-fdctrl->data_state &= ~FD_STATE_MULTI;
-cur_drv->bps =
-fdctrl->fifo[2] > 7 ? 16384 : 128 << fdctrl->fifo[2];
-#if 0
-cur_drv->last_sect =
-cur_drv->flags & FDISK_DBL_SIDES ? fdctrl->fifo[3] :
-fdctrl->fifo[3] / 2;
-#else
-cur_drv->last_sect = fdctrl->fifo[3];
-#endif
-/* TODO: implement format using DMA expected by the Bochs BIOS
- * and Linux fdformat (read 3 bytes per sector via DMA and fill
- * the sector with the specified fill byte
- */
-fdctrl->data_state &= ~FD_STATE_FORMAT;
-fdc

Re: [PATCH v6 0/7] eBPF RSS support for virtio-net

2021-04-27 Thread Jason Wang



On 2021/4/27 6:47 PM, Andrew Melnichenko wrote:

Hi,
I've checked the issue. Apparently, libbpf can't work with a 
skeleton on Debian.

Version check would not help - versioning differs at different distros.



So you meant the libbpf version is too old?



I've added a small check:

diff --git a/meson.build b/meson.build
index ca551dd15d..4a51a25643 100644
--- a/meson.build
+++ b/meson.build
@@ -1018,6 +1018,20 @@ endif

 # libbpf
 libbpf = dependency('libbpf', required: get_option('bpf'))
+if libbpf.found() and not cc.links('''
+   #include 
+   int main(void)
+   {
+     bpf_object__destroy_skeleton(NULL);
+     return 0;
+   }''', dependencies: libbpf)
+  libbpf = not_found
+  if get_option('bpf').enabled()
+    error('libbpf skeleton test failed')
+  else
+    warning('libbpf skeleton test failed, disabling')
+  endif
+endif

 if get_option('cfi')
   cfi_flags=[]


Is it possible to prepare an additional patch or should I prepare new 
eBPFv7 patches?



Please send V7.

Thanks





On Sun, Apr 25, 2021 at 6:32 AM Jason Wang wrote:



On 2021/4/12 4:25 PM, Andrew Melnychenko wrote:
> This set of patches introduces the usage of eBPF for packet steering
> and RSS hash calculation:
> * RSS(Receive Side Scaling) is used to distribute network packets to
> guest virtqueues by calculating packet hash
> * Additionally adding support for the usage of RSS with vhost
>
> The eBPF works on kernels 5.8+
> On earlier kernels it fails to load and the RSS feature is reported
> only without vhost and implemented in 'in-qemu' software.
>
> Implementation notes:
> Linux TAP TUNSETSTEERINGEBPF ioctl was used to set the eBPF program.
> Added libbpf dependency and eBPF support.
> The eBPF program is part of the qemu and presented as an array
> of BPF ELF file data. The eBPF array file initially generated by
> bpftool.
> The compilation of eBPF is not part of QEMU build and can be done
> using provided Makefile.ebpf.
> Added changes to virtio-net and vhost, primary eBPF RSS is used.
> 'in-qemu' RSS used in the case of hash population and as a
> fallback option.
> For vhost, the hash population feature is not reported to the guest.
>
> Please also see the documentation in PATCH 6/7.
>
> Known issues:
> * hash population not supported by eBPF RSS: 'in-qemu' RSS used
> as a fallback, also, hash population feature is not reported to
> guests with vhost.
> * IPv6 extensions still in progress.
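
As a rough illustration of the steering step the cover letter describes (packet hash to guest virtqueue), here is a minimal sketch. It is not the eBPF program from the series and it does not compute the hash itself. RssTable and rss_pick_queue() are hypothetical names, and the power-of-two table length is an assumption made so that a simple mask can select the slot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical steering table; names are illustrative only. */
typedef struct RssTable {
    const uint16_t *indirection;  /* queue index per slot */
    size_t len;                   /* number of slots, assumed power of two */
    uint16_t default_queue;       /* used when no hash could be computed */
} RssTable;

/* Map a precomputed packet hash to a receive queue index. */
static uint16_t rss_pick_queue(const RssTable *t, uint32_t hash, int hash_valid)
{
    assert(t && t->indirection && t->len > 0);
    assert((t->len & (t->len - 1)) == 0);   /* power of two */

    if (!hash_valid) {
        return t->default_queue;
    }
    return t->indirection[hash & (t->len - 1)];
}
```

Packets of one flow produce the same hash and therefore land on the same queue, which is what lets the guest spread flows across virtqueues without reordering within a flow.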


Want to merge but it fails to build on Debian 10.9:

dpkg -l | grep libbpf
ii  libbpf-dev:amd64  4.19.181-1 amd64    eBPF helper
library (development files)
ii  libbpf4.19:amd64  4.19.181-1 amd64    eBPF helper
library (shared library)

I configured with --enable-bpf --target-list=x86_64-softmmu, and I get:

[3/1375] Compiling C object libcommon.fa.p/ebpf_ebpf_rss.c.o
FAILED: libcommon.fa.p/ebpf_ebpf_rss.c.o
cc -Ilibcommon.fa.p -I. -I.. -I../slirp -I../slirp/src
-I../dtc/libfdt
-I../capstone/include/capstone -Iqapi -Itrace -Iui -Iui/shader
-I/usr/include/libmount -I/usr/include/blkid -I/usr/include/uuid
-I/usr/include/glib-2.0 -I/usr/lib/x86_64-linux-gnu/glib-2.0/include
-I/usr/include/gio-unix-2.0 -I/usr/include/pixman-1
-fdiagnostics-color=auto -pipe -Wall -Winvalid-pch -Werror -std=gnu99
-O2 -g -isystem /home/devel/git/qemu/linux-headers -isystem
linux-headers -iquote . -iquote /home/devel/git/qemu -iquote
/home/devel/git/qemu/include -iquote
/home/devel/git/qemu/disas/libvixl
-iquote /home/devel/git/qemu/tcg/i386 -iquote
/home/devel/git/qemu/accel/tcg -pthread -U_FORTIFY_SOURCE
-D_FORTIFY_SOURCE=2 -m64 -mcx16 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64
-D_LARGEFILE_SOURCE -Wstrict-prototypes -Wredundant-decls -Wundef
-Wwrite-strings -Wmissing-prototypes -fno-strict-aliasing -fno-common
-fwrapv -Wold-style-declaration -Wold-style-definition -Wtype-limits
-Wformat-security -Wformat-y2k -Winit-self -Wignored-qualifiers
-Wempty-body -Wnested-externs -Wendif-labels -Wexpansion-to-defined
-Wimplicit-fallthrough=2 -Wno-missing-include-dirs
-Wno-shift-negative-value -Wno-psabi -fstack-protector-strong
-fPIC -MD
-MQ libcommon.fa.p/ebpf_ebpf_rss.c.o -MF
libcommon.fa.p/ebpf_ebpf_rss.c.o.d -o
libcommon.fa.p/ebpf_ebpf_rss.c.o
-c ../ebpf/ebpf_rss.c
In file included from ../ebpf/ebpf_rss.c:23:
/home/devel/git/qemu/ebpf/rss.bpf.skeleton.h: In function
‘rss_bpf__destroy’:
/home/devel/git/qemu/ebpf/rss.bpf.skeleton.h:32:3: error: implicit
declaration of function ‘bpf_object__destroy_skeleton’; did you mean
‘bpf_object__kversion’? [-Werror=implicit-function-declaration]
    bpf_object__destroy_skeleton(obj->skeleton);
    ^~

Re: X on old (non-x86) Linux guests

2021-04-27 Thread Andrew Randrianasulu
On Wednesday, April 28, 2021, Andrew Randrianasulu wrote:

>
>
> On Monday, April 26, 2021, BALATON Zoltan  wrote:
>
>> Hello,
>>
>> On Mon, 26 Apr 2021, Dr. David Alan Gilbert wrote:
>>
>>>  Over the weekend I got a Red Hat 6.x (not RHEL!) for Alpha booting
>>> under QEMU which was pretty neat.  But I failed to find a successful
>>> combination to get X working; has anyone any suggestions?
>>>
>>
>> Adding Andrew who has experimented with old X framebuffer so he may
>> remember something more but that was on x86.
>
>
>
> Sorry, I am still away from my desktop (with notes/logs), and I'm not sure
> when I will return.
> I do not think I tried something that old. Kernel 2.2, I guess, from before
> any r128 DRM kernel module was written (in 2.4?), and so before the DDX
> attempted to use it (as it does by default in much newer distros).
>
> I tried the last Debian for Alpha (5.0.x?) on QEMU. It had bugs in cirrusfb
> in the 2.6.26 kernel, so I compiled something like 2.6.32.last inside the
> emulated system. This made the fb work, but there was still no X for me
> (I can't recall the exact error, maybe even SIGABRT or SIGBUS, but do not
> count on my memory on this).
>
> Notes I used for launching QEMU:
> https://virtuallyfun.com/wordpress/2014/02/19/alpha-linux-on-qemu/
> Sadly, the pre-compressed disk image from that post is really gone (the site
> shows a funny error page telling you to use a login/password, yet the file
> can't be downloaded).
>


Upd:
https://web.archive.org/web/20191021110430/https://vpsland.superglobalmegacorp.com/old/install/linux/DecAlpha/alpha-linux.7z

This may give you a kernel/initrd/disk image.

>
>
>>  That distro was from around 2000; the challenge is since we don't have
>>> VESA on non-x86, we can't change mode that way, so generic XF86_SVGA
>>> doesn't want to play with any of the devices.
>>>
>>>  I also tried the ati device, but the accelerated mach64 driver
>>> didn't recognise that ID.
>>>
>>
>> The ati-vga partially emulates an ATI Rage128 Pro so it won't work with
>> mach64 driver that is older and while functionally similar has different
>> registers. You probably need to load aty128fb and then set a mode with
>> fbset then may need to edit X conf but I forgot which option was needed,
>> something about UseFb or similar so it won't try to change mode itself but
>> use the already set Linux FB because otherwise it did not detect the card
>> properly but I don't remember the details so may be wrong. Also some 2D
>> accel is emulated so may work without disabling it but I think has some
>> bugs so it may have glitches.
>>
>>  Has anyone found any combo that works?
>>> I suspect using one of the existing devices, lying about PCI ID, and
>>> then turning off all accelerations might have a chance but I've not got
>>> that far.
>>>
>>
>> Changing the PCI ID may not help as these ATI chips have different
>> registers so only compatible with the right drivers. I've tried to use
>> current ati-vga with a Mac ROM that expects mach64 but it did not work.
>>
>> It may help to add -trace enable="ati*" and maybe also enable some debug
>> defines in ati_int.h to see if it accesses the card at all but with the
>> right driver that works with Rage128Pro it should produce some picture at
>> least in fb console and we could run X with it on x86 before.
>>
>> More info on ati-vga is here: https://osdn.net/projects/qmiga/wiki/SubprojectAti
>>
>> By the way, last time I've experimented with it I've found that mouse
>> pointer getting out of sync and jumping around is probably a result of
>> mouse acceleration on the host is not taken into account when tracking
>> guest pointer position. Is that possible and is there a way to fix it?
>>
>> Regards,
>> BALATON Zoltan
>>
>


Re: [PATCH] i386/cpu: Remove the deprecated cpu model 'Icelake-Client'

2021-04-27 Thread Robert Hoo
On Tue, 2021-04-27 at 16:55 -0400, Eduardo Habkost wrote:
> On Thu, Apr 22, 2021 at 05:42:16PM +0800, Robert Hoo wrote:
> > As it's been marked deprecated since v5.2, I think it's now time to
> > remove it from the code.
> > 
> > Signed-off-by: Robert Hoo 
> 
> Thanks!  There's only one issue: we need to update
> docs/system/deprecated.rst and docs/system/removed-features.rst
> when removing the CPU model.

OK, going to send v2 soon.
> 




Re: X on old (non-x86) Linux guests

2021-04-27 Thread Andrew Randrianasulu
On Monday, April 26, 2021, BALATON Zoltan  wrote:

> Hello,
>
> On Mon, 26 Apr 2021, Dr. David Alan Gilbert wrote:
>
>>  Over the weekend I got a Red Hat 6.x (not RHEL!) for Alpha booting
>> under QEMU which was pretty neat.  But I failed to find a successful
>> combination to get X working; has anyone any suggestions?
>>
>
> Adding Andrew who has experimented with old X framebuffer so he may
> remember something more but that was on x86.



Sorry, I am still away from my desktop (with notes/logs), and I'm not sure
when I will return.
I do not think I tried something that old. Kernel 2.2, I guess, from before
any r128 DRM kernel module was written (in 2.4?), and so before the DDX
attempted to use it (as it does by default in much newer distros).

I tried the last Debian for Alpha (5.0.x?) on QEMU. It had bugs in cirrusfb
in the 2.6.26 kernel, so I compiled something like 2.6.32.last inside the
emulated system. This made the fb work, but there was still no X for me
(I can't recall the exact error, maybe even SIGABRT or SIGBUS, but do not
count on my memory on this).

Notes I used for launching QEMU:
https://virtuallyfun.com/wordpress/2014/02/19/alpha-linux-on-qemu/
Sadly, the pre-compressed disk image from that post is really gone (the site
shows a funny error page telling you to use a login/password, yet the file
can't be downloaded).


>  That distro was from around 2000; the challenge is since we don't have
>> VESA on non-x86, we can't change mode that way, so generic XF86_SVGA
>> doesn't want to play with any of the devices.
>>
>>  I also tried the ati device, but the accelerated mach64 driver
>> didn't recognise that ID.
>>
>
> The ati-vga partially emulates an ATI Rage128 Pro so it won't work with
> mach64 driver that is older and while functionally similar has different
> registers. You probably need to load aty128fb and then set a mode with
> fbset then may need to edit X conf but I forgot which option was needed,
> something about UseFb or similar so it won't try to change mode itself but
> use the already set Linux FB because otherwise it did not detect the card
> properly but I don't remember the details so may be wrong. Also some 2D
> accel is emulated so may work without disabling it but I think has some
> bugs so it may have glitches.
>
>  Has anyone found any combo that works?
>> I suspect using one of the existing devices, lying about PCI ID, and
>> then turning off all accelerations might have a chance but I've not got
>> that far.
>>
>
> Changing the PCI ID may not help as these ATI chips have different
> registers so only compatible with the right drivers. I've tried to use
> current ati-vga with a Mac ROM that expects mach64 but it did not work.
>
> It may help to add -trace enable="ati*" and maybe also enable some debug
> defines in ati_int.h to see if it accesses the card at all but with the
> right driver that works with Rage128Pro it should produce some picture at
> least in fb console and we could run X with it on x86 before.
>
> More info on ati-vga is here: https://osdn.net/projects/qmiga/wiki/SubprojectAti
>
> By the way, last time I've experimented with it I've found that mouse
> pointer getting out of sync and jumping around is probably a result of
> mouse acceleration on the host is not taken into account when tracking
> guest pointer position. Is that possible and is there a way to fix it?
>
> Regards,
> BALATON Zoltan
>


Re: [PATCH RFC C0/2] support allocation-map for block-dirty-bitmap-merge

2021-04-27 Thread Vladimir Sementsov-Ogievskiy

27.04.2021 21:24, John Snow wrote:

On 4/27/21 7:11 AM, Vladimir Sementsov-Ogievskiy wrote:

Hi all!

It's a simpler alternative for
"[PATCH v4 0/5] block: add block-dirty-bitmap-populate job"
   <20200902181831.2570048-1-ebl...@redhat.com>
   https://lists.gnu.org/archive/html/qemu-devel/2020-09/msg00978.html
   https://patchew.org/QEMU/20200902181831.2570048-1-ebl...@redhat.com/

Since we have "coroutine: true" feature for qmp commands, I think,
maybe we can merge allocation status to bitmap without bothering with
new block-job?
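
To make the idea concrete, here is a toy sketch of merging an allocation map into a dirty bitmap. It is not the QEMU implementation: SketchBitmap, Extent and both functions are hypothetical, and the bitmap is limited to 64 chunks to keep the example short. The point it illustrates is that merging only ever sets bits, so bits dirtied concurrently are preserved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy dirty bitmap: one bit per 'granularity' bytes, at most 64 chunks. */
typedef struct SketchBitmap {
    uint64_t granularity;
    uint64_t bits;
} SketchBitmap;

/* One entry of a hypothetical allocation map (block-status result). */
typedef struct Extent {
    uint64_t offset;
    uint64_t bytes;
    bool allocated;
} Extent;

/* Mark every chunk touched by [offset, offset + bytes) as dirty. */
static void bitmap_set_range(SketchBitmap *bm, uint64_t offset, uint64_t bytes)
{
    assert(bm && bm->granularity > 0);
    if (bytes == 0) {
        return;
    }
    uint64_t first = offset / bm->granularity;
    uint64_t last = (offset + bytes - 1) / bm->granularity;

    assert(last < 64);
    for (uint64_t i = first; i <= last; i++) {
        bm->bits |= UINT64_C(1) << i;
    }
}

/* OR the allocated extents into the bitmap; existing dirty bits stay set. */
static void bitmap_merge_allocation(SketchBitmap *bm, const Extent *map,
                                    size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (map[i].allocated) {
            bitmap_set_range(bm, map[i].offset, map[i].bytes);
        }
    }
}
```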

It's an RFC:

1. Main question: is it OK as a simple blocking command, even in a
coroutine mode. It's a lot simpler, and it can be simply used in a
transaction with other bitmap commands.



Hm, possibly... I did not follow the discussion of coroutine QMP commands 
closely to know what the qualifying criteria to use them are.

(Any wisdom for me here, Markus?)


2. Transaction support is not here now. Will add in future version, if
general approach is OK.



That should be alright, I think. It means that the operation needs to succeed 
before the transaction returns success, though.

Depending on what else is in the transaction, do we run the risk of a deadlock 
if we need to wait for a coroutine to finish?


3. I just do bdrv_co_enter() / bdrv_co_leave() like it is done in the
only coroutine qmp command - block_resize(). I'm not sure how correct
it is.



See above concern!


4. I don't do any "drain". I think it's not needed, as intended usage
is to merge block-status to _active_ bitmap. So all concurrent
operations will just increase dirtiness of the bitmap and it is OK.



That sounds fine for individual usage, but I can't convince myself it's safe 
for transactions.


qmp_transaction does a drain itself. Still, it's a bit strange that it does 
just a drain and not a drained section around the whole logic.




5. Probably we still need to create some BdrvChild to avoid node resize
during the loop of block-status querying.



I'm less sure that it's OK to cause temporary graph changes during the course 
of a blocking QMP function... but maybe that's OK?

Peter Krempa is the expert to consult on that one.


6. Test is mostly copied from parallels-read-bitmap, I'll refactor it in
next version to avoid copy-paste.

7. Probably patch 01 is better be split into 2-3 patches.

Vladimir Sementsov-Ogievskiy (2):
   qapi: block-dirty-bitmap-merge: support allocation maps
   iotests: add allocation-map-to-bitmap

  qapi/block-core.json  | 31 -
  include/block/block_int.h |  4 ++
  block/dirty-bitmap.c  | 42 
  block/monitor/bitmap-qmp-cmds.c   | 55 +---
  .../tests/allocation-map-to-bitmap    | 64 +++
  .../tests/allocation-map-to-bitmap.out    |  9 +++
  6 files changed, 195 insertions(+), 10 deletions(-)
  create mode 100755 tests/qemu-iotests/tests/allocation-map-to-bitmap
  create mode 100644 tests/qemu-iotests/tests/allocation-map-to-bitmap.out






--
Best regards,
Vladimir



Re: [PATCH 1/2] hw/sparc: Allow building the leon3 machine stand-alone

2021-04-27 Thread Richard Henderson

On 4/27/21 12:26 PM, Philippe Mathieu-Daudé wrote:

When building only the leon3 machine, we get this link failure:

   /usr/bin/ld: target_sparc_win_helper.c.o: in function `cpu_put_psr':
   target/sparc/win_helper.c:91: undefined reference to `cpu_check_irqs'

This is because cpu_check_irqs() is defined in hw/sparc/sun4m.c,
which is only built if the base sun4m machines are built (with
the CONFIG_SUN4M selector).

Fix by moving cpu_check_irqs() out of hw/sparc/sun4m.c and build
it unconditionally.

Signed-off-by: Philippe Mathieu-Daudé
---
  hw/sparc/irq.c   | 61 
  hw/sparc/sun4m.c | 32 ---
  hw/sparc/meson.build |  1 +
  3 files changed, 62 insertions(+), 32 deletions(-)
  create mode 100644 hw/sparc/irq.c


I think this code should be in target/sparc/.  There doesn't seem to be any 
reference to anything outside CPUSPARCState.



r~



Re: [RFC PATCH 2/2] hw/sparc: Allow building without the leon3 machine

2021-04-27 Thread Richard Henderson

On 4/27/21 12:26 PM, Philippe Mathieu-Daudé wrote:

When building without the leon3 machine, we get this link failure:

   /usr/bin/ld: target_sparc_int32_helper.c.o: in function `leon3_irq_manager':
   target/sparc/int32_helper.c:172: undefined reference to `leon3_irq_ack'

This is because the leon3_irq_ack() is declared in hw/sparc/leon3.c,
which is only built when CONFIG_LEON3 is selected.

Fix by moving the leon3_cache_control_int() / leon3_irq_manager()
(which are specific to the leon3 machine) to hw/sparc/leon3.c.
Move the trace events along (but don't rename them).

leon3_irq_ack() is now locally used, declare it static to reduce
its scope.

Signed-off-by: Philippe Mathieu-Daudé
---
RFC: The problem is we have hardware specific code in the
architectural translation code. I wish there was a better
alternative rather than moving this code to hw/sparc/.
---


This one seems dead obvious.  I think this code should have been in 
hw/sparc/leon3.c to begin with.


Reviewed-by: Richard Henderson 

r~



Re: [Bug 1926044] Re: QEMU-user doesn't report HWCAP2_MTE

2021-04-27 Thread Vitaly Buka
Thanks for the quick fix!

On Tue, Apr 27, 2021 at 2:55 PM Richard Henderson <
1926...@bugs.launchpad.net> wrote:

>
> https://patchew.org/QEMU/20210427214108.88503-1-richard.hender...@linaro.org/
>
> This has missed 6.0, but should be acceptable to roll into 6.0.1.
>
> --
> You received this bug notification because you are subscribed to the bug
> report.
> https://bugs.launchpad.net/bugs/1926044
>
> Title:
>   QEMU-user doesn't report HWCAP2_MTE
>
> Status in QEMU:
>   In Progress
>
> Bug description:
>   Reproducible on ffa090bc56e73e287a63261e70ac02c0970be61a
>
>   Host Debian 5.10.24 x86_64 GNU
>
>   Configured with "configure --disable-system --enable-linux-user
>   --static"
>
>   This one works and prints "OK" as expected:
>   clang tests/tcg/aarch64/mte-3.c -target aarch64-linux-gnu
> -fsanitize=memtag -march=armv8+memtag
>   qemu-aarch64 --cpu max -L /usr/aarch64-linux-gnu ./a.out && echo OK
>
>
>   This one fails and prints "0":
>   cat mytest.c
>   #include 
>   #include 
>
>   #ifndef HWCAP2_MTE
>   #define HWCAP2_MTE (1 << 18)
>   #endif
>
>   int main(int ac, char **av)
>   {
>   printf("%d\n", (int)(getauxval(AT_HWCAP2) & HWCAP2_MTE));
>   }
>
>
>   clang mytest.c -target aarch64-linux-gnu  -fsanitize=memtag
> -march=armv8+memtag
>   qemu-aarch64 --cpu max -L /usr/aarch64-linux-gnu ./a.out
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/qemu/+bug/1926044/+subscriptions
>

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1926044

Title:
  QEMU-user doesn't report HWCAP2_MTE

Status in QEMU:
  In Progress

Bug description:
  Reproducible on ffa090bc56e73e287a63261e70ac02c0970be61a

  Host Debian 5.10.24 x86_64 GNU

  Configured with "configure --disable-system --enable-linux-user
  --static"

  This one works and prints "OK" as expected:
  clang tests/tcg/aarch64/mte-3.c -target aarch64-linux-gnu  -fsanitize=memtag -march=armv8+memtag
  qemu-aarch64 --cpu max -L /usr/aarch64-linux-gnu ./a.out && echo OK

  
  This one fails and prints "0":
  cat mytest.c
  #include 
  #include 

  #ifndef HWCAP2_MTE
  #define HWCAP2_MTE (1 << 18)
  #endif

  int main(int ac, char **av)
  {
  printf("%d\n", (int)(getauxval(AT_HWCAP2) & HWCAP2_MTE));
  }

  
  clang mytest.c -target aarch64-linux-gnu  -fsanitize=memtag -march=armv8+memtag
  qemu-aarch64 --cpu max -L /usr/aarch64-linux-gnu ./a.out
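
The two header names in the test program above were lost in this archive. A self-contained variant of the same check might look like the sketch below, assuming a glibc system where getauxval() and AT_HWCAP2 come from <sys/auxv.h>. The helper names hwcap2_has_mte() and report_mte() are hypothetical. On a non-AArch64 host the bit has no MTE meaning, so the printed result is only meaningful on AArch64.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/auxv.h>

#ifndef HWCAP2_MTE
#define HWCAP2_MTE (1UL << 18)   /* same fallback value as the report uses */
#endif

/* Pure helper: does this AT_HWCAP2 word advertise MTE? */
static bool hwcap2_has_mte(unsigned long hwcap2)
{
    return (hwcap2 & HWCAP2_MTE) != 0;
}

/* Query the auxiliary vector of the running process and print the result. */
static void report_mte(FILE *out)
{
    assert(out != NULL);
    unsigned long hwcap2 = getauxval(AT_HWCAP2);

    fprintf(out, "HWCAP2_MTE: %s\n", hwcap2_has_mte(hwcap2) ? "yes" : "no");
}
```

Calling report_mte(stdout) from main() under qemu-aarch64 --cpu max would be expected to print "yes" once the bug is fixed.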

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1926044/+subscriptions



Re: [PATCH v3 14/33] nbd: move connection code from block/nbd to nbd/client-connection

2021-04-27 Thread Roman Kagan
On Fri, Apr 16, 2021 at 11:08:52AM +0300, Vladimir Sementsov-Ogievskiy wrote:
> We now have bs-independent connection API, which consists of four
> functions:
> 
>   nbd_client_connection_new()
>   nbd_client_connection_unref()
>   nbd_co_establish_connection()
>   nbd_co_establish_connection_cancel()
> 
> Move them to a separate file together with NBDClientConnection
> structure which becomes private to the new API.
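
The ownership handoff between the owner and the connect thread can be modelled without threads. The sketch below is an assumption-laden toy: the release function's body is cut off in this archive, so the owner-side logic is inferred from the comment on the detached field, and the mutex that protects these fields in the patch is omitted. Conn, conn_release() and conn_worker_done() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy connection state; the real structure guards these with a mutex. */
typedef struct Conn {
    bool running;   /* worker thread is still running */
    bool detached;  /* owner is gone; worker must free the state */
} Conn;

static int conn_frees;   /* counts frees, for demonstration only */

static Conn *conn_new(void)
{
    Conn *c = calloc(1, sizeof(*c));

    assert(c != NULL);
    return c;
}

static void conn_do_free(Conn *c)
{
    free(c);
    conn_frees++;
}

/* Owner side: free now if idle, otherwise hand ownership to the worker. */
static void conn_release(Conn *c)
{
    if (!c) {
        return;
    }
    assert(!c->detached);
    if (c->running) {
        c->detached = true;
        return;
    }
    conn_do_free(c);
}

/* Worker side: called at the end of the connect attempt. */
static void conn_worker_done(Conn *c)
{
    assert(c->running);
    c->running = false;
    if (c->detached) {
        conn_do_free(c);
    }
}
```

Exactly one side frees the state in every interleaving, which is the property the detached flag exists to provide.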
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  include/block/nbd.h |  11 +++
>  block/nbd.c | 187 ---
>  nbd/client-connection.c | 212 
>  nbd/meson.build |   1 +
>  4 files changed, 224 insertions(+), 187 deletions(-)
>  create mode 100644 nbd/client-connection.c
> 
> diff --git a/include/block/nbd.h b/include/block/nbd.h
> index 5f34d23bb0..57381be76f 100644
> --- a/include/block/nbd.h
> +++ b/include/block/nbd.h
> @@ -406,4 +406,15 @@ const char *nbd_info_lookup(uint16_t info);
>  const char *nbd_cmd_lookup(uint16_t info);
>  const char *nbd_err_lookup(int err);
>  
> +/* nbd/client-connection.c */
> +typedef struct NBDClientConnection NBDClientConnection;
> +
> +NBDClientConnection *nbd_client_connection_new(const SocketAddress *saddr);
> +void nbd_client_connection_release(NBDClientConnection *conn);
> +
> +QIOChannelSocket *coroutine_fn
> +nbd_co_establish_connection(NBDClientConnection *conn, Error **errp);
> +
> +void coroutine_fn nbd_co_establish_connection_cancel(NBDClientConnection *conn);
> +
>  #endif
> diff --git a/block/nbd.c b/block/nbd.c
> index 8531d019b2..9bd68dcf10 100644
> --- a/block/nbd.c
> +++ b/block/nbd.c
> @@ -66,24 +66,6 @@ typedef enum NBDClientState {
>  NBD_CLIENT_QUIT
>  } NBDClientState;
>  
> -typedef struct NBDClientConnection {
> -/* Initialization constants */
> -SocketAddress *saddr; /* address to connect to */
> -
> -/*
> - * Result of last attempt. Valid in FAIL and SUCCESS states.
> - * If you want to steal error, don't forget to set pointer to NULL.
> - */
> -QIOChannelSocket *sioc;
> -Error *err;
> -
> -QemuMutex mutex;
> -/* All further fields are protected by mutex */
> -bool running; /* thread is running now */
> -bool detached; /* thread is detached and should cleanup the state */
> -Coroutine *wait_co; /* nbd_co_establish_connection() wait in yield() */
> -} NBDClientConnection;
> -
>  typedef struct BDRVNBDState {
>  QIOChannelSocket *sioc; /* The master data channel */
>  QIOChannel *ioc; /* The current I/O channel which may differ (eg TLS) */
> @@ -118,12 +100,8 @@ typedef struct BDRVNBDState {
>  NBDClientConnection *conn;
>  } BDRVNBDState;
>  
> -static void nbd_client_connection_release(NBDClientConnection *conn);
>  static int nbd_establish_connection(BlockDriverState *bs, SocketAddress *saddr,
>  Error **errp);
> -static coroutine_fn QIOChannelSocket *
> -nbd_co_establish_connection(NBDClientConnection *conn, Error **errp);
> -static void nbd_co_establish_connection_cancel(NBDClientConnection *conn);
>  static int nbd_client_handshake(BlockDriverState *bs, Error **errp);
>  static void nbd_yank(void *opaque);
>  
> @@ -340,171 +318,6 @@ static bool nbd_client_connecting_wait(BDRVNBDState *s)
>  return qatomic_load_acquire(&s->state) == NBD_CLIENT_CONNECTING_WAIT;
>  }
>  
> -static NBDClientConnection *
> -nbd_client_connection_new(const SocketAddress *saddr)
> -{
> -NBDClientConnection *conn = g_new(NBDClientConnection, 1);
> -
> -*conn = (NBDClientConnection) {
> -.saddr = QAPI_CLONE(SocketAddress, saddr),
> -};
> -
> -qemu_mutex_init(&conn->mutex);
> -
> -return conn;
> -}
> -
> -static void nbd_client_connection_do_free(NBDClientConnection *conn)
> -{
> -if (conn->sioc) {
> -qio_channel_close(QIO_CHANNEL(conn->sioc), NULL);
> -object_unref(OBJECT(conn->sioc));
> -}
> -error_free(conn->err);
> -qapi_free_SocketAddress(conn->saddr);
> -g_free(conn);
> -}
> -
> -static void *connect_thread_func(void *opaque)
> -{
> -NBDClientConnection *conn = opaque;
> -bool do_free;
> -int ret;
> -
> -conn->sioc = qio_channel_socket_new();
> -
> -error_free(conn->err);
> -conn->err = NULL;
> -ret = qio_channel_socket_connect_sync(conn->sioc, conn->saddr, &conn->err);
> -if (ret < 0) {
> -object_unref(OBJECT(conn->sioc));
> -conn->sioc = NULL;
> -}
> -
> -qemu_mutex_lock(&conn->mutex);
> -
> -assert(conn->running);
> -conn->running = false;
> -if (conn->wait_co) {
> -aio_co_schedule(NULL, conn->wait_co);
> -conn->wait_co = NULL;
> -}
> -do_free = conn->detached;
> -
> -qemu_mutex_unlock(&conn->mutex);
> -
> -if (do_free) {
> -nbd_client_connection_do_free(conn);
> -}
> -
> -return NULL;
> -}
> -
> -static void nbd_client_connection_release(NBDClientConnec

Re: [PATCH] microvm: Enable hotplug of pcie

2021-04-27 Thread Michael S. Tsirkin
On Tue, Apr 27, 2021 at 09:04:27PM +0800, suyuheng wrote:
> From: Yuheng Su 
> 
> Signed-off-by: Yuheng Su 


seems to be extended config space as opposed to hotplug ...

> ---
>  hw/i386/acpi-microvm.c | 11 +++
>  1 file changed, 11 insertions(+)
> 
> diff --git a/hw/i386/acpi-microvm.c b/hw/i386/acpi-microvm.c
> index ccd3303aac..4f32bf512f 100644
> --- a/hw/i386/acpi-microvm.c
> +++ b/hw/i386/acpi-microvm.c
> @@ -26,6 +26,7 @@
>  
>  #include "exec/memory.h"
>  #include "hw/acpi/acpi.h"
> +#include "hw/acpi/pci.h"
>  #include "hw/acpi/aml-build.h"
>  #include "hw/acpi/bios-linker-loader.h"
>  #include "hw/acpi/generic_event_device.h"
> @@ -209,6 +210,16 @@ static void acpi_build_microvm(AcpiBuildTables *tables,
>  ACPI_DEVICE_IF(x86ms->acpi_dev), x86ms->oem_id,
>  x86ms->oem_table_id);
>  
> +acpi_add_table(table_offsets, tables_blob);
> +{
> +AcpiMcfgInfo mcfg = {
> +   .base = mms->gpex.ecam.base,
> +   .size = mms->gpex.ecam.size,
> +};
> +build_mcfg(tables_blob, tables->linker, &mcfg, x86ms->oem_id,
> +   x86ms->oem_table_id);
> +}
> +
>  xsdt = tables_blob->len;
>  build_xsdt(tables_blob, tables->linker, table_offsets, x86ms->oem_id,
> x86ms->oem_table_id);
> -- 
> 2.11.0




Re: [PATCH v3 13/33] block/nbd: introduce nbd_client_connection_release()

2021-04-27 Thread Roman Kagan
On Fri, Apr 16, 2021 at 11:08:51AM +0300, Vladimir Sementsov-Ogievskiy wrote:
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  block/nbd.c | 43 ++-
>  1 file changed, 26 insertions(+), 17 deletions(-)
> 
> diff --git a/block/nbd.c b/block/nbd.c
> index 21a4039359..8531d019b2 100644
> --- a/block/nbd.c
> +++ b/block/nbd.c
> @@ -118,7 +118,7 @@ typedef struct BDRVNBDState {
>  NBDClientConnection *conn;
>  } BDRVNBDState;
>  
> -static void nbd_free_connect_thread(NBDClientConnection *conn);
> +static void nbd_client_connection_release(NBDClientConnection *conn);
>  static int nbd_establish_connection(BlockDriverState *bs, SocketAddress *saddr,
>  Error **errp);
>  static coroutine_fn QIOChannelSocket *
> @@ -130,20 +130,9 @@ static void nbd_yank(void *opaque);
>  static void nbd_clear_bdrvstate(BlockDriverState *bs)
>  {
>  BDRVNBDState *s = (BDRVNBDState *)bs->opaque;
> -NBDClientConnection *conn = s->conn;
> -bool do_free;
> -
> -qemu_mutex_lock(&conn->mutex);
> -if (conn->running) {
> -conn->detached = true;
> -}
> -do_free = !conn->running && !conn->detached;
> -qemu_mutex_unlock(&conn->mutex);
>  
> -/* the runaway thread will clean it up itself */
> -if (do_free) {
> -nbd_free_connect_thread(conn);
> -}
> +nbd_client_connection_release(s->conn);
> +s->conn = NULL;
>  
>  yank_unregister_instance(BLOCKDEV_YANK_INSTANCE(bs->node_name));
>  
> @@ -365,7 +354,7 @@ nbd_client_connection_new(const SocketAddress *saddr)
>  return conn;
>  }
>  
> -static void nbd_free_connect_thread(NBDClientConnection *conn)
> +static void nbd_client_connection_do_free(NBDClientConnection *conn)
>  {
>  if (conn->sioc) {
>  qio_channel_close(QIO_CHANNEL(conn->sioc), NULL);
> @@ -379,8 +368,8 @@ static void nbd_free_connect_thread(NBDClientConnection *conn)
>  static void *connect_thread_func(void *opaque)
>  {
>  NBDClientConnection *conn = opaque;
> +bool do_free;
>  int ret;
> -bool do_free = false;
>  

This hunk belongs in patch 8.

Roman.

>  conn->sioc = qio_channel_socket_new();
>  
> @@ -405,12 +394,32 @@ static void *connect_thread_func(void *opaque)
>  qemu_mutex_unlock(&conn->mutex);
>  
>  if (do_free) {
> -nbd_free_connect_thread(conn);
> +nbd_client_connection_do_free(conn);
>  }
>  
>  return NULL;
>  }
>  
> +static void nbd_client_connection_release(NBDClientConnection *conn)
> +{
> +bool do_free;
> +
> +if (!conn) {
> +return;
> +}
> +
> +qemu_mutex_lock(&conn->mutex);
> +if (conn->running) {
> +conn->detached = true;
> +}
> +do_free = !conn->running && !conn->detached;
> +qemu_mutex_unlock(&conn->mutex);
> +
> +if (do_free) {
> +nbd_client_connection_do_free(conn);
> +}
> +}
> +
>  /*
>   * Get a new connection in context of @conn:
>   *   if thread is running, wait for completion
> -- 
> 2.29.2
> 



Re: [PATCH v3 11/33] block/nbd: rename NBDConnectThread to NBDClientConnection

2021-04-27 Thread Roman Kagan
On Fri, Apr 16, 2021 at 11:08:49AM +0300, Vladimir Sementsov-Ogievskiy wrote:
> We are going to move connection code to own file and want clear names
> and APIs.
> 
> The structure is shared between user and (possibly) several runs of
> connect-thread. So it's wrong to call it "thread". Let's rename to
> something more generic.
> 
> Appropriately rename connect_thread and thr variables to conn.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  block/nbd.c | 137 ++--
>  1 file changed, 68 insertions(+), 69 deletions(-)

Reviewed-by: Roman Kagan 



Re: [PATCH v3 08/33] block/nbd: drop thr->state

2021-04-27 Thread Roman Kagan
On Fri, Apr 16, 2021 at 11:08:46AM +0300, Vladimir Sementsov-Ogievskiy wrote:
> We don't need all these states. The code refactored to use two boolean
> variables looks simpler.

Indeed.

> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  block/nbd.c | 125 ++--
>  1 file changed, 34 insertions(+), 91 deletions(-)
> 
> diff --git a/block/nbd.c b/block/nbd.c
> index e1f39eda6c..2b26a033a4 100644
> --- a/block/nbd.c
> +++ b/block/nbd.c
> @@ -66,24 +66,6 @@ typedef enum NBDClientState {
>  NBD_CLIENT_QUIT
>  } NBDClientState;
>  
> -typedef enum NBDConnectThreadState {
> -/* No thread, no pending results */
> -CONNECT_THREAD_NONE,
> -
> -/* Thread is running, no results for now */
> -CONNECT_THREAD_RUNNING,
> -
> -/*
> - * Thread is running, but requestor exited. Thread should close
> - * the new socket and free the connect state on exit.
> - */
> -CONNECT_THREAD_RUNNING_DETACHED,
> -
> -/* Thread finished, results are stored in a state */
> -CONNECT_THREAD_FAIL,
> -CONNECT_THREAD_SUCCESS
> -} NBDConnectThreadState;
> -
>  typedef struct NBDConnectThread {
>  /* Initialization constants */
>  SocketAddress *saddr; /* address to connect to */
> @@ -97,7 +79,8 @@ typedef struct NBDConnectThread {
>  
>  QemuMutex mutex;
>  /* All further fields are protected by mutex */
> -NBDConnectThreadState state; /* current state of the thread */
> +bool running; /* thread is running now */
> +bool detached; /* thread is detached and should cleanup the state */
>  Coroutine *wait_co; /* nbd_co_establish_connection() wait in yield() */
>  } NBDConnectThread;
>  
> @@ -147,17 +130,17 @@ static void nbd_clear_bdrvstate(BlockDriverState *bs)
>  {
>  BDRVNBDState *s = (BDRVNBDState *)bs->opaque;
>  NBDConnectThread *thr = s->connect_thread;
> -bool thr_running;
> +bool do_free;
>  
>  qemu_mutex_lock(&thr->mutex);
> -thr_running = thr->state == CONNECT_THREAD_RUNNING;
> -if (thr_running) {
> -thr->state = CONNECT_THREAD_RUNNING_DETACHED;
> +if (thr->running) {
> +thr->detached = true;
>  }
> +do_free = !thr->running && !thr->detached;

This is redundant.  You can unconditionally set ->detached and only
depend on ->running for the rest of this function.
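
A minimal sketch of that simplification, outside QEMU and under stated
assumptions: ConnState, conn_new() and conn_release() are hypothetical
stand-ins for NBDConnectThread and the cleanup in nbd_clear_bdrvstate(), and a
pthread mutex replaces QemuMutex. The exiting thread is assumed to free the
state itself when it finds ->detached set, as connect_thread_func() does in
this patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for NBDConnectThread; only the fields used here. */
typedef struct ConnState {
    pthread_mutex_t mutex;
    /* All further fields are protected by mutex */
    bool running;   /* thread is running now */
    bool detached;  /* owner is gone; thread frees the state on exit */
} ConnState;

/* Hypothetical constructor, only needed to exercise conn_release(). */
static ConnState *conn_new(bool running)
{
    ConnState *c = calloc(1, sizeof(*c));

    assert(c);
    pthread_mutex_init(&c->mutex, NULL);
    c->running = running;
    return c;
}

/*
 * Owner side of the release: mark the state detached unconditionally and
 * decide on ->running alone.  Returns true if the state was freed here,
 * false if a still-running thread is expected to free it on exit.
 */
static bool conn_release(ConnState *c)
{
    bool running;

    pthread_mutex_lock(&c->mutex);
    c->detached = true;
    running = c->running;
    pthread_mutex_unlock(&c->mutex);

    if (!running) {
        pthread_mutex_destroy(&c->mutex);
        free(c);
    }
    return !running;
}
```

With ->detached set unconditionally, no separate do_free expression combining
both flags is needed on the owner side.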

>  qemu_mutex_unlock(&thr->mutex);
>  
>  /* the runaway thread will clean it up itself */
> -if (!thr_running) {
> +if (do_free) {
>  nbd_free_connect_thread(thr);
>  }
>  
> @@ -373,7 +356,6 @@ static void nbd_init_connect_thread(BDRVNBDState *s)
>  
>  *s->connect_thread = (NBDConnectThread) {
>  .saddr = QAPI_CLONE(SocketAddress, s->saddr),
> -.state = CONNECT_THREAD_NONE,
>  };
>  
>  qemu_mutex_init(&s->connect_thread->mutex);
> @@ -408,20 +390,13 @@ static void *connect_thread_func(void *opaque)
>  
>  qemu_mutex_lock(&thr->mutex);
>  
> -switch (thr->state) {
> -case CONNECT_THREAD_RUNNING:
> -thr->state = ret < 0 ? CONNECT_THREAD_FAIL : CONNECT_THREAD_SUCCESS;
> -if (thr->wait_co) {
> -aio_co_schedule(NULL, thr->wait_co);
> -thr->wait_co = NULL;
> -}
> -break;
> -case CONNECT_THREAD_RUNNING_DETACHED:
> -do_free = true;
> -break;
> -default:
> -abort();
> +assert(thr->running);
> +thr->running = false;
> +if (thr->wait_co) {
> +aio_co_schedule(NULL, thr->wait_co);
> +thr->wait_co = NULL;
>  }
> +do_free = thr->detached;
>  
>  qemu_mutex_unlock(&thr->mutex);
>  
> @@ -435,36 +410,24 @@ static void *connect_thread_func(void *opaque)
>  static int coroutine_fn
>  nbd_co_establish_connection(BlockDriverState *bs, Error **errp)
>  {
> -int ret;
>  QemuThread thread;
>  BDRVNBDState *s = bs->opaque;
>  NBDConnectThread *thr = s->connect_thread;
>  
> +assert(!s->sioc);
> +
>  qemu_mutex_lock(&thr->mutex);
>  
> -switch (thr->state) {
> -case CONNECT_THREAD_FAIL:
> -case CONNECT_THREAD_NONE:
> +if (!thr->running) {
> +if (thr->sioc) {
> +/* Previous attempt finally succeeded in background */
> +goto out;
> +}
> +thr->running = true;
>  error_free(thr->err);
>  thr->err = NULL;
> -thr->state = CONNECT_THREAD_RUNNING;
>  qemu_thread_create(&thread, "nbd-connect",
> connect_thread_func, thr, QEMU_THREAD_DETACHED);
> -break;
> -case CONNECT_THREAD_SUCCESS:
> -/* Previous attempt finally succeeded in background */
> -thr->state = CONNECT_THREAD_NONE;
> -s->sioc = thr->sioc;
> -thr->sioc = NULL;
> -yank_register_function(BLOCKDEV_YANK_INSTANCE(bs->node_name),
> -   nbd_yank, bs);
> -qemu_mutex_unlock(&thr->mutex);
> -return 0;
> -case CONNECT_THREAD_RUNNIN

Re: [PATCH v8 0/6] RISC-V Pointer Masking implementation

2021-04-27 Thread no-reply
Patchew URL: 
https://patchew.org/QEMU/20210427220615.12763-1-space.monkey.deliv...@gmail.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20210427220615.12763-1-space.monkey.deliv...@gmail.com
Subject: [PATCH v8 0/6] RISC-V Pointer Masking implementation

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag] patchew/20210427220615.12763-1-space.monkey.deliv...@gmail.com -> patchew/20210427220615.12763-1-space.monkey.deliv...@gmail.com
Switched to a new branch 'test'
0d3c038 Allow experimental J-ext to be turned on
2e8b023 Implement address masking functions required for RISC-V Pointer Masking extension
9cf6cf0 Support pointer masking for RISC-V for i/c/f/d/a types of instructions
583ea8a Print new PM CSRs in QEMU logs
d674a6d Support CSRs required for RISC-V PM extension except for the h-mode
8f777b4 Add J-extension into RISC-V

=== OUTPUT BEGIN ===
1/6 Checking commit 8f777b425749 (Add J-extension into RISC-V)
2/6 Checking commit d674a6dc8388 (Support CSRs required for RISC-V PM extension except for the h-mode)
ERROR: open brace '{' following function declarations go on the next line
#151: FILE: target/riscv/csr.c:193:
+static int pointer_masking(CPURISCVState *env, int csrno) {

WARNING: line over 80 characters
#386: FILE: target/riscv/csr.c:1724:
+[CSR_UMTE] = { "umte", pointer_masking, read_umte, write_umte },

WARNING: line over 80 characters
#387: FILE: target/riscv/csr.c:1725:
+[CSR_UPMMASK] = { "upmmask", pointer_masking, read_upmmask, write_upmmask },

WARNING: line over 80 characters
#388: FILE: target/riscv/csr.c:1726:
+[CSR_UPMBASE] = { "upmbase", pointer_masking, read_upmbase, write_upmbase },

WARNING: line over 80 characters
#390: FILE: target/riscv/csr.c:1728:
+[CSR_MMTE] = { "mmte", pointer_masking, read_mmte, write_mmte },

WARNING: line over 80 characters
#391: FILE: target/riscv/csr.c:1729:
+[CSR_MPMMASK] = { "mpmmask", pointer_masking, read_mpmmask, write_mpmmask },

WARNING: line over 80 characters
#392: FILE: target/riscv/csr.c:1730:
+[CSR_MPMBASE] = { "mpmbase", pointer_masking, read_mpmbase, write_mpmbase },

WARNING: line over 80 characters
#394: FILE: target/riscv/csr.c:1732:
+[CSR_SMTE] = { "smte", pointer_masking, read_smte, write_smte },

WARNING: line over 80 characters
#395: FILE: target/riscv/csr.c:1733:
+[CSR_SPMMASK] = { "spmmask", pointer_masking, read_spmmask, write_spmmask },

WARNING: line over 80 characters
#396: FILE: target/riscv/csr.c:1734:
+[CSR_SPMBASE] = { "spmbase", pointer_masking, read_spmbase, write_spmbase },

total: 1 errors, 9 warnings, 362 lines checked

Patch 2/6 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

3/6 Checking commit 583ea8a026a6 (Print new PM CSRs in QEMU logs)
4/6 Checking commit 9cf6cf0a67f9 (Support pointer masking for RISC-V for i/c/f/d/a types of instructions)
5/6 Checking commit 2e8b02377249 (Implement address masking functions required for RISC-V Pointer Masking extension)
6/6 Checking commit 0d3c0387b1bb (Allow experimental J-ext to be turned on)
=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/20210427220615.12763-1-space.monkey.deliv...@gmail.com/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

Re: [RFC PATCH] target/mips: Allow building without Inter-Thread Communication hardware

2021-04-27 Thread Richard Henderson

On 4/27/21 12:11 PM, Philippe Mathieu-Daudé wrote:

The Inter-Thread Communication unit (TYPE_MIPS_ITU) is an optional
device that is only selected by a few machines. However it goes
deep into the translation code, as the MTC0/MTHC0 SAAR helpers
call itc_reconfigure().

When building with no machine selecting the ITU component (which
is implemented in hw/misc/mips_itu.c), we get the following link
failure:

   /usr/bin/ld: target_mips_cp0_helper.c.o: in function `helper_mtc0_saar':
   target/mips/cp0_helper.c:1118: undefined reference to `itc_reconfigure'
   /usr/bin/ld: target_mips_cp0_helper.c.o: in function `helper_mthc0_saar':
   target/mips/cp0_helper.c:1135: undefined reference to `itc_reconfigure'

Fix by adding a stub, built when the ITU isn't selected.

Signed-off-by: Philippe Mathieu-Daudé
---
RFC because this is too much Meson machinery for my taste.
But how else can we deal with such architectural devices?

To reproduce:

$ echo CONFIG_JAZZ=y > default-configs/devices/mips64el-softmmu.mak
$ echo CONFIG_SEMIHOSTING=y >> default-configs/devices/mips64el-softmmu.mak
$ configure --without-default-devices
$ ninja qemu-system-mips64el
$ ./qemu-system-mips64el -M magnum -S
---
  target/mips/cp0_itu-stub.c | 15 +++
  target/mips/meson.build|  3 +++
  2 files changed, 18 insertions(+)
  create mode 100644 target/mips/cp0_itu-stub.c


Perhaps use __attribute__((weak)) on itc_reconfigure?  Then you don't need the 
stub at all.  You're already protecting the actual call, so there should be no 
change needed there.


We're not using weak so far, but as far as I can tell this is supported by gcc 
on windows as well.
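
A minimal sketch of the weak-reference idea, assuming an ELF target with gcc or
clang. Only the name itc_reconfigure comes from the patch; the ITUState type,
the signature shown here and saar_write_notify() are hypothetical and only
illustrate the pattern. When no object file defines the function, the weak
reference resolves to a null address, so the link succeeds as long as the call
stays guarded.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opaque device type standing in for the ITU state. */
typedef struct ITUState ITUState;

/*
 * Weak declaration: if nothing in the link defines itc_reconfigure(),
 * the symbol resolves to NULL instead of causing an undefined reference.
 * The signature here is an assumption made for this sketch.
 */
void itc_reconfigure(ITUState *itu) __attribute__((weak));

/*
 * Hypothetical caller.  Returns 1 if the device was reconfigured, 0 if
 * there is no device or the ITU code was not built in.
 */
static int saar_write_notify(ITUState *itu)
{
    if (itu && itc_reconfigure) {
        itc_reconfigure(itu);
        return 1;
    }
    return 0;
}
```

The existing guard on the device pointer already keeps the call from being
reached on machines without an ITU; the extra check on the function address
only makes the sketch safe on its own.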



r~



[PATCH v8 2/6] [RISCV_PM] Support CSRs required for RISC-V PM extension except for the h-mode

2021-04-27 Thread Alexey Baturo
Signed-off-by: Alexey Baturo 
---
 target/riscv/cpu.c  |   5 +
 target/riscv/cpu.h  |  12 ++
 target/riscv/cpu_bits.h |  66 +++
 target/riscv/csr.c  | 239 
 4 files changed, 322 insertions(+)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 7d6ed80f6b..c04911ec05 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -473,6 +473,11 @@ static void riscv_cpu_realize(DeviceState *dev, Error **errp)
 if (cpu->cfg.ext_h) {
 target_misa |= RVH;
 }
+if (cpu->cfg.ext_j) {
+#ifndef CONFIG_USER_ONLY
+env->mmte |= PM_EXT_INITIAL;
+#endif
+}
 if (cpu->cfg.ext_v) {
 target_misa |= RVV;
 if (!is_power_of_2(cpu->cfg.vlen)) {
diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 0ea9fc65c8..19aa3b4769 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -238,6 +238,18 @@ struct CPURISCVState {
 
 /* True if in debugger mode.  */
 bool debugger;
+
+/*
+ * CSRs for PM
+ * TODO: move these csr to appropriate groups
+ */
+target_ulong mmte;
+target_ulong mpmmask;
+target_ulong mpmbase;
+target_ulong spmmask;
+target_ulong spmbase;
+target_ulong upmmask;
+target_ulong upmbase;
 #endif
 
 float_status fp_status;
diff --git a/target/riscv/cpu_bits.h b/target/riscv/cpu_bits.h
index caf4599207..f8e7cdb99b 100644
--- a/target/riscv/cpu_bits.h
+++ b/target/riscv/cpu_bits.h
@@ -354,6 +354,21 @@
 #define CSR_MHPMCOUNTER30H  0xb9e
 #define CSR_MHPMCOUNTER31H  0xb9f
 
+/* Custom user register */
+#define CSR_UMTE            0x8c0
+#define CSR_UPMMASK         0x8c1
+#define CSR_UPMBASE         0x8c2
+
+/* Custom machine register */
+#define CSR_MMTE            0x7c0
+#define CSR_MPMMASK         0x7c1
+#define CSR_MPMBASE         0x7c2
+
+/* Custom supervisor register */
+#define CSR_SMTE            0x9c0
+#define CSR_SPMMASK         0x9c1
+#define CSR_SPMBASE         0x9c2
+
 /* Legacy Machine Protection and Translation (priv v1.9.1) */
 #define CSR_MBASE   0x380
 #define CSR_MBOUND  0x381
@@ -592,4 +607,55 @@
 #define MIE_UTIE   (1 << IRQ_U_TIMER)
 #define MIE_SSIE   (1 << IRQ_S_SOFT)
 #define MIE_USIE   (1 << IRQ_U_SOFT)
+
+/* general mte CSR bits*/
+#define PM_ENABLE   0x0001ULL
+#define PM_CURRENT  0x0002ULL
+#define PM_XS_MASK  0x0003ULL
+
+/* PM XS bits values */
+#define PM_EXT_DISABLE  0x0000ULL
+#define PM_EXT_INITIAL  0x0001ULL
+#define PM_EXT_CLEAN0x0002ULL
+#define PM_EXT_DIRTY0x0003ULL
+
+/* offsets for every pair of control bits per each priv level */
+#define XS_OFFSET    0ULL
+#define U_OFFSET     2ULL
+#define S_OFFSET     4ULL
+#define M_OFFSET     6ULL
+
+#define PM_XS_BITS   (PM_XS_MASK << XS_OFFSET)
+#define U_PM_ENABLE  (PM_ENABLE  << U_OFFSET)
+#define U_PM_CURRENT (PM_CURRENT << U_OFFSET)
+#define S_PM_ENABLE  (PM_ENABLE  << S_OFFSET)
+#define S_PM_CURRENT (PM_CURRENT << S_OFFSET)
+#define M_PM_ENABLE  (PM_ENABLE  << M_OFFSET)
+
+/* mmte CSR bits */
+#define MMTE_PM_XS_BITS     PM_XS_BITS
+#define MMTE_U_PM_ENABLE    U_PM_ENABLE
+#define MMTE_U_PM_CURRENT   U_PM_CURRENT
+#define MMTE_S_PM_ENABLE    S_PM_ENABLE
+#define MMTE_S_PM_CURRENT   S_PM_CURRENT
+#define MMTE_M_PM_ENABLE    M_PM_ENABLE
+#define MMTE_MASK   (MMTE_U_PM_ENABLE | MMTE_U_PM_CURRENT | \
+                     MMTE_S_PM_ENABLE | MMTE_S_PM_CURRENT | \
+                     MMTE_M_PM_ENABLE | MMTE_PM_XS_BITS)
+
+/* smte CSR bits */
+#define SMTE_PM_XS_BITS     PM_XS_BITS
+#define SMTE_U_PM_ENABLE    U_PM_ENABLE
+#define SMTE_U_PM_CURRENT   U_PM_CURRENT
+#define SMTE_S_PM_ENABLE    S_PM_ENABLE
+#define SMTE_S_PM_CURRENT   S_PM_CURRENT
+#define SMTE_MASK   (SMTE_U_PM_ENABLE | SMTE_U_PM_CURRENT | \
+                     SMTE_S_PM_ENABLE | SMTE_S_PM_CURRENT | \
+                     SMTE_PM_XS_BITS)
+
+/* umte CSR bits */
+#define UMTE_U_PM_ENABLE    U_PM_ENABLE
+#define UMTE_U_PM_CURRENT   U_PM_CURRENT
+#define UMTE_MASK   (UMTE_U_PM_ENABLE | UMTE_U_PM_CURRENT)
+
 #endif
diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index d2585395bf..829e043ef9 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -184,6 +184,38 @@ static int hmode32(CPURISCVState *env, int csrno)
 
 }
 
+static int umode(CPURISCVState *env, int csrno)
+{
+return -!riscv_has_ext(env, RVU);
+}
+
+/* Checks if PointerMasking registers could be accessed */
+static int pointer_masking(CPURISCVState *env, int csrno) {
+/* Check if j-ext is present */
+int j_check = -!riscv_has_ext(env, RVJ);
+int mode_check = 0;
+int csr_priv = get_field(csrno, 0x300);
+/* check if particular mode is present */
+switch (csr_priv) {
+case PRV_M:
+mode_check = any(env, csrno);
+break;
+case PRV_S:
+mode_check = smode(env, 

[PATCH v8 0/6] RISC-V Pointer Masking implementation

2021-04-27 Thread Alexey Baturo
v8:
Hi folks,

Finally we were able to assign a v0.1 draft for the Pointer Masking extension
for RISC-V:
https://github.com/riscv/riscv-j-extension/blob/master/pointer-masking-proposal.adoc
This is meant to be the first series of patches with initial support for PM.
It still lacks support for hypervisor mode, vector loads/stores and some
other features, and it uses temporary CSR numbers (they are to be assigned by
the committee a bit later).
With this patch series we were able to run a bunch of tests with HWASAN checks
enabled.

I hope I've managed to address @Alistair's previous comments in this version.

Thanks!

v7:
Hi folks,

Sorry it took me almost 3 months to reply and provide fixes: it was a really
busy EOY.
This series contains a fix for @Alistair's suggestion on enabling J-ext.

As for @Richard's comments:
- Indeed, I missed appending the Reviewed-by tags to the approved commits. I've
now restored them, except for the fourth commit. @Richard, could you please
tell me whether you think it's still OK to commit it as is, or should I support
masking of memory ops for RVV first?
- These patches don't have any support for load/store masking for the RVV and
RVH extensions, so in particular there is no support for the special
Hypervisor loads/stores.

If this patch series is accepted, I think my next steps would be to:
- Support PM for memory operations for RVV
- Add proper CSRs and support PM for memory operations in Hypervisor mode
- Support address wrapping on unaligned accesses, as @Richard mentioned
previously

Thanks!

Alexey Baturo (5):
  [RISCV_PM] Add J-extension into RISC-V
  [RISCV_PM] Support CSRs required for RISC-V PM extension except for
the h-mode
  [RISCV_PM] Print new PM CSRs in QEMU logs
  [RISCV_PM] Support pointer masking for RISC-V for i/c/f/d/a types of
instructions
  [RISCV_PM] Allow experimental J-ext to be turned on

Anatoly Parshintsev (1):
  [RISCV_PM] Implement address masking functions required for RISC-V
Pointer Masking extension

 target/riscv/cpu.c  |  32 
 target/riscv/cpu.h  |  34 
 target/riscv/cpu_bits.h |  66 +++
 target/riscv/csr.c  | 236 
 target/riscv/insn_trans/trans_rva.c.inc |   3 +
 target/riscv/insn_trans/trans_rvd.c.inc |   2 +
 target/riscv/insn_trans/trans_rvf.c.inc |   2 +
 target/riscv/insn_trans/trans_rvi.c.inc |   2 +
 target/riscv/translate.c|  42 +
 9 files changed, 419 insertions(+)

-- 
2.20.1




[PATCH v8 6/6] [RISCV_PM] Allow experimental J-ext to be turned on

2021-04-27 Thread Alexey Baturo
Signed-off-by: Alexey Baturo 
Reviewed-by: Alistair Francis 
---
 target/riscv/cpu.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 0682410f5d..fecc64d7ba 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -502,6 +502,7 @@ static void riscv_cpu_realize(DeviceState *dev, Error **errp)
 #ifndef CONFIG_USER_ONLY
 env->mmte |= PM_EXT_INITIAL;
 #endif
+target_misa |= RVJ;
 }
 if (cpu->cfg.ext_v) {
 target_misa |= RVV;
@@ -574,6 +575,7 @@ static Property riscv_cpu_properties[] = {
 DEFINE_PROP_BOOL("u", RISCVCPU, cfg.ext_u, true),
 /* This is experimental so mark with 'x-' */
 DEFINE_PROP_BOOL("x-h", RISCVCPU, cfg.ext_h, false),
+DEFINE_PROP_BOOL("x-j", RISCVCPU, cfg.ext_j, false),
 DEFINE_PROP_BOOL("x-v", RISCVCPU, cfg.ext_v, false),
 DEFINE_PROP_BOOL("Counters", RISCVCPU, cfg.ext_counters, true),
 DEFINE_PROP_BOOL("Zifencei", RISCVCPU, cfg.ext_ifencei, true),
-- 
2.20.1




[PATCH v8 5/6] [RISCV_PM] Implement address masking functions required for RISC-V Pointer Masking extension

2021-04-27 Thread Alexey Baturo
From: Anatoly Parshintsev 

Signed-off-by: Anatoly Parshintsev 
Reviewed-by: Richard Henderson 
---
 target/riscv/cpu.h   | 20 
 target/riscv/translate.c | 36 ++--
 2 files changed, 54 insertions(+), 2 deletions(-)

diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 19aa3b4769..2edfc59712 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -407,6 +407,8 @@ FIELD(TB_FLAGS, SEW, 5, 3)
 FIELD(TB_FLAGS, VILL, 8, 1)
 /* Is a Hypervisor instruction load/store allowed? */
 FIELD(TB_FLAGS, HLSX, 9, 1)
+/* If PointerMasking should be applied */
+FIELD(TB_FLAGS, PM_ENABLED, 10, 1)
 
 bool riscv_cpu_is_32bit(CPURISCVState *env);
 
@@ -464,6 +466,24 @@ static inline void cpu_get_tb_cpu_state(CPURISCVState *env, target_ulong *pc,
 flags = FIELD_DP32(flags, TB_FLAGS, HLSX, 1);
 }
 }
+if (riscv_has_ext(env, RVJ)) {
+int priv = cpu_mmu_index(env, false);
+bool pm_enabled = false;
+switch (priv) {
+case PRV_U:
+pm_enabled = env->mmte & U_PM_ENABLE;
+break;
+case PRV_S:
+pm_enabled = env->mmte & S_PM_ENABLE;
+break;
+case PRV_M:
+pm_enabled = env->mmte & M_PM_ENABLE;
+break;
+default:
+g_assert_not_reached();
+}
+flags = FIELD_DP32(flags, TB_FLAGS, PM_ENABLED, pm_enabled);
+}
 #endif
 
 *pflags = flags;
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 2e815a5912..37706d56d5 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -36,6 +36,9 @@ static TCGv cpu_gpr[32], cpu_pc, cpu_vl;
 static TCGv_i64 cpu_fpr[32]; /* assume F and D extensions */
 static TCGv load_res;
 static TCGv load_val;
+/* globals for PM CSRs */
+static TCGv pm_mask[4];
+static TCGv pm_base[4];
 
 #include "exec/gen-icount.h"
 
@@ -64,6 +67,10 @@ typedef struct DisasContext {
 uint16_t vlen;
 uint16_t mlen;
 bool vl_eq_vlmax;
+/* PointerMasking extension */
+bool pm_enabled;
+TCGv pm_mask;
+TCGv pm_base;
 CPUState *cs;
 } DisasContext;
 
@@ -90,13 +97,19 @@ static void gen_nanbox_s(TCGv_i64 out, TCGv_i64 in)
 }
 
 /*
- * Temp stub: generates address adjustment for PointerMasking
+ * Generates address adjustment for PointerMasking
  */
 static void gen_pm_adjust_address(DisasContext *s,
   TCGv_i64  dst,
   TCGv_i64  src)
 {
-tcg_gen_mov_i64(dst, src);
+if (!s->pm_enabled) {
+/* Load unmodified address */
+tcg_gen_mov_i64(dst, src);
+} else {
+tcg_gen_andc_i64(dst, src, s->pm_mask);
+tcg_gen_or_i64(dst, dst, s->pm_base);
+}
 }
 
 /*
@@ -657,6 +670,10 @@ static void riscv_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
 ctx->lmul = FIELD_EX32(tb_flags, TB_FLAGS, LMUL);
 ctx->mlen = 1 << (ctx->sew  + 3 - ctx->lmul);
 ctx->vl_eq_vlmax = FIELD_EX32(tb_flags, TB_FLAGS, VL_EQ_VLMAX);
+ctx->pm_enabled = FIELD_EX32(tb_flags, TB_FLAGS, PM_ENABLED);
+int priv = cpu_mmu_index(env, false) & TB_FLAGS_PRIV_MMU_MASK;
+ctx->pm_mask = pm_mask[priv];
+ctx->pm_base = pm_base[priv];
 ctx->cs = cs;
 }
 
@@ -777,4 +794,19 @@ void riscv_translate_init(void)
  "load_res");
 load_val = tcg_global_mem_new(cpu_env, offsetof(CPURISCVState, load_val),
  "load_val");
+#ifndef CONFIG_USER_ONLY
+/* Assign PM CSRs to tcg globals */
+pm_mask[PRV_U] =
+  tcg_global_mem_new(cpu_env, offsetof(CPURISCVState, upmmask), "upmmask");
+pm_base[PRV_U] =
+  tcg_global_mem_new(cpu_env, offsetof(CPURISCVState, upmbase), "upmbase");
+pm_mask[PRV_S] =
+  tcg_global_mem_new(cpu_env, offsetof(CPURISCVState, spmmask), "spmmask");
+pm_base[PRV_S] =
+  tcg_global_mem_new(cpu_env, offsetof(CPURISCVState, spmbase), "spmbase");
+pm_mask[PRV_M] =
+  tcg_global_mem_new(cpu_env, offsetof(CPURISCVState, mpmmask), "mpmmask");
+pm_base[PRV_M] =
+  tcg_global_mem_new(cpu_env, offsetof(CPURISCVState, mpmbase), "mpmbase");
+#endif
 }
-- 
2.20.1
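
For readers following the series, the adjustment emitted by
gen_pm_adjust_address() above can be sketched on plain integers. This is an
illustration only, not code from the patch: pm_adjust_address() and
pm_enabled_for() are hypothetical helpers, the offsets are the ones defined in
patch 2/6, and the PRV_* values are assumed to match QEMU's usual privilege
level encoding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit layout taken from patch 2/6 of this series. */
#define PM_ENABLE   0x1ULL
#define U_OFFSET    2
#define S_OFFSET    4
#define M_OFFSET    6

/* Privilege levels; assumed to match QEMU's PRV_U/PRV_S/PRV_M values. */
enum { PRV_U = 0, PRV_S = 1, PRV_M = 3 };

/* Is pointer masking enabled in mmte for the given privilege level? */
static bool pm_enabled_for(uint64_t mmte, int priv)
{
    switch (priv) {
    case PRV_U:
        return (mmte >> U_OFFSET) & PM_ENABLE;
    case PRV_S:
        return (mmte >> S_OFFSET) & PM_ENABLE;
    case PRV_M:
        return (mmte >> M_OFFSET) & PM_ENABLE;
    default:
        return false;   /* unknown level: treat as disabled in this sketch */
    }
}

/*
 * Integer equivalent of the andc/or pair: clear the bits selected by
 * mask, then OR in base.  With masking disabled the address is unchanged.
 */
static uint64_t pm_adjust_address(uint64_t addr, uint64_t mask,
                                  uint64_t base, bool enabled)
{
    if (!enabled) {
        return addr;
    }
    return (addr & ~mask) | base;
}
```

With a mask covering the top byte, a tagged pointer loses its tag before the
access, which is the behaviour HWASAN-style tooling relies on.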




[PATCH v8 4/6] [RISCV_PM] Support pointer masking for RISC-V for i/c/f/d/a types of instructions

2021-04-27 Thread Alexey Baturo
Signed-off-by: Alexey Baturo 
Reviewed-by: Richard Henderson 
Reviewed-by: Alistair Francis 
---
 target/riscv/insn_trans/trans_rva.c.inc |  3 +++
 target/riscv/insn_trans/trans_rvd.c.inc |  2 ++
 target/riscv/insn_trans/trans_rvf.c.inc |  2 ++
 target/riscv/insn_trans/trans_rvi.c.inc |  2 ++
 target/riscv/translate.c| 10 ++
 5 files changed, 19 insertions(+)

diff --git a/target/riscv/insn_trans/trans_rva.c.inc b/target/riscv/insn_trans/trans_rva.c.inc
index be8a9f06dd..5559e347ba 100644
--- a/target/riscv/insn_trans/trans_rva.c.inc
+++ b/target/riscv/insn_trans/trans_rva.c.inc
@@ -26,6 +26,7 @@ static inline bool gen_lr(DisasContext *ctx, arg_atomic *a, MemOp mop)
 if (a->rl) {
 tcg_gen_mb(TCG_MO_ALL | TCG_BAR_STRL);
 }
+gen_pm_adjust_address(ctx, src1, src1);
 tcg_gen_qemu_ld_tl(load_val, src1, ctx->mem_idx, mop);
 if (a->aq) {
 tcg_gen_mb(TCG_MO_ALL | TCG_BAR_LDAQ);
@@ -46,6 +47,7 @@ static inline bool gen_sc(DisasContext *ctx, arg_atomic *a, MemOp mop)
 TCGLabel *l2 = gen_new_label();
 
 gen_get_gpr(src1, a->rs1);
+gen_pm_adjust_address(ctx, src1, src1);
 tcg_gen_brcond_tl(TCG_COND_NE, load_res, src1, l1);
 
 gen_get_gpr(src2, a->rs2);
@@ -91,6 +93,7 @@ static bool gen_amo(DisasContext *ctx, arg_atomic *a,
 gen_get_gpr(src1, a->rs1);
 gen_get_gpr(src2, a->rs2);
 
+gen_pm_adjust_address(ctx, src1, src1);
 (*func)(src2, src1, src2, ctx->mem_idx, mop);
 
 gen_set_gpr(a->rd, src2);
diff --git a/target/riscv/insn_trans/trans_rvd.c.inc b/target/riscv/insn_trans/trans_rvd.c.inc
index 4f832637fa..935342f66d 100644
--- a/target/riscv/insn_trans/trans_rvd.c.inc
+++ b/target/riscv/insn_trans/trans_rvd.c.inc
@@ -25,6 +25,7 @@ static bool trans_fld(DisasContext *ctx, arg_fld *a)
 TCGv t0 = tcg_temp_new();
 gen_get_gpr(t0, a->rs1);
 tcg_gen_addi_tl(t0, t0, a->imm);
+gen_pm_adjust_address(ctx, t0, t0);
 
 tcg_gen_qemu_ld_i64(cpu_fpr[a->rd], t0, ctx->mem_idx, MO_TEQ);
 
@@ -40,6 +41,7 @@ static bool trans_fsd(DisasContext *ctx, arg_fsd *a)
 TCGv t0 = tcg_temp_new();
 gen_get_gpr(t0, a->rs1);
 tcg_gen_addi_tl(t0, t0, a->imm);
+gen_pm_adjust_address(ctx, t0, t0);
 
 tcg_gen_qemu_st_i64(cpu_fpr[a->rs2], t0, ctx->mem_idx, MO_TEQ);
 
diff --git a/target/riscv/insn_trans/trans_rvf.c.inc b/target/riscv/insn_trans/trans_rvf.c.inc
index 3dfec8211d..04b3c3eb3d 100644
--- a/target/riscv/insn_trans/trans_rvf.c.inc
+++ b/target/riscv/insn_trans/trans_rvf.c.inc
@@ -30,6 +30,7 @@ static bool trans_flw(DisasContext *ctx, arg_flw *a)
 TCGv t0 = tcg_temp_new();
 gen_get_gpr(t0, a->rs1);
 tcg_gen_addi_tl(t0, t0, a->imm);
+gen_pm_adjust_address(ctx, t0, t0);
 
 tcg_gen_qemu_ld_i64(cpu_fpr[a->rd], t0, ctx->mem_idx, MO_TEUL);
 gen_nanbox_s(cpu_fpr[a->rd], cpu_fpr[a->rd]);
@@ -47,6 +48,7 @@ static bool trans_fsw(DisasContext *ctx, arg_fsw *a)
 gen_get_gpr(t0, a->rs1);
 
 tcg_gen_addi_tl(t0, t0, a->imm);
+gen_pm_adjust_address(ctx, t0, t0);
 
 tcg_gen_qemu_st_i64(cpu_fpr[a->rs2], t0, ctx->mem_idx, MO_TEUL);
 
diff --git a/target/riscv/insn_trans/trans_rvi.c.inc b/target/riscv/insn_trans/trans_rvi.c.inc
index d04ca0394c..bee7f6be46 100644
--- a/target/riscv/insn_trans/trans_rvi.c.inc
+++ b/target/riscv/insn_trans/trans_rvi.c.inc
@@ -141,6 +141,7 @@ static bool gen_load(DisasContext *ctx, arg_lb *a, MemOp memop)
 TCGv t1 = tcg_temp_new();
 gen_get_gpr(t0, a->rs1);
 tcg_gen_addi_tl(t0, t0, a->imm);
+gen_pm_adjust_address(ctx, t0, t0);
 
 tcg_gen_qemu_ld_tl(t1, t0, ctx->mem_idx, memop);
 gen_set_gpr(a->rd, t1);
@@ -180,6 +181,7 @@ static bool gen_store(DisasContext *ctx, arg_sb *a, MemOp memop)
 TCGv dat = tcg_temp_new();
 gen_get_gpr(t0, a->rs1);
 tcg_gen_addi_tl(t0, t0, a->imm);
+gen_pm_adjust_address(ctx, t0, t0);
 gen_get_gpr(dat, a->rs2);
 
 tcg_gen_qemu_st_tl(dat, t0, ctx->mem_idx, memop);
diff --git a/target/riscv/translate.c b/target/riscv/translate.c
index 2f9f5ccc62..2e815a5912 100644
--- a/target/riscv/translate.c
+++ b/target/riscv/translate.c
@@ -89,6 +89,16 @@ static void gen_nanbox_s(TCGv_i64 out, TCGv_i64 in)
 tcg_gen_ori_i64(out, in, MAKE_64BIT_MASK(32, 32));
 }
 
+/*
+ * Temp stub: generates address adjustment for PointerMasking
+ */
+static void gen_pm_adjust_address(DisasContext *s,
+  TCGv_i64  dst,
+  TCGv_i64  src)
+{
+tcg_gen_mov_i64(dst, src);
+}
+
 /*
  * A narrow n-bit operation, where n < FLEN, checks that input operands
  * are correctly Nan-boxed, i.e., all upper FLEN - n bits are 1.
-- 
2.20.1




[PATCH v8 3/6] [RISCV_PM] Print new PM CSRs in QEMU logs

2021-04-27 Thread Alexey Baturo
Signed-off-by: Alexey Baturo 
Reviewed-by: Richard Henderson 
Reviewed-by: Alistair Francis 
---
 target/riscv/cpu.c | 25 +
 1 file changed, 25 insertions(+)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index c04911ec05..0682410f5d 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -287,6 +287,31 @@ static void riscv_cpu_dump_state(CPUState *cs, FILE *f, int flags)
 qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "htval ", env->htval);
 qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "mtval2 ", env->mtval2);
 }
+if (riscv_has_ext(env, RVJ)) {
+qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "mmte", env->mmte);
+switch (env->priv) {
+case PRV_U:
+qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "upmbase ",
+ env->upmbase);
+qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "upmmask ",
+ env->upmmask);
+break;
+case PRV_S:
+qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "spmbase ",
+ env->spmbase);
+qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "spmmask ",
+ env->spmmask);
+break;
+case PRV_M:
+qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "mpmbase ",
+ env->mpmbase);
+qemu_fprintf(f, " %s " TARGET_FMT_lx "\n", "mpmmask ",
+ env->mpmmask);
+break;
+default:
+g_assert_not_reached();
+}
+}
 #endif
 
 for (i = 0; i < 32; i++) {
-- 
2.20.1
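
The dump code in this patch picks one base/mask CSR pair according to the
current privilege level. A minimal standalone sketch of that selection is
below. PMState, PMPair and pm_pair_for_priv() are hypothetical names used
only for illustration and are not part of the patch; the field names mirror
the env members printed above, and the PRV_* values follow the usual RISC-V
privilege encodings.

```c
#include <assert.h>
#include <stdint.h>

/* RISC-V privilege level encodings (U = 0, S = 1, M = 3). */
enum { PRV_U = 0, PRV_S = 1, PRV_M = 3 };

/* Hypothetical stand-in for the pointer-masking fields of the CPU state. */
typedef struct PMState {
    uint64_t upmbase, upmmask;
    uint64_t spmbase, spmmask;
    uint64_t mpmbase, mpmmask;
} PMState;

/* Hypothetical result type: the pair that applies at one privilege level. */
typedef struct PMPair {
    uint64_t base;
    uint64_t mask;
} PMPair;

/* Select the base/mask pair for the given privilege level. */
static PMPair pm_pair_for_priv(const PMState *env, int priv)
{
    PMPair p = { 0, 0 };

    switch (priv) {
    case PRV_U:
        p.base = env->upmbase;
        p.mask = env->upmmask;
        break;
    case PRV_S:
        p.base = env->spmbase;
        p.mask = env->spmmask;
        break;
    case PRV_M:
        p.base = env->mpmbase;
        p.mask = env->mpmmask;
        break;
    default:
        assert(0 && "unexpected privilege level");
    }
    return p;
}
```

This only shows which pair is chosen; how the pair is applied to an address
is left to the later patches in the series.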




[PATCH v8 1/6] [RISCV_PM] Add J-extension into RISC-V

2021-04-27 Thread Alexey Baturo
Signed-off-by: Alexey Baturo 
Reviewed-by: Richard Henderson 
Reviewed-by: Alistair Francis 
---
 target/riscv/cpu.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
index 0a33d387ba..0ea9fc65c8 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -72,6 +72,7 @@
 #define RVS RV('S')
 #define RVU RV('U')
 #define RVH RV('H')
+#define RVJ RV('J')
 
 /* S extension denotes that Supervisor mode exists, however it is possible
to have a core that support S mode but does not have an MMU and there
@@ -291,6 +292,7 @@ struct RISCVCPU {
 bool ext_s;
 bool ext_u;
 bool ext_h;
+bool ext_j;
 bool ext_v;
 bool ext_counters;
 bool ext_ifencei;
-- 
2.20.1




[Bug 1926044] Re: QEMU-user doesn't report HWCAP2_MTE

2021-04-27 Thread Richard Henderson
https://patchew.org/QEMU/20210427214108.88503-1-richard.hender...@linaro.org/

This has missed 6.0, but should be acceptable to roll into 6.0.1.

-- 
You received this bug notification because you are a member of
qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1926044

Title:
  QEMU-user doesn't report HWCAP2_MTE

Status in QEMU:
  In Progress

Bug description:
  Reproducible on ffa090bc56e73e287a63261e70ac02c0970be61a

  Host Debian 5.10.24 x86_64 GNU

  Configured with "configure --disable-system --enable-linux-user
  --static"

  This one works and prints "OK" as expected:
  clang tests/tcg/aarch64/mte-3.c -target aarch64-linux-gnu  -fsanitize=memtag -march=armv8+memtag
  qemu-aarch64 --cpu max -L /usr/aarch64-linux-gnu ./a.out && echo OK

  
  This one fails and prints "0":
  cat mytest.c
  #include <stdio.h>
  #include <sys/auxv.h>

  #ifndef HWCAP2_MTE
  #define HWCAP2_MTE (1 << 18)
  #endif

  int main(int ac, char **av)
  {
      printf("%d\n", (int)(getauxval(AT_HWCAP2) & HWCAP2_MTE));
  }

  
  clang mytest.c -target aarch64-linux-gnu  -fsanitize=memtag -march=armv8+memtag
  qemu-aarch64 --cpu max -L /usr/aarch64-linux-gnu ./a.out

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1926044/+subscriptions



Re: socket.c added support for unix domain socket datagram transport

2021-04-27 Thread Stefano Brivio
On Mon, 26 Apr 2021 13:56:40 +0100
Daniel P. Berrangé  wrote:

> On Fri, Apr 23, 2021 at 06:54:08PM +0200, Stefano Brivio wrote:
> > On Fri, 23 Apr 2021 17:21:38 +0100
> > Daniel P. Berrangé  wrote:  
> > > The current IP socket impl for the net socket backend uses SOCK_DGRAM,
> > > so from a consistency POV it feels sensible to do the same for UNIX
> > > sockets too.  
> > 
> > That's just for UDP though -- it also supports TCP with the "connect="
> > parameter, and given that a stream-oriented AF_UNIX socket behaves very
> > similarly, I recycled that parameter and just extended that bit of
> > documentation.
> >   
> > > Nonetheless, your last point in particular about wanting to know
> > > about disconnects feels valid, and if it's useful to you for UNIX
> > > sockets, then it ought to be useful for IP sockets too.
> > > 
> > > IOW, I wonder if  we should use DGRAM for UNIX sockets too by default
> > > to match current behaviour, but then also add a CLI option that allows
> > > choice of DGRAM vs STREAM, and wire that up for IP & UNIX sockets.  
> > 
> > The choice would only apply to AF_UNIX (that is, not to TCP and UDP).
> > 
> > The current situation isn't entirely consistent, because for TCP you
> > have "connect=", for UDP it's "udp=" or "mcast=", and I'm extending the
> > "connect=" case to support stream-oriented AF_UNIX, which I think is
> > consistent.
> > 
> > However, to have it symmetric for the datagram-oriented case
> > (UDP and AF_UNIX), ideally it should be changed to
> > "dgram=host:port|path" -- which I guess we can't do.
> > 
> > I see two alternatives:
> > 
> > 1.
> >   - "connect=" (TCP only)
> >   - "unix=path,type=dgram|stream"
> >   - "udp=" (UDP only)  
> 
> This doesn't work when you need the UNIX server to be a
> listener socket, as we've no way to express that, without
> adding yet another parameter.

Ah, right.

> > 2.
> >   - "connect=" (TCP and AF_UNIX stream)
> >   - "unix_dgram="
> >   - "udp=" (UDP only)  
> 
> Also needs
> 
>"listen=" (TCP and AF_UNIX stream)

Yes, I forgot about this here, but it's actually already in my patch
(see the changes to net_socket_listen_init() and documentation).

> "udp" has a corresponding optional "localaddr" for the sending
> address.
> 
> Also overloading "connect" means we have to parse the value
> to guess whether it's a UNIX path or an IP addr:port pair.
> 
> I doubt people will have UNIX paths called "127.0.0.1:"
> but if we can avoid such ambiguity by design, it is better.

Agreed... I didn't actually consider that.

> > The major thing I like of 2. is that we save some code and a further
> > option, but other than that I don't have a strong preference.  
> 
> The pain we're feeling is largely because the design of the net
> option syntax is one of the oldest parts of QEMU and has only
> been very partially improved upon. It is certainly not using
> QAPI best practice, if we look at this:
> 
>   { 'struct': 'NetdevSocketOptions',
> 'data': {
>   '*fd':'str',
>   '*listen':'str',
>   '*connect':   'str',
>   '*mcast': 'str',
>   '*localaddr': 'str',
>   '*udp':   'str' } }
> 
> Then some things come to mind
> 
>  - We're not providing a way to say what kind of "fd" is
>passed in - is it a UDP/TCP FD, is it a listener or
>client FD, is it unicast or multicast FD. Though QEMU
>can interrogate the socket to discover this I guess.

Some form of probing was already added in commit 894022e61601 ("net:
check if the file descriptor is valid before using it"). Does qemu need
to care, though, once the socket is connected? That is, what would it
do with that information?

>  - All of the properties there except "fd" are encoding two values
>in a single property - address + port. This is an anti-pattern
> 
>  - No support for ipv4=on|off and ipv6=on|off flags to control
>dual-stack usage.

I wonder if this needs to be explicit -- it might simply derive from
addressing.

>  - Redundancy of separate parameters for "mcast" and "udp" when
>it is distinguishable based on the address given AFAIR.

Strictly speaking, for IPv4, addresses are reserved for multicast usage
(by RFC 5771), but as far as I can tell, also other addresses could
legitimately be used for multicast. I've never seen that in practice
and it's unlikely to be of any use, though.

For IPv6, things seem to be defined more strictly (RFC 4291 and
updates).

All in all, yes, I guess inferring this from the address would make the
usage more practical.
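
As a rough illustration of inferring this from the address, the sketch below
classifies an address literal with the standard IN_MULTICAST() and
IN6_IS_ADDR_MULTICAST() macros. is_multicast_literal() is a hypothetical
helper name and not something in the QEMU tree, and it deliberately only
covers the reserved ranges (224.0.0.0/4 and ff00::/8).

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns 1 if the literal is an IPv4 or IPv6
 * multicast address, 0 if it is a valid non-multicast address, and -1 if
 * it does not parse as either family.
 */
static int is_multicast_literal(const char *s)
{
    struct in_addr a4;
    struct in6_addr a6;

    assert(s != NULL);

    if (inet_pton(AF_INET, s, &a4) == 1) {
        /* IN_MULTICAST() expects the address in host byte order. */
        return IN_MULTICAST(ntohl(a4.s_addr)) ? 1 : 0;
    }
    if (inet_pton(AF_INET6, s, &a6) == 1) {
        return IN6_IS_ADDR_MULTICAST(&a6) ? 1 : 0;
    }
    return -1;
}
```

A check like this would not catch the unusual case mentioned above, where an
address outside the reserved range is used for multicast anyway.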

>  - No support for UNIX sockets
> 
> 
> The "right" way to fix most of this long term is a radical change
> to introduce use of the SocketAddress struct.
> 
> I could envisage something like this
> 
>   { 'enum': 'NetdevSocketMode',
> 'data':  ['dgram', 'client', 'server'] }
> 
>   { 'struct': 'NetdevSocketOptions',
> 'data': {
>   'addr':  'SocketAddress',
>   '*localaddr': 'SocketAddress',
>   '*mode':  'NetdevSocketMode' } }
> 
> 

Re: socket.c added support for unix domain socket datagram transport

2021-04-27 Thread Stefano Brivio
On Mon, 26 Apr 2021 13:14:48 +0200
Ralph Schmieder  wrote:

> > On Apr 23, 2021, at 18:39, Stefano Brivio 
> > wrote:
> > 
> > [...]
> >
> > Okay, so it doesn't seem to fit your case, but this specific point
> > is where you actually have a small advantage using a stream-oriented
> > socket. If you receive a packet and have a smaller receive buffer,
> > you can read the length of the packet from the vnet header and then
> > read the rest of the packet at a later time.
> > 
> > With a datagram-oriented socket, you would need to know the maximum
> > packet size in advance, and use a receive buffer that's large
> > enough to contain it, because if you don't, you'll discard data.  
> 
> For me, the maximum packet size is a jumbo frame (e.g. 9x1024) anyway
> -- everything must fit into an atomic write of that size.

Well, the day you want to do some batching... ;) but sure, I see your
point.
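
To make the stream-socket point concrete, here is a small sketch of deciding
whether a whole packet has been buffered yet. It assumes a hypothetical
framing with a 4-byte big-endian length prefix in front of each packet,
which is a simplification for illustration and not the vnet header layout
discussed above; frame_ready() and FRAME_HDR_LEN are made-up names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical framing: 4-byte big-endian payload length, then payload. */
enum { FRAME_HDR_LEN = 4 };

/*
 * Inspect the bytes accumulated so far from a stream socket.
 * Returns 1 and stores the payload length when a complete frame is
 * buffered, or 0 when the caller should read more and try again later.
 */
static int frame_ready(const uint8_t *buf, size_t have, uint32_t *payload_len)
{
    uint32_t len;

    assert(buf != NULL && payload_len != NULL);

    if (have < FRAME_HDR_LEN) {
        return 0;               /* length prefix itself is incomplete */
    }
    len = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
          ((uint32_t)buf[2] << 8) | (uint32_t)buf[3];
    if (have - FRAME_HDR_LEN < len) {
        return 0;               /* payload not fully received yet */
    }
    *payload_len = len;
    return 1;
}
```

With a datagram socket there is no such "come back later" step: the receive
buffer has to be large enough for the biggest packet up front.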

> > [...]
> > 
> > On a side note, I wonder why you need two named sockets instead of
> > one -- I mean, they're bidirectional...  
> 
> Hmm... each peer needs to send unsolicited frames/packets to the
> other end... and thus needs to bind to their socket.  Pretty much for
> the same reason as the UDP transport requires you to specify a local
> and a remote 5-tuple.  Even though for AF_INET, the local port does
> not have to be specified, the OS would assign an ephemeral port to
> make it unique. Am I missing something?

I see your point now. Well, I think it's different from the AF_INET case
due to the way AF_UNIX works: UNIX domain sockets don't necessarily
need to make the endpoint known or visible, see a more detailed
explanation at:

https://comp.unix.admin.narkive.com/AhAOKP1s/lsof-find-both-endpoints-of-a-unix-socket

Even though, nowadays on Linux:

$ nc -luU my_path & (sleep 1; nc.openbsd -uU my_path & lsof +E -aUc nc)
[1] 373285
COMMAND      PID    USER   FD   TYPE     DEVICE SIZE/OFF    NODE NAME
nc        373285 sbrivio    3u  unix 0x4076431a      0t0 3957568 my_path type=DGRAM ->INO=3956394 373288,nc.openbs,4u
nc.openbs 373288 sbrivio    4u  unix 0xf5b2e2e1      0t0 3956394 /tmp/nc.C0whUu type=DGRAM ->INO=3957568 373285,nc,3u

for datagram sockets, the endpoint is exported, and lsof can report the
endpoint for "my_path" here (-luU binds to a UNIX domain datagram
socket, -uU connects to it). With a stream socket, by the way:

$ nc -lU my_path & (sleep 1; nc.openbsd -U my_path & lsof +E -aUc nc)
[1] 375445
COMMAND      PID    USER   FD   TYPE     DEVICE SIZE/OFF    NODE NAME
nc        375445 sbrivio    3u  unix 0x53abf57c      0t0 3969787 my_path type=STREAM
nc        375445 sbrivio    4u  unix 0x1960c1ef      0t0 3969788 my_path type=STREAM ->INO=3970624 375448,nc.openbs,3u
nc.openbs 375448 sbrivio    3u  unix 0x0538fa63      0t0 3970624 type=STREAM ->INO=3969788 375445,nc,4u

so I think it should be optional. Even with datagram sockets, just like
the example above (I'm not suggesting that you do this, it's just
another possible choice), only one peer needs to bind to a named
socket, and yet they can exchange data.
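
A minimal sketch of that arrangement, with only the receiving side bound to
a path, is below. dgram_one_way() is a hypothetical helper and the socket
path is whatever the caller passes in. It only demonstrates the direction
from the unbound sender to the bound receiver; as far as I understand, a
reply in the other direction would need the sender to have an address of
its own, which is why nc.openbsd binds to a temporary path in the lsof
output above.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Bind one AF_UNIX datagram socket to 'path', connect a second, unbound
 * socket to it, send 'msg' and receive it on the bound socket.
 * Returns the number of bytes received, or -1 on error.
 */
static ssize_t dgram_one_way(const char *path, const char *msg,
                             char *out, size_t outlen)
{
    struct sockaddr_un addr;
    ssize_t n = -1;
    int rx = -1, tx = -1;

    assert(path != NULL && msg != NULL && out != NULL);

    if (strlen(path) >= sizeof(addr.sun_path)) {
        return -1;
    }
    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);    /* length checked above */

    unlink(path);                   /* remove a stale socket file, if any */

    rx = socket(AF_UNIX, SOCK_DGRAM, 0);
    tx = socket(AF_UNIX, SOCK_DGRAM, 0);
    if (rx < 0 || tx < 0) {
        goto out;
    }
    /* Only the receiver gets a name in the filesystem. */
    if (bind(rx, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        goto out;
    }
    /* The sender stays unbound and just sets its default destination. */
    if (connect(tx, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        goto out;
    }
    if (send(tx, msg, strlen(msg), 0) < 0) {
        goto out;
    }
    n = recv(rx, out, outlen, 0);

out:
    if (rx >= 0) {
        close(rx);
    }
    if (tx >= 0) {
        close(tx);
    }
    unlink(path);
    return n;
}
```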

> Another thing: on Windows, there's a AF_UNIX/SOCK_STREAM
> implementation... So, technically it should be possible to use that
> code path on Windows, too.  Not a windows guy, though... So, can't
> say whether it would simply work or not:
> 
> https://devblogs.microsoft.com/commandline/af_unix-comes-to-windows/

Thanks for the pointer. I can't test this, so I wouldn't remove that
#ifndef, but perhaps I could add a link to this, in case somebody needs
it and stumbles upon this code path.

-- 
Stefano




Re: [PATCH v3 07/33] block/nbd: simplify waking of nbd_co_establish_connection()

2021-04-27 Thread Roman Kagan
On Fri, Apr 16, 2021 at 11:08:45AM +0300, Vladimir Sementsov-Ogievskiy wrote:
> Instead of connect_bh, bh_ctx and wait_connect fields we can live with
> only one link to waiting coroutine, protected by mutex.
> 
> So new logic is:
> 
> nbd_co_establish_connection() sets wait_co under the mutex, releases the
> mutex and yields. Note that wait_co may be scheduled by the thread
> immediately after unlocking the mutex. Still, in the main thread (or
> iothread) we'll not reach the code for entering the coroutine until the
> yield(), so we are safe.
> 
> Both connect_thread_func() and nbd_co_establish_connection_cancel() do
> the following to handle wait_co:
> 
> Under mutex, if thr->wait_co is not NULL, call aio_co_wake() (which
> never tries to acquire aio context since previous commit, so we are
> safe to do it under thr->mutex) and set thr->wait_co to NULL.
> This way we protect ourselves from scheduling it twice.
> 
> Also, this commit makes nbd_co_establish_connection() less connected to
> bs (we have a generic pointer to the coroutine and do not use
> s->connection_co directly). So, we are on the way to splitting the
> connection API out of nbd.c (which is overcomplicated now).
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  block/nbd.c | 49 +
>  1 file changed, 9 insertions(+), 40 deletions(-)
> 
> diff --git a/block/nbd.c b/block/nbd.c
> index d67556c7ee..e1f39eda6c 100644
> --- a/block/nbd.c
> +++ b/block/nbd.c
> @@ -87,12 +87,6 @@ typedef enum NBDConnectThreadState {
>  typedef struct NBDConnectThread {
>  /* Initialization constants */
>  SocketAddress *saddr; /* address to connect to */
> -/*
> - * Bottom half to schedule on completion. Scheduled only if bh_ctx is not
> - * NULL
> - */
> -QEMUBHFunc *bh_func;
> -void *bh_opaque;
>  
>  /*
>   * Result of last attempt. Valid in FAIL and SUCCESS states.
> @@ -101,10 +95,10 @@ typedef struct NBDConnectThread {
>  QIOChannelSocket *sioc;
>  Error *err;
>  
> -/* state and bh_ctx are protected by mutex */
>  QemuMutex mutex;
> +/* All further fields are protected by mutex */
>  NBDConnectThreadState state; /* current state of the thread */
> -AioContext *bh_ctx; /* where to schedule bh (NULL means don't schedule) */
> +Coroutine *wait_co; /* nbd_co_establish_connection() wait in yield() */
>  } NBDConnectThread;
>  
>  typedef struct BDRVNBDState {
> @@ -138,7 +132,6 @@ typedef struct BDRVNBDState {
>  char *x_dirty_bitmap;
>  bool alloc_depth;
>  
> -bool wait_connect;
>  NBDConnectThread *connect_thread;
>  } BDRVNBDState;
>  
> @@ -374,15 +367,6 @@ static bool nbd_client_connecting_wait(BDRVNBDState *s)
>  return qatomic_load_acquire(&s->state) == NBD_CLIENT_CONNECTING_WAIT;
>  }
>  
> -static void connect_bh(void *opaque)
> -{
> -BDRVNBDState *state = opaque;
> -
> -assert(state->wait_connect);
> -state->wait_connect = false;
> -aio_co_wake(state->connection_co);
> -}
> -
>  static void nbd_init_connect_thread(BDRVNBDState *s)
>  {
>  s->connect_thread = g_new(NBDConnectThread, 1);
> @@ -390,8 +374,6 @@ static void nbd_init_connect_thread(BDRVNBDState *s)
>  *s->connect_thread = (NBDConnectThread) {
>  .saddr = QAPI_CLONE(SocketAddress, s->saddr),
>  .state = CONNECT_THREAD_NONE,
> -.bh_func = connect_bh,
> -.bh_opaque = s,
>  };
>  
>  qemu_mutex_init(&s->connect_thread->mutex);
> @@ -429,11 +411,9 @@ static void *connect_thread_func(void *opaque)
>  switch (thr->state) {
>  case CONNECT_THREAD_RUNNING:
>  thr->state = ret < 0 ? CONNECT_THREAD_FAIL : CONNECT_THREAD_SUCCESS;
> -if (thr->bh_ctx) {
> -aio_bh_schedule_oneshot(thr->bh_ctx, thr->bh_func, thr->bh_opaque);
> -
> -/* play safe, don't reuse bh_ctx on further connection attempts */
> -thr->bh_ctx = NULL;
> +if (thr->wait_co) {
> +aio_co_schedule(NULL, thr->wait_co);
> +thr->wait_co = NULL;
>  }
>  break;
>  case CONNECT_THREAD_RUNNING_DETACHED:
> @@ -487,20 +467,14 @@ nbd_co_establish_connection(BlockDriverState *bs, Error **errp)
>  abort();
>  }
>  
> -thr->bh_ctx = qemu_get_current_aio_context();
> +thr->wait_co = qemu_coroutine_self();
>  
>  qemu_mutex_unlock(&thr->mutex);
>  
> -
>  /*
>   * We are going to wait for connect-thread finish, but
>   * nbd_client_co_drain_begin() can interrupt.
> - *
> - * Note that wait_connect variable is not visible for connect-thread. It
> - * doesn't need mutex protection, it used only inside home aio context of
> - * bs.
>   */
> -s->wait_connect = true;
>  qemu_coroutine_yield();
>  
>  qemu_mutex_lock(&thr->mutex);
> @@ -555,24 +529,19 @@ static void nbd_co_establish_connection_cancel(BlockDriverState *bs)
>  {
>  BDRVNBDState *s = bs->opaque;
>  NBDConnectThread *
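
The hand-off described in the commit message quoted above (take the mutex,
wake the waiter only if wait_co is set, then clear it so a second caller
finds nothing to wake) can be sketched outside QEMU as follows. This is an
illustration under stated assumptions and not the patch's code: a pthread
mutex stands in for QemuMutex, bumping a counter stands in for waking the
coroutine, and Waiter, ConnectThread, register_waiter() and
wake_waiter_once() are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the waiting coroutine. */
typedef struct Waiter {
    int wakeups;                /* how many times it has been woken */
} Waiter;

/* Hypothetical stand-in for the connect-thread state. */
typedef struct ConnectThread {
    pthread_mutex_t mutex;
    Waiter *wait_co;            /* protected by mutex; NULL means nobody waits */
} ConnectThread;

/* Called by the waiting side before it yields. */
static void register_waiter(ConnectThread *thr, Waiter *w)
{
    pthread_mutex_lock(&thr->mutex);
    assert(thr->wait_co == NULL);   /* at most one waiter at a time */
    thr->wait_co = w;
    pthread_mutex_unlock(&thr->mutex);
}

/*
 * Called by both the completion path and the cancel path.
 * Returns true if this call performed the wake-up.
 */
static bool wake_waiter_once(ConnectThread *thr)
{
    bool woke = false;

    pthread_mutex_lock(&thr->mutex);
    if (thr->wait_co) {
        thr->wait_co->wakeups++;    /* stands in for waking the coroutine */
        thr->wait_co = NULL;        /* later callers see nothing to wake */
        woke = true;
    }
    pthread_mutex_unlock(&thr->mutex);
    return woke;
}
```

Whichever of the two paths gets the mutex first does the wake-up, and the
other one becomes a no-op.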

[PATCH] linux-user/aarch64: Enable hwcap for RND, BTI, and MTE

2021-04-27 Thread Richard Henderson
These three features are already enabled by TCG, but are missing
their hwcap bits.  Update HWCAP2 from linux v5.12.

Cc: qemu-sta...@nongnu.org (for 6.0.1)
Buglink: https://bugs.launchpad.net/bugs/1926044
Signed-off-by: Richard Henderson 
---
 linux-user/elfload.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index c6731013fd..fc9c4f12be 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -586,6 +586,16 @@ enum {
 ARM_HWCAP2_A64_SVESM4   = 1 << 6,
 ARM_HWCAP2_A64_FLAGM2   = 1 << 7,
 ARM_HWCAP2_A64_FRINT= 1 << 8,
+ARM_HWCAP2_A64_SVEI8MM  = 1 << 9,
+ARM_HWCAP2_A64_SVEF32MM = 1 << 10,
+ARM_HWCAP2_A64_SVEF64MM = 1 << 11,
+ARM_HWCAP2_A64_SVEBF16  = 1 << 12,
+ARM_HWCAP2_A64_I8MM = 1 << 13,
+ARM_HWCAP2_A64_BF16 = 1 << 14,
+ARM_HWCAP2_A64_DGH  = 1 << 15,
+ARM_HWCAP2_A64_RNG  = 1 << 16,
+ARM_HWCAP2_A64_BTI  = 1 << 17,
+ARM_HWCAP2_A64_MTE  = 1 << 18,
 };
 
 #define ELF_HWCAP   get_elf_hwcap()
@@ -640,6 +650,9 @@ static uint32_t get_elf_hwcap2(void)
 GET_FEATURE_ID(aa64_dcpodp, ARM_HWCAP2_A64_DCPODP);
 GET_FEATURE_ID(aa64_condm_5, ARM_HWCAP2_A64_FLAGM2);
 GET_FEATURE_ID(aa64_frint, ARM_HWCAP2_A64_FRINT);
+GET_FEATURE_ID(aa64_rndr, ARM_HWCAP2_A64_RNG);
+GET_FEATURE_ID(aa64_bti, ARM_HWCAP2_A64_BTI);
+GET_FEATURE_ID(aa64_mte, ARM_HWCAP2_A64_MTE);
 
 return hwcaps;
 }
-- 
2.25.1
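
For anyone wanting to check the result from inside a guest program, a small
sketch of decoding the three bits is below. The bit positions are the ones
added in the enum above; the SKETCH_* macros and the hwcap2_has() and
query_hwcap2() helpers are hypothetical names for illustration, and the
decoding is kept separate from getauxval() so it can be exercised on any
host.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/auxv.h>

/* Bit positions as listed in the enum added by this patch. */
#define SKETCH_HWCAP2_RNG (1UL << 16)
#define SKETCH_HWCAP2_BTI (1UL << 17)
#define SKETCH_HWCAP2_MTE (1UL << 18)

/* Test a single capability bit in an AT_HWCAP2 value. */
static bool hwcap2_has(unsigned long hwcap2, unsigned long bit)
{
    assert(bit != 0 && (bit & (bit - 1)) == 0);   /* exactly one bit set */
    return (hwcap2 & bit) != 0;
}

/* Fetch the value the kernel (or qemu-user) put in the auxiliary vector. */
static unsigned long query_hwcap2(void)
{
    return getauxval(AT_HWCAP2);
}
```

Under "qemu-aarch64 --cpu max", hwcap2_has(query_hwcap2(), SKETCH_HWCAP2_MTE)
would be expected to return true once this patch is applied.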




Re: [PATCH v8 08/11] hw/core: deprecate old reset functions and introduce new ones

2021-04-27 Thread Eduardo Habkost
On Tue, Apr 27, 2021 at 02:21:28PM +0200, Philippe Mathieu-Daudé wrote:
> On 1/23/20 2:28 PM, Damien Hedde wrote:
> > Deprecate device_legacy_reset(), qdev_reset_all() and
> > qbus_reset_all() to be replaced by new functions
> > device_cold_reset() and bus_cold_reset() which uses resettable API.
> > 
> > Also introduce resettable_cold_reset_fn() which may be used as a
> > replacement for qdev_reset_all_fn and qbus_reset_all_fn().
> > 
> > Following patches will be needed to look at legacy reset call sites
> > and switch to resettable api. The legacy functions will be removed
> > when unused.
> > 
> > Signed-off-by: Damien Hedde 
> > Reviewed-by: Philippe Mathieu-Daudé 
> > Reviewed-by: Peter Maydell 
> > Reviewed-by: Richard Henderson 
> > Tested-by: Philippe Mathieu-Daudé 
> > ---
> >  include/hw/qdev-core.h  | 27 +++
> >  include/hw/resettable.h |  9 +
> >  hw/core/bus.c   |  5 +
> >  hw/core/qdev.c  |  5 +
> >  hw/core/resettable.c|  5 +
> >  5 files changed, 51 insertions(+)
> > 
> > diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
> > index 1b4b420617..b84fcc32bf 100644
> > --- a/include/hw/qdev-core.h
> > +++ b/include/hw/qdev-core.h
> > @@ -406,6 +406,13 @@ int qdev_walk_children(DeviceState *dev,
> > qdev_walkerfn *post_devfn, qbus_walkerfn *post_busfn,
> > void *opaque);
> >  
> > +/**
> > + * @qdev_reset_all:
> > + * Reset @dev. See @qbus_reset_all() for more details.
> > + *
> > + * Note: This function is deprecated and will be removed when it becomes unused.
> > + * Please use device_cold_reset() now.
> > + */
> >  void qdev_reset_all(DeviceState *dev);
> >  void qdev_reset_all_fn(void *opaque);
> >  
> > @@ -418,10 +425,28 @@ void qdev_reset_all_fn(void *opaque);
> >   * hard reset means that qbus_reset_all will reset all state of the device.
> >   * For PCI devices, for example, this will include the base address registers
> >   * or configuration space.
> > + *
> > + * Note: This function is deprecated and will be removed when it becomes unused.
> > + * Please use bus_cold_reset() now.
> 
> Some time has passed, so I'm looking at this in retrospect.
> 
> If there is an effort to introduce a new API replacing another one,
> we should try convert all the uses of the old API to the new one,
> instead of declaring it legacy.
> 
> Declaring an API legacy/deprecated should be the last resort if there
> is no way to remove it. I'd recommend to move the deprecated/legacy
> declarations in a separate header, with the '_legacy' suffix.
> 
> Else:
> 
> 1/ we never finish API conversions,
> 2/ the new API might not be ready for all the legacy API use cases,
> 3/ we end up having to maintain 2 different APIs.
> 
> 
> So the recommendation is to use bus_cold_reset(), but it isn't
> used anywhere...:
> 
> $ git grep bus_cold_reset
> docs/devel/reset.rst:64:- ``bus_cold_reset()``
> hw/core/bus.c:73:void bus_cold_reset(BusState *bus)
> include/hw/qdev-core.h:715: * Please use bus_cold_reset() now.
> include/hw/qdev-core.h:728: * bus_cold_reset:
> include/hw/qdev-core.h:733:void bus_cold_reset(BusState *bus);
> 
> IMHO we shouldn't add new public prototypes without callers.

I agree.  We should make at least some effort to convert code to
the new API, if only to serve as reference for people doing the
conversion.  I'm surprised that a new function was added more
than a year ago and nobody is using it.

What happened here?  Was there some plan to convert existing code
but it was abandoned?

> 
> I see it is similar to device_cold_reset(), but TBH I'm scared
> to be the first one using it.
> 
> Regards,
> 
> Phil.
> 
> >   */
> >  void qbus_reset_all(BusState *bus);
> >  void qbus_reset_all_fn(void *opaque);
> >  
> > +/**
> > + * device_cold_reset:
> > + * Reset device @dev and perform a recursive processing using the resettable
> > + * interface. It triggers a RESET_TYPE_COLD.
> > + */
> > +void device_cold_reset(DeviceState *dev);
> > +
> > +/**
> > + * bus_cold_reset:
> > + *
> > + * Reset bus @bus and perform a recursive processing using the resettable
> > + * interface. It triggers a RESET_TYPE_COLD.
> > + */
> > +void bus_cold_reset(BusState *bus);
> 

-- 
Eduardo




Re: [PATCH] i386/cpu: Remove the deprecated cpu model 'Icelake-Client'

2021-04-27 Thread Eduardo Habkost
On Thu, Apr 22, 2021 at 05:42:16PM +0800, Robert Hoo wrote:
> As it's been marked deprecated since v5.2, now I think it's time to remove it
> from code.
> 
> Signed-off-by: Robert Hoo 

Thanks!  There's only one issue: we need to update
docs/system/deprecated.rst and docs/system/removed-features.rst
when removing the CPU model.

-- 
Eduardo




Re: [PATCH] Deprecate pmem=on with non-DAX capable backend file

2021-04-27 Thread Eduardo Habkost
On Mon, Jan 11, 2021 at 03:33:32PM -0500, Igor Mammedov wrote:
> It is not safe to pretend that emulated NVDIMM supports
> persistence while backend actually failed to enable it
> and used non-persistent mapping as fall back.
> Instead of falling-back, QEMU should be more strict and
> error out with clear message that it's not supported.
> So if user asks for persistence (pmem=on), they should
> store backing file on NVDIMM.
> 
> Signed-off-by: Igor Mammedov 
> Reviewed-by: Philippe Mathieu-Daudé 

I'm queueing this for 6.1, after changing "since 6.0" to "since 6.1".

Sorry for letting it fall through the cracks.

-- 
Eduardo




Re: [PATCH] vmbus: Don't make QOM property registration conditional

2021-04-27 Thread Eduardo Habkost
On Sun, Apr 25, 2021 at 02:21:38PM +0200, Maciej S. Szmigiero wrote:
> On 11.10.2020 01:30, Maciej S. Szmigiero wrote:
> > On 09.10.2020 23:33, Eduardo Habkost wrote:
> > > On Fri, Oct 09, 2020 at 11:05:47PM +0200, Maciej S. Szmigiero wrote:
> > > > On 09.10.2020 22:07, Eduardo Habkost wrote:
> > > > > Having properties registered conditionally makes QOM type
> > > > > introspection difficult.  Instead of skipping registration of the
> > > > > "instanceid" property, always register the property but validate
> > > > > its value against the instance id required by the class.
> > > > > 
> > > > > Signed-off-by: Eduardo Habkost 
> > > > > ---
> > > > > Note: due to the lack of concrete vmbus-dev subclasses in the
> > > > > QEMU tree, this patch couldn't be tested.
> > > > 
> > > > Will test it tomorrow since I have a VMBus device implementation.
> > > 
> > > Thanks!
> > > 
> > 
> > Tested the patch with a hv-balloon device and it seems to work okay, so:
> > Acked-by: Maciej S. Szmigiero 
> > 
> 
> I see this patch wasn't picked up - it still makes sense and applies
> cleanly to the current git, so I think it should be picked up.

I'm queueing it for 6.1, thanks for the reminder!

-- 
Eduardo




[Bug 1893040] Re: External modules retreval using Go1.15 on s390x appears to have checksum and ECDSA verification issues

2021-04-27 Thread Ruixin Bao
Hello @davidhildenbrand, I have been looking into this bug recently. So
far, I noticed a few things:

1: As described in comment #5, I also had success building the
go file described in the reproducing steps in #4 using Ubuntu 20.04
with a recent qemu-system-s390x (I did it 1 - 2 weeks ago, so it is likely
qemu-6.0rc2 or rc3)

2: As described in comment #9, when qemu-user-static is used,
there are "ECDSA verification failure" errors. The failure occurs using
multiarch/qemu-user-static with qemu-s390x 6.0.0-rc3 statically built from
source and copied in when building the container

3: Debugging in a container has been really difficult for me, so I used
chroot and debootstrap to emulate a full s390x file system on a x86 host
and copy the qemu-s390x binary in. I find that I can still reproduce the
error similarly as the container. However, I also find that if I turn
the vector instruction off with vx=off and split the go command into
multiple steps, I am no longer able to reproduce the error. The reason
for splitting the commands is that it looks like go build first calls go
mod tidy, then calls go tool compile to compile the program. Through
experimentation, those appear to call some other binary so the vx=off is
dropped.

 Build steps 
root@skewered1:~/example.com/hello# ls
go.mod  hello.go
root@skewered1:~/example.com/hello# vim go.mod
root@skewered1:~/example.com/hello# ls
go.mod  hello.go
root@skewered1:~/example.com/hello# uname -a
Linux xxx (hidden) 5.4.0-72-generic #80-Ubuntu SMP Mon Apr 12 17:35:00 UTC 2021 s390x GNU/Linux
root@skewered1:~/example.com/hello# file /usr/bin/qemu-s390x-6.0rc5-static
/usr/bin/qemu-s390x-6.0rc5-static: ELF 64-bit LSB shared object, x86-64, version 1 (GNU/Linux), dynamically linked, BuildID[sha1]=28d90b247aa25813da5b24d07707863f089a78eb, for GNU/Linux 3.2.0, stripped
root@skewered1:~/example.com/hello# /usr/bin/qemu-s390x-6.0rc5-static --version
qemu-s390x version 5.2.95 (v6.0.0-rc5)
Copyright (c) 2003-2021 Fabrice Bellard and the QEMU Project developers
root@skewered1:~/example.com/hello#
root@skewered1:~/example.com/hello# go version
go version go1.15.11 linux/s390x
root@skewered1:~/example.com/hello#
root@skewered1:~/example.com/hello# which go
/usr/local/go/bin/go
root@skewered1:~/example.com/hello# /usr/bin/qemu-s390x-6.0rc5-static /usr/local/go/bin/go build .
go: finding module for package rsc.io/quote
hello.go:4:5: module rsc.io/quote: Get "https://proxy.golang.org/rsc.io/quote/@v/list": tls: invalid signature by the server certificate: ECDSA verification failure
root@skewered1:~/example.com/hello# /usr/bin/qemu-s390x-6.0rc5-static -cpu qemu,vx=off /usr/local/go/bin/go mod tidy
go: finding module for package rsc.io/quote
go: downloading rsc.io/quote v1.5.2
go: found rsc.io/quote in rsc.io/quote v1.5.2
go: downloading rsc.io/sampler v1.3.0
go: downloading golang.org/x/text v0.0.0-20170915032832-14c0d48ead0c
root@skewered1:~/example.com/hello# /usr/bin/qemu-s390x-6.0rc5-static -cpu qemu,vx=off /usr/local/go/bin/go build .
root@skewered1:~/example.com/hello# ls
go.mod  go.sum  hello  hello.go
root@skewered1:~/example.com/hello# file hello
hello: ELF 64-bit MSB executable, IBM S/390, version 1 (SYSV), statically linked, not stripped
root@skewered1:~/example.com/hello# ./hello
Hello, world.

4: The above findings make me think that there is some discrepancy
between vector instructions handling for qemu user mode vs system mode.
Additionally, running tests with vx=off in go/src/crypto/ecdsa will make
the test pass while without vx=off, there remains to be a problem.
Currently, I am looking into the go source code hoping to narrow down
the problem. It looks like the difference (between qemu-user and s390x
native host) happens during initTable() function at
crypto/elliptic/p256_s390x.go.

I hope the above findings make se

Re: [RFC PATCH] hw/s390x/ccw: Register qbus type in abstract TYPE_CCW_DEVICE parent

2021-04-27 Thread Eric Farman
On Sat, 2021-04-24 at 16:53 +0200, Philippe Mathieu-Daudé wrote:
> Instead of having all TYPE_CCW_DEVICE children set the bus type to
> TYPE_VIRTUAL_CSS_BUS, do it once in the abstract parent.
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
> RFC because I don't know these devices, maybe there is a reason
> for setting the bus type in the children (but it should be documented
> IMO).

I don't know the history behind it, but don't see an obvious reason for
doing it the current way. I sure do like the end result.

Acked-by: Eric Farman 

> ---
>  hw/s390x/ccw-device.h | 1 +
>  hw/s390x/3270-ccw.c   | 1 -
>  hw/s390x/ccw-device.c | 1 +
>  hw/s390x/s390-ccw.c   | 2 --
>  hw/s390x/virtio-ccw.c | 1 -
>  5 files changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/s390x/ccw-device.h b/hw/s390x/ccw-device.h
> index 832c78cd421..6dff95225df 100644
> --- a/hw/s390x/ccw-device.h
> +++ b/hw/s390x/ccw-device.h
> @@ -14,6 +14,7 @@
>  #include "qom/object.h"
>  #include "hw/qdev-core.h"
>  #include "hw/s390x/css.h"
> +#include "hw/s390x/css-bridge.h"
>  
>  struct CcwDevice {
>  DeviceState parent_obj;
> diff --git a/hw/s390x/3270-ccw.c b/hw/s390x/3270-ccw.c
> index f3e7342b1e8..0757af60632 100644
> --- a/hw/s390x/3270-ccw.c
> +++ b/hw/s390x/3270-ccw.c
> @@ -159,7 +159,6 @@ static void emulated_ccw_3270_class_init(ObjectClass *klass, void *data)
>  DeviceClass *dc = DEVICE_CLASS(klass);
>  
>  device_class_set_props(dc, emulated_ccw_3270_properties);
> -dc->bus_type = TYPE_VIRTUAL_CSS_BUS;
>  dc->realize = emulated_ccw_3270_realize;
>  dc->hotpluggable = false;
>  set_bit(DEVICE_CATEGORY_DISPLAY, dc->categories);
> diff --git a/hw/s390x/ccw-device.c b/hw/s390x/ccw-device.c
> index c9707110e9c..95f269ab441 100644
> --- a/hw/s390x/ccw-device.c
> +++ b/hw/s390x/ccw-device.c
> @@ -59,6 +59,7 @@ static void ccw_device_class_init(ObjectClass *klass, void *data)
>  k->refill_ids = ccw_device_refill_ids;
>  device_class_set_props(dc, ccw_device_properties);
>  dc->reset = ccw_device_reset;
> +dc->bus_type = TYPE_VIRTUAL_CSS_BUS;
>  }
>  
>  const VMStateDescription vmstate_ccw_dev = {
> diff --git a/hw/s390x/s390-ccw.c b/hw/s390x/s390-ccw.c
> index b497571863f..cb49f380a6b 100644
> --- a/hw/s390x/s390-ccw.c
> +++ b/hw/s390x/s390-ccw.c
> @@ -177,10 +177,8 @@ static void s390_ccw_instance_init(Object *obj)
>  
>  static void s390_ccw_class_init(ObjectClass *klass, void *data)
>  {
> -DeviceClass *dc = DEVICE_CLASS(klass);
>  S390CCWDeviceClass *cdc = S390_CCW_DEVICE_CLASS(klass);
>  
> -dc->bus_type = TYPE_VIRTUAL_CSS_BUS;
>  cdc->realize = s390_ccw_realize;
>  cdc->unrealize = s390_ccw_unrealize;
>  }
> diff --git a/hw/s390x/virtio-ccw.c b/hw/s390x/virtio-ccw.c
> index 8195f3546e4..71ec2bdcc31 100644
> --- a/hw/s390x/virtio-ccw.c
> +++ b/hw/s390x/virtio-ccw.c
> @@ -1235,7 +1235,6 @@ static void
> virtio_ccw_device_class_init(ObjectClass *klass, void *data)
>  k->unplug = virtio_ccw_busdev_unplug;
>  dc->realize = virtio_ccw_busdev_realize;
>  dc->unrealize = virtio_ccw_busdev_unrealize;
> -dc->bus_type = TYPE_VIRTUAL_CSS_BUS;
>  device_class_set_parent_reset(dc, virtio_ccw_reset, &vdc-
> >parent_reset);
>  }
>  




Re: [PATCH] target/mips: Migrate missing CPU fields

2021-04-27 Thread Philippe Mathieu-Daudé
On 4/24/21 12:00 AM, Philippe Mathieu-Daudé wrote:
> Add various missing fields to the CPU migration vmstate:
> 
> - CP0_VPControl & CP0_GlobalNumber  (01bc435b44b 2016-02-03)
> - CMGCRBase (c870e3f52ca 2016-03-15)
> - CP0_ErrCtl(0d74a222c27 2016-03-25)
> - MXU GPR[] & CR(eb5559f67dc 2018-10-18)
> - R5900 128-bit upper half  (a168a796e1c 2019-01-17)
> 
> This is a migration break.
> 
> Fixes: 01bc435b44b ("target-mips: implement R6 multi-threading")
> Fixes: c870e3f52ca ("target-mips: add CMGCRBase register")
> Fixes: 0d74a222c27 ("target-mips: make ITC Configuration Tags accessible to 
> the CPU")
> Fixes: eb5559f67dc ("target/mips: Introduce MXU registers")
> Fixes: a168a796e1c ("target/mips: Introduce 32 R5900 multimedia registers")
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  target/mips/machine.c | 21 +++--
>  1 file changed, 15 insertions(+), 6 deletions(-)

Thanks, applied to mips-next.



Re: [PATCH] target/mips: Remove spurious LOG_UNIMP of MTHC0 opcode

2021-04-27 Thread Philippe Mathieu-Daudé
On Thu, Apr 22, 2021 at 10:10 AM Philippe Mathieu-Daudé  wrote:
>
> When running with '-d unimp', every executed MTHC0 opcode
> is logged as unimplemented. Add the 'return' statement
> that has been missing since commit 5204ea79ea7.
>
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  target/mips/translate.c | 1 +
>  1 file changed, 1 insertion(+)

Thanks, applied to mips-next.



Re: [PATCH] target/mips: Add missing CP0 check to nanoMIPS RDPGPR / WRPGPR opcodes

2021-04-27 Thread Philippe Mathieu-Daudé
On 4/21/21 8:50 PM, Philippe Mathieu-Daudé wrote:
> Per the nanoMIPS32 Instruction Set Technical Reference Manual,
> Revision 01.01, Chapter 3. "Instruction Definitions":
> 
> The Read/Write Previous GPR opcodes "require CP0 privilege".
> 
> Add the missing CP0 checks.
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  target/mips/translate.c | 2 ++
>  1 file changed, 2 insertions(+)

Thanks, applied to mips-next.



Re: [PATCH] target/mips: Fix CACHEE opcode (CACHE using EVA addressing)

2021-04-27 Thread Philippe Mathieu-Daudé
On 4/20/21 7:54 PM, Philippe Mathieu-Daudé wrote:
> The CACHEE opcode "requires CP0 privilege".
> 
> The pseudocode check in the ISA manual is:
> 
> if is_eva and not C0.Config5.EVA:
>   raise exception('RI')
> 
> if not IsCoprocessor0Enabled():
>   raise coprocessor_exception(0)
> 
> Add the missing checks.
> 
> Inspired-by: Richard Henderson 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  target/mips/translate.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)

Thanks, applied to mips-next.
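
For readers following along, the ordering of the two checks quoted above can be sketched in plain C. This is only an illustration of the pseudocode, not the code in target/mips/translate.c; the struct, the enum and the function name are made up for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome codes, named after the exceptions in the pseudocode. */
enum cachee_result {
    CACHEE_OK,
    CACHEE_EXC_RI,  /* Reserved Instruction */
    CACHEE_EXC_CPU  /* coprocessor_exception(0) in the quoted pseudocode */
};

/* Hypothetical, minimal view of the CPU state the two checks look at. */
struct cpu_sketch {
    bool config5_eva; /* C0.Config5.EVA */
    bool cp0_enabled; /* IsCoprocessor0Enabled() */
};

static enum cachee_result check_cache_op(const struct cpu_sketch *cpu,
                                         bool is_eva)
{
    /* The EVA variant is a reserved instruction when EVA is not configured. */
    if (is_eva && !cpu->config5_eva) {
        return CACHEE_EXC_RI;
    }
    /* Both CACHE and CACHEE require CP0 privilege. */
    if (!cpu->cp0_enabled) {
        return CACHEE_EXC_CPU;
    }
    return CACHEE_OK;
}
```

The point of the ordering is that the RI check comes first, so an EVA opcode on a CPU without EVA raises RI even when CP0 is also unavailable.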



[RFC PATCH 2/2] hw/sparc: Allow building without the leon3 machine

2021-04-27 Thread Philippe Mathieu-Daudé
When building without the leon3 machine, we get this link failure:

  /usr/bin/ld: target_sparc_int32_helper.c.o: in function `leon3_irq_manager':
  target/sparc/int32_helper.c:172: undefined reference to `leon3_irq_ack'

This is because leon3_irq_ack() is defined in hw/sparc/leon3.c,
which is only built when CONFIG_LEON3 is selected.

Fix by moving the leon3_cache_control_int() / leon3_irq_manager()
(which are specific to the leon3 machine) to hw/sparc/leon3.c.
Move the trace events along (but don't rename them).

leon3_irq_ack() is now only used locally, so declare it static to
reduce its scope.

Signed-off-by: Philippe Mathieu-Daudé 
---
RFC: The problem is that we have hardware-specific code in the
architectural translation code. I wish there were a better
alternative than moving this code to hw/sparc/.
---
 target/sparc/cpu.h  |  6 --
 hw/sparc/leon3.c| 37 -
 target/sparc/int32_helper.c | 37 -
 hw/sparc/trace-events   |  2 ++
 target/sparc/trace-events   |  4 
 5 files changed, 38 insertions(+), 48 deletions(-)

diff --git a/target/sparc/cpu.h b/target/sparc/cpu.h
index 4b2290650be..ff8ae73002a 100644
--- a/target/sparc/cpu.h
+++ b/target/sparc/cpu.h
@@ -615,15 +615,9 @@ int cpu_cwp_inc(CPUSPARCState *env1, int cwp);
 int cpu_cwp_dec(CPUSPARCState *env1, int cwp);
 void cpu_set_cwp(CPUSPARCState *env1, int new_cwp);
 
-/* int_helper.c */
-void leon3_irq_manager(CPUSPARCState *env, void *irq_manager, int intno);
-
 /* sun4m.c, sun4u.c */
 void cpu_check_irqs(CPUSPARCState *env);
 
-/* leon3.c */
-void leon3_irq_ack(void *irq_manager, int intno);
-
 #if defined (TARGET_SPARC64)
 
 static inline int compare_masked(uint64_t x, uint64_t y, uint64_t mask)
diff --git a/hw/sparc/leon3.c b/hw/sparc/leon3.c
index 7e16eea9e67..98e3789cf84 100644
--- a/hw/sparc/leon3.c
+++ b/hw/sparc/leon3.c
@@ -137,7 +137,36 @@ static void main_cpu_reset(void *opaque)
 env->regbase[6] = s->sp;
 }
 
-void leon3_irq_ack(void *irq_manager, int intno)
+static void leon3_cache_control_int(CPUSPARCState *env)
+{
+uint32_t state = 0;
+
+if (env->cache_control & CACHE_CTRL_IF) {
+/* Instruction cache state */
+state = env->cache_control & CACHE_STATE_MASK;
+if (state == CACHE_ENABLED) {
+state = CACHE_FROZEN;
+trace_int_helper_icache_freeze();
+}
+
+env->cache_control &= ~CACHE_STATE_MASK;
+env->cache_control |= state;
+}
+
+if (env->cache_control & CACHE_CTRL_DF) {
+/* Data cache state */
+state = (env->cache_control >> 2) & CACHE_STATE_MASK;
+if (state == CACHE_ENABLED) {
+state = CACHE_FROZEN;
+trace_int_helper_dcache_freeze();
+}
+
+env->cache_control &= ~(CACHE_STATE_MASK << 2);
+env->cache_control |= (state << 2);
+}
+}
+
+static void leon3_irq_ack(void *irq_manager, int intno)
 {
 grlib_irqmp_ack((DeviceState *)irq_manager, intno);
 }
@@ -181,6 +210,12 @@ static void leon3_set_pil_in(void *opaque, int n, int level)
 }
 }
 
+static void leon3_irq_manager(CPUSPARCState *env, void *irq_manager, int intno)
+{
+leon3_irq_ack(irq_manager, intno);
+leon3_cache_control_int(env);
+}
+
 static void leon3_generic_hw_init(MachineState *machine)
 {
 ram_addr_t ram_size = machine->ram_size;
diff --git a/target/sparc/int32_helper.c b/target/sparc/int32_helper.c
index 817a463a179..d008dbdb65c 100644
--- a/target/sparc/int32_helper.c
+++ b/target/sparc/int32_helper.c
@@ -136,40 +136,3 @@ void sparc_cpu_do_interrupt(CPUState *cs)
 }
 #endif
 }
-
-#if !defined(CONFIG_USER_ONLY)
-static void leon3_cache_control_int(CPUSPARCState *env)
-{
-uint32_t state = 0;
-
-if (env->cache_control & CACHE_CTRL_IF) {
-/* Instruction cache state */
-state = env->cache_control & CACHE_STATE_MASK;
-if (state == CACHE_ENABLED) {
-state = CACHE_FROZEN;
-trace_int_helper_icache_freeze();
-}
-
-env->cache_control &= ~CACHE_STATE_MASK;
-env->cache_control |= state;
-}
-
-if (env->cache_control & CACHE_CTRL_DF) {
-/* Data cache state */
-state = (env->cache_control >> 2) & CACHE_STATE_MASK;
-if (state == CACHE_ENABLED) {
-state = CACHE_FROZEN;
-trace_int_helper_dcache_freeze();
-}
-
-env->cache_control &= ~(CACHE_STATE_MASK << 2);
-env->cache_control |= (state << 2);
-}
-}
-
-void leon3_irq_manager(CPUSPARCState *env, void *irq_manager, int intno)
-{
-leon3_irq_ack(irq_manager, intno);
-leon3_cache_control_int(env);
-}
-#endif
diff --git a/hw/sparc/trace-events b/hw/sparc/trace-events
index 355b07ae057..dfb53dc1a24 100644
--- a/hw/sparc/trace-events
+++ b/hw/sparc/trace-events
@@ -19,3 +19,5 @@ sun4m_iommu_bad_addr(uint64_t addr) "bad addr 0x%"PRIx64
 # leon3.c
 leon3_set_irq(int intno) "Set CPU IRQ %d"
 leon3_rese

[PATCH 1/2] hw/sparc: Allow building the leon3 machine stand-alone

2021-04-27 Thread Philippe Mathieu-Daudé
When building only the leon3 machine, we get this link failure:

  /usr/bin/ld: target_sparc_win_helper.c.o: in function `cpu_put_psr':
  target/sparc/win_helper.c:91: undefined reference to `cpu_check_irqs'

This is because cpu_check_irqs() is defined in hw/sparc/sun4m.c,
which is only built if the base sun4m machines are built (with
the CONFIG_SUN4M selector).

Fix by moving cpu_check_irqs() out of hw/sparc/sun4m.c and build
it unconditionally.

Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/sparc/irq.c   | 61 
 hw/sparc/sun4m.c | 32 ---
 hw/sparc/meson.build |  1 +
 3 files changed, 62 insertions(+), 32 deletions(-)
 create mode 100644 hw/sparc/irq.c

diff --git a/hw/sparc/irq.c b/hw/sparc/irq.c
new file mode 100644
index 000..e34639f266e
--- /dev/null
+++ b/hw/sparc/irq.c
@@ -0,0 +1,61 @@
+/*
+ * QEMU Sun4m & Sun4d & Sun4c IRQ handling
+ *
+ * Copyright (c) 2003-2005 Fabrice Bellard
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/main-loop.h"
+#include "hw/irq.h"
+#include "cpu.h"
+#include "trace.h"
+
+void cpu_check_irqs(CPUSPARCState *env)
+{
+CPUState *cs;
+
+/* We should be holding the BQL before we mess with IRQs */
+g_assert(qemu_mutex_iothread_locked());
+
+if (env->pil_in && (env->interrupt_index == 0 ||
+(env->interrupt_index & ~15) == TT_EXTINT)) {
+unsigned int i;
+
+for (i = 15; i > 0; i--) {
+if (env->pil_in & (1 << i)) {
+int old_interrupt = env->interrupt_index;
+
+env->interrupt_index = TT_EXTINT | i;
+if (old_interrupt != env->interrupt_index) {
+cs = env_cpu(env);
+trace_sun4m_cpu_interrupt(i);
+cpu_interrupt(cs, CPU_INTERRUPT_HARD);
+}
+break;
+}
+}
+} else if (!env->pil_in && (env->interrupt_index & ~15) == TT_EXTINT) {
+cs = env_cpu(env);
+trace_sun4m_cpu_reset_interrupt(env->interrupt_index & 15);
+env->interrupt_index = 0;
+cpu_reset_interrupt(cs, CPU_INTERRUPT_HARD);
+}
+}
diff --git a/hw/sparc/sun4m.c b/hw/sparc/sun4m.c
index 1a00816d9a8..2edf913d945 100644
--- a/hw/sparc/sun4m.c
+++ b/hw/sparc/sun4m.c
@@ -159,38 +159,6 @@ static void nvram_init(Nvram *nvram, uint8_t *macaddr,
 }
 }
 
-void cpu_check_irqs(CPUSPARCState *env)
-{
-CPUState *cs;
-
-/* We should be holding the BQL before we mess with IRQs */
-g_assert(qemu_mutex_iothread_locked());
-
-if (env->pil_in && (env->interrupt_index == 0 ||
-(env->interrupt_index & ~15) == TT_EXTINT)) {
-unsigned int i;
-
-for (i = 15; i > 0; i--) {
-if (env->pil_in & (1 << i)) {
-int old_interrupt = env->interrupt_index;
-
-env->interrupt_index = TT_EXTINT | i;
-if (old_interrupt != env->interrupt_index) {
-cs = env_cpu(env);
-trace_sun4m_cpu_interrupt(i);
-cpu_interrupt(cs, CPU_INTERRUPT_HARD);
-}
-break;
-}
-}
-} else if (!env->pil_in && (env->interrupt_index & ~15) == TT_EXTINT) {
-cs = env_cpu(env);
-trace_sun4m_cpu_reset_interrupt(env->interrupt_index & 15);
-env->interrupt_index = 0;
-cpu_reset_interrupt(cs, CPU_INTERRUPT_HARD);
-}
-}
-
 static void cpu_kick_irq(SPARCCPU *cpu)
 {
 CPUSPARCState *env = &cpu->env;
diff --git a/hw/sparc/meson.build b/hw/sparc/meson.build
index 19c442c90d9..470159ff659 100644
--- a/hw/sparc/meson.build
+++ b/hw/sparc/meson.build
@@ -1,4 +1,5 @@
 sparc_ss = ss.source_set()
+sparc_ss.add(files('irq.c'))
 sparc_ss.add(when: 'CONFIG_LEON3', if_true: files('leon3.c'))
 sparc_ss.add(when: 'CONFIG_SUN4M', if_true: files('su

[PATCH 0/2] hw/sparc: Kconfig fixes to build with/without the leon3 machine

2021-04-27 Thread Philippe Mathieu-Daudé
This series fixes link failures when building only the leon3
machine or only the sun4m ones.

The problem is we have hardware specific code in the architectural
translation code. Move this code to hw/sparc/.

The link failures can be reproduced doing:

  $ echo CONFIG_LEON3=y > default-configs/devices/sparc-softmmu.mak
  $ configure --without-default-devices
  $ ninja qemu-system-sparc
  $ ./qemu-system-sparc -M leon3 -S

or:

  $ echo CONFIG_SUN4M=y > default-configs/devices/sparc-softmmu.mak

Philippe Mathieu-Daudé (2):
  hw/sparc: Allow building the leon3 machine stand-alone
  hw/sparc: Allow building without the leon3 machine

 target/sparc/cpu.h  |  6 
 hw/sparc/irq.c  | 61 +
 hw/sparc/leon3.c| 37 +-
 hw/sparc/sun4m.c| 32 ---
 target/sparc/int32_helper.c | 37 --
 hw/sparc/meson.build|  1 +
 hw/sparc/trace-events   |  2 ++
 target/sparc/trace-events   |  4 ---
 8 files changed, 100 insertions(+), 80 deletions(-)
 create mode 100644 hw/sparc/irq.c

-- 
2.26.3




[RFC PATCH] target/mips: Allow building without Inter-Thread Communication hardware

2021-04-27 Thread Philippe Mathieu-Daudé
The Inter-Thread Communication unit (TYPE_MIPS_ITU) is an optional
device that is only selected by a few machines. However it goes
deep into the translation code, as the MTC0/MTHC0 SAAR helpers
call itc_reconfigure().

When building with no machine selecting the ITU component (which
is implemented in hw/misc/mips_itu.c), we get the following link
failure:

  /usr/bin/ld: target_mips_cp0_helper.c.o: in function `helper_mtc0_saar':
  target/mips/cp0_helper.c:1118: undefined reference to `itc_reconfigure'
  /usr/bin/ld: target_mips_cp0_helper.c.o: in function `helper_mthc0_saar':
  target/mips/cp0_helper.c:1135: undefined reference to `itc_reconfigure'

Fix by adding a stub, built when the ITU isn't selected.

Signed-off-by: Philippe Mathieu-Daudé 
---
RFC because this is too much Meson machinery for my taste.
But how else can we deal with such architectural devices?

To reproduce:

$ echo CONFIG_JAZZ=y > default-configs/devices/mips64el-softmmu.mak
$ echo CONFIG_SEMIHOSTING=y >> default-configs/devices/mips64el-softmmu.mak
$ configure --without-default-devices
$ ninja qemu-system-mips64el
$ ./qemu-system-mips64el -M magnum -S
---
 target/mips/cp0_itu-stub.c | 15 +++
 target/mips/meson.build|  3 +++
 2 files changed, 18 insertions(+)
 create mode 100644 target/mips/cp0_itu-stub.c

diff --git a/target/mips/cp0_itu-stub.c b/target/mips/cp0_itu-stub.c
new file mode 100644
index 000..995b5a09ff8
--- /dev/null
+++ b/target/mips/cp0_itu-stub.c
@@ -0,0 +1,15 @@
+/*
+ * QEMU Inter-Thread Communication Unit emulation stubs
+ *
+ *  Copyright (c) 2021 Philippe Mathieu-Daudé
+ *
+ * SPDX-License-Identifier: LGPL-2.1-or-later
+ */
+#include "qemu/osdep.h"
+#include "cpu.h"
+#include "hw/misc/mips_itu.h"
+
+void itc_reconfigure(MIPSITUState *tag)
+{
+/* nothing? */
+}
diff --git a/target/mips/meson.build b/target/mips/meson.build
index 3b131c4a7f6..a631688fae0 100644
--- a/target/mips/meson.build
+++ b/target/mips/meson.build
@@ -45,6 +45,9 @@
   'cp0_helper.c',
   'mips-semi.c',
 ))
+mips_softmmu_ss.add(when: 'CONFIG_MIPS_ITU', if_false: files(
+  'cp0_itu-stub.c',
+))
 
 mips_ss.add_all(when: 'CONFIG_TCG', if_true: [mips_tcg_ss])
 
-- 
2.26.3




Re: [PATCH] make vfio and DAX cache work together

2021-04-27 Thread Dr. David Alan Gilbert
* Alex Williamson (alex.william...@redhat.com) wrote:
> On Tue, 27 Apr 2021 17:29:37 +0100
> Dev Audsin  wrote:
> 
> > Hi Alex
> > 
> > Based on your comments and thinking a bit, wonder if it makes sense to
> > allow DMA map for the DAX cache but make unexpected mappings to be not
> > fatal. Please let me know your thoughts.
> 
> I think you're still working on the assumption that simply making the
> VM boot is an improvement, it's not.  If there's a risk that a possible
> DMA target for the device cannot be mapped, it's better that the VM
> fail to boot than to expose that risk.  Performance cannot compromise
> correctness.
> 
> We do allow DMA mappings to other device memory regions to fail
> non-fatally with the logic that peer-to-peer DMA is often not trusted
> to work by drivers and therefore support would be probed before
> assuming that it works.  I don't think that same logic applies here.
> 
> Is there something about the definition of this particular region that
> precludes it from being a DMA target for an assigned device?

It's never really the ram that's used.
This area is really a chunk of VMA that's mmap'd over by (chunks of)
normal files in the underlying exported filesystem.  The actual RAM
block itself is just a placeholder for the VMA, and is normally mapped
PROT_NONE until an actual file is mapped on top of it.
That cache bar is a mapping containing multiple separate file chunk
mappings.

So I guess the problems for VFIO are:
  a) At the start it's unmapped, inaccessible, unallocated RAM.
  b) Later it's arbitrary chunks of on-disk files.

[on a bad day, and it's bad even without vfio, someone truncates the
file mapping]
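
To make the layout described above concrete, here is a rough, self-contained sketch using plain POSIX calls. It is not virtiofsd or QEMU code; the function names, the four-page window size and the temporary file are assumptions made for the example. It reserves a window with PROT_NONE and later maps a file chunk over part of it with MAP_FIXED.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Reserve address space with no access rights: the empty "cache window". */
static void *reserve_window(size_t size)
{
    void *p = mmap(NULL, size, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Replace part of the window with a read-only mapping of a file chunk. */
static int map_chunk(void *window, size_t win_off, int fd, off_t file_off,
                     size_t len)
{
    void *p = mmap((char *)window + win_off, len, PROT_READ,
                   MAP_SHARED | MAP_FIXED, fd, file_off);
    return p == MAP_FAILED ? -1 : 0;
}

/* Returns the first file byte seen through the window, or -1 on failure. */
static int window_demo(void)
{
    long page_size = sysconf(_SC_PAGESIZE);
    char path[] = "/tmp/dax-sketch-XXXXXX"; /* hypothetical scratch file */
    size_t page;
    void *window;
    int byte = -1;
    int fd;

    assert(page_size > 0);
    page = (size_t)page_size;

    fd = mkstemp(path);
    if (fd < 0) {
        return -1;
    }
    unlink(path);

    if (ftruncate(fd, (off_t)page) == 0 && pwrite(fd, "Q", 1, 0) == 1) {
        window = reserve_window(4 * page);
        if (window) {
            /* Until this call, touching any byte of the window would fault. */
            if (map_chunk(window, page, fd, 0, page) == 0) {
                byte = ((unsigned char *)window)[page];
            }
            munmap(window, 4 * page);
        }
    }
    close(fd);
    return byte;
}
```

In this sketch only the second page of the window ever becomes backed by anything; the rest stays PROT_NONE for its whole lifetime, which corresponds to case (a) above.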

Dave

> Otherwise if it's initially unpopulated, maybe something like the
> RamDiscardManager could be used to insert DMA mappings as the region
> becomes populated.
> 
> Simply disabling mapping to boot with both features together, without
> analyzing how that missing mapping affects their interaction is not
> acceptable.  Thanks,
> 
> Alex
> 
> > On Mon, Apr 26, 2021 at 10:22 PM Alex Williamson
> >  wrote:
> > >
> > > On Mon, 26 Apr 2021 21:50:38 +0100
> > > Dev Audsin  wrote:
> > >  
> > > > Hi Alex and David
> > > >
> > > > @Alex:
> > > >
> > > > Justification on why this region cannot be a DMA target for the device,
> > > >
> > > > virtio-fs with DAX is currently not compatible with NIC Pass through.
> > > > When a SR-IOV VF attaches to a qemu process, vfio will try to pin the
> > > > entire DAX Window but it is empty when the guest boots and will fail.
> > > > A method to make VFIO and DAX to work together is to make vfio skip
> > > > DAX cache.
> > > >
> > > > Currently DAX cache need to be set to 0, for the SR-IOV VF to be
> > > > attached to Kata containers. Enabling both SR-IOV VF and DAX work
> > > > together will potentially improve performance for workloads which are
> > > > I/O and network intensive.  
> > >
> > > Sorry, there's no actual justification described here.  You're enabling
> > > a VM with both features, virtio-fs DAX and VFIO, but there's no
> > > evidence that they "work together" or that your use case is simply
> > > avoiding a scenario where the device might attempt to DMA into the area
> > > with this designation.  With this change, if the device were to attempt
> > > to DMA into this region, it would be blocked by the IOMMU, which might
> > > result in a data loss within the VM.  Justification of this change
> > > needs to prove that this region can never be a DMA target for the
> > > device, not simply that both features can be enabled and we hope that
> > > they don't interact.  Thanks,
> > >
> > > Alex
> > >  
> > 
> 
-- 
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




[PATCH] cirrus.yml: Fix the MSYS2 task

2021-04-27 Thread Thomas Huth
The MSYS2 task in the Cirrus-CI is currently failing with error messages
like this:

 warning: database file for 'ucrt64' does not exist (use '-Sy' to download)
 :: Starting core system upgrade...
  there is nothing to do
 :: Starting full system upgrade...
 error: failed to prepare transaction (could not find database)

Seems like it can be fixed by switching to a newer release and by refreshing
the database one more time after changing the /etc/pacman.conf file.

Signed-off-by: Thomas Huth 
---
 Here's a successful run after applying this patch:
 https://cirrus-ci.com/build/4918461810868224

 .cirrus.yml | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/.cirrus.yml b/.cirrus.yml
index f53c519447..f4bf49b704 100644
--- a/.cirrus.yml
+++ b/.cirrus.yml
@@ -67,7 +67,7 @@ windows_msys2_task:
 CIRRUS_SHELL: powershell
 MSYS: winsymlinks:nativestrict
 MSYSTEM: MINGW64
-MSYS2_URL: https://github.com/msys2/msys2-installer/releases/download/2021-01-05/msys2-base-x86_64-20210105.sfx.exe
+MSYS2_URL: https://github.com/msys2/msys2-installer/releases/download/2021-04-19/msys2-base-x86_64-20210419.sfx.exe
 MSYS2_FINGERPRINT: 0
 MSYS2_PACKAGES: "
   diffutils git grep make pkg-config sed
@@ -130,7 +130,7 @@ windows_msys2_task:
 taskkill /F /FI "MODULES eq msys-2.0.dll"
 tasklist
 C:\tools\msys64\usr\bin\bash.exe -lc "mv -f /etc/pacman.conf.pacnew /etc/pacman.conf || true"
-C:\tools\msys64\usr\bin\bash.exe -lc "pacman --noconfirm -Suu --overwrite=*"
+C:\tools\msys64\usr\bin\bash.exe -lc "pacman --noconfirm -Syuu --overwrite=*"
 Write-Output "Core install time taken: $((Get-Date).Subtract($start_time))"
 $start_time = Get-Date
 
-- 
2.27.0




Re: [PATCH v2 2/7] virtiofds: Changed allocations of iovec to GLib's functions

2021-04-27 Thread Dr. David Alan Gilbert
* Mahmoud Mandour (ma.mando...@gmail.com) wrote:
> On Tue, Apr 27, 2021 at 1:33 PM Dr. David Alan Gilbert 
> wrote:
> 
> > * Mahmoud Mandour (ma.mando...@gmail.com) wrote:
> > > On Tue, Apr 27, 2021 at 1:01 PM Dr. David Alan Gilbert <
> > dgilb...@redhat.com>
> > > wrote:
> > >
> > > > * Mahmoud Mandour (ma.mando...@gmail.com) wrote:
> > > > > On Tue, Apr 27, 2021 at 12:25 PM Dr. David Alan Gilbert <
> > > > dgilb...@redhat.com>
> > > > > wrote:
> > > > >
> > > > > > * Mahmoud Mandour (ma.mando...@gmail.com) wrote:
> > > > > > > Replaced the calls to malloc()/calloc() and their respective
> > > > > > > calls to free() of iovec structs with GLib's allocation and
> > > > > > > deallocation functions.
> > > > > > >
> > > > > > > Also, in one instance, used g_new0() instead of a calloc() call
> > plus
> > > > > > > a null-checking assertion.
> > > > > > >
> > > > > > > iovec structs were created locally and freed as the function
> > > > > > > ends. Hence, I used g_autofree and removed the respective calls
> > to
> > > > > > > free().
> > > > > > >
> > > > > > > In one instance, a struct fuse_ioctl_iovec pointer is returned
> > from a
> > > > > > > function, namely, fuse_ioctl_iovec_copy. There, I used
> > > > g_steal_pointer()
> > > > > > > in conjunction with g_autofree, this gives the ownership of the
> > > > pointer
> > > > > > > to the calling function and still auto-frees the memory when the
> > > > calling
> > > > > > > > function finishes (maintaining the semantics of previous code).
> > > > > > >
> > > > > > > Signed-off-by: Mahmoud Mandour 
> > > > > > > Reviewed-by: Stefan Hajnoczi 
> > > > > > > ---
> > > > > > >  tools/virtiofsd/fuse_lowlevel.c | 19 +++
> > > > > > >  tools/virtiofsd/fuse_virtio.c   |  6 +-
> > > > > > >  2 files changed, 8 insertions(+), 17 deletions(-)
> > > > > > >
> > > > > > > diff --git a/tools/virtiofsd/fuse_lowlevel.c
> > > > > > b/tools/virtiofsd/fuse_lowlevel.c
> > > > > > > index 812cef6ef6..f965299ad9 100644
> > > > > > > --- a/tools/virtiofsd/fuse_lowlevel.c
> > > > > > > +++ b/tools/virtiofsd/fuse_lowlevel.c
> > > > > > > @@ -217,9 +217,9 @@ static int send_reply(fuse_req_t req, int
> > error,
> > > > > > const void *arg,
> > > > > > >  int fuse_reply_iov(fuse_req_t req, const struct iovec *iov, int
> > > > count)
> > > > > > >  {
> > > > > > >  int res;
> > > > > > > -struct iovec *padded_iov;
> > > > > > > +g_autofree struct iovec *padded_iov;
> > > > > > >
> > > > > > > -padded_iov = malloc((count + 1) * sizeof(struct iovec));
> > > > > > > +padded_iov = g_try_new(struct iovec, count + 1);
> > > > > > >  if (padded_iov == NULL) {
> > > > > > >  return fuse_reply_err(req, ENOMEM);
> > > > > > >  }
> > > > > > > @@ -228,7 +228,6 @@ int fuse_reply_iov(fuse_req_t req, const
> > struct
> > > > > > iovec *iov, int count)
> > > > > > >  count++;
> > > > > > >
> > > > > > >  res = send_reply_iov(req, 0, padded_iov, count);
> > > > > > > -free(padded_iov);
> > > > > > >
> > > > > > >  return res;
> > > > > > >  }
> > > > > >
> > > > > > OK.
> > > > > >
> > > > > > > @@ -565,10 +564,10 @@ int fuse_reply_bmap(fuse_req_t req,
> > uint64_t
> > > > idx)
> > > > > > >  static struct fuse_ioctl_iovec *fuse_ioctl_iovec_copy(const
> > struct
> > > > > > iovec *iov,
> > > > > > >size_t
> > count)
> > > > > > >  {
> > > > > > > -struct fuse_ioctl_iovec *fiov;
> > > > > > > +g_autofree struct fuse_ioctl_iovec *fiov;
> > > > > > >  size_t i;
> > > > > > >
> > > > > > > -fiov = malloc(sizeof(fiov[0]) * count);
> > > > > > > +fiov = g_try_new(fuse_ioctl_iovec, count);
> > > > > > >  if (!fiov) {
> > > > > > >  return NULL;
> > > > > > >  }
> > > > > > > @@ -578,7 +577,7 @@ static struct fuse_ioctl_iovec
> > > > > > *fuse_ioctl_iovec_copy(const struct iovec *iov,
> > > > > > >  fiov[i].len = iov[i].iov_len;
> > > > > > >  }
> > > > > > >
> > > > > > > -return fiov;
> > > > > > > +return g_steal_pointer(&fiov);
> > > > > > >  }
> > > > > >
> > > > > > This is OK, but doesn't gain anything - marking it as
> > g_autofree'ing
> > > > and
> > > > > > always stealing is no benefit.
> > > > > >
> > > > > > >
> > > > > > >  int fuse_reply_ioctl_retry(fuse_req_t req, const struct iovec
> > > > *in_iov,
> > > > > > > @@ -629,9 +628,6 @@ int fuse_reply_ioctl_retry(fuse_req_t req,
> > const
> > > > > > struct iovec *in_iov,
> > > > > > >
> > > > > > >  res = send_reply_iov(req, 0, iov, count);
> > > > > > >  out:
> > > > > > > -free(in_fiov);
> > > > > > > -free(out_fiov);
> > > > > > > -
> > > > > >
> > > > > > I don't think you can do that - I think you're relying here on the
> > > > > > > g_autofree from fuse_ioctl_iovec_copy - but my understanding is
> > that
> > > > > > doesn't work; g_autofree is scoped, so it's designed to free at
> > the end
> > > > > > of fuse_ioctl_iovec_copy, fuse_reply_ioctl_retry doe
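
The scoping point raised in the review comments above can be tried out without GLib. The sketch below assumes GCC or Clang and uses the compiler's cleanup attribute, which is the mechanism g_autofree is built on; AUTOFREE, auto_free(), steal() and the counter are stand-ins invented for the example, not GLib API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static int auto_frees; /* how many times the scope-exit cleanup freed memory */

/* Cleanup handler: receives the address of the annotated pointer variable. */
static void auto_free(void *pp)
{
    void *p;

    memcpy(&p, pp, sizeof(p));
    if (p) {
        free(p);
        auto_frees++;
    }
}

/* Stand-in for g_autofree: run auto_free() when the variable leaves scope. */
#define AUTOFREE __attribute__((cleanup(auto_free)))

/* Stand-in for g_steal_pointer(): return the pointer and NULL the variable. */
static void *steal(void *pp)
{
    void *p;
    void *null = NULL;

    memcpy(&p, pp, sizeof(p));
    memcpy(pp, &null, sizeof(null));
    return p;
}

/* The cleanup runs when *this* function returns, not in the caller. */
static void scoped_demo(void)
{
    AUTOFREE char *tmp = malloc(16);

    if (tmp) {
        tmp[0] = '\0';
    }
} /* tmp is freed here */

/* After stealing, the cleanup sees NULL; the caller owns a plain pointer. */
static int *make_buf(void)
{
    AUTOFREE int *buf = malloc(4 * sizeof(*buf));

    if (!buf) {
        return NULL;
    }
    buf[0] = 42;
    return steal(&buf);
}
```

A caller that receives the result of make_buf() holds an ordinary pointer with no cleanup attached, so it still has to free it explicitly (or declare its own variable with the attribute).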

Re: [PATCH RFC C0/2] support allocation-map for block-dirty-bitmap-merge

2021-04-27 Thread John Snow

On 4/27/21 7:11 AM, Vladimir Sementsov-Ogievskiy wrote:

Hi all!

It's a simpler alternative for
"[PATCH v4 0/5] block: add block-dirty-bitmap-populate job"
   <20200902181831.2570048-1-ebl...@redhat.com>
   https://lists.gnu.org/archive/html/qemu-devel/2020-09/msg00978.html
   https://patchew.org/QEMU/20200902181831.2570048-1-ebl...@redhat.com/

Since we have "coroutine: true" feature for qmp commands, I think,
maybe we can merge allocation status to bitmap without bothering with
new block-job?

It's an RFC:

1. Main question: is it OK as a simple blocking command, even in a
coroutine mode. It's a lot simpler, and it can be simply used in a
transaction with other bitmap commands.



Hm, possibly... I did not follow the discussion of coroutine QMP 
commands closely to know what the qualifying criteria to use them are.


(Any wisdom for me here, Markus?)


2. Transaction support is not here now. Will add in future version, if
general approach is OK.



That should be alright, I think. It means that the operation needs to 
succeed before the transaction returns success, though.


Depending on what else is in the transaction, do we run the risk of a 
deadlock if we need to wait for a coroutine to finish?



3. I just do bdrv_co_enter() / bdrv_co_leave() like it is done in the
only coroutine qmp command - block_resize(). I'm not sure how correct
that is.



See above concern!


4. I don't do any "drain". I think it's not needed, as intended usage
is to merge block-status to _active_ bitmap. So all concurrent
operations will just increase the dirtiness of the bitmap, and that is OK.



That sounds fine for individual usage, but I can't convince myself it's 
safe for transactions.



5. Probably we still need to create some BdrvChild to avoid node resize
during the loop of block-status querying.



I'm less sure that it's OK to cause temporary graph changes during the 
course of a blocking QMP function... but maybe that's OK?


Peter Krempa is the expert to consult on that one.


6. Test is mostly copied from parallels-read-bitmap, I'll refactor it in
next version to avoid copy-paste.

7. Patch 01 should probably be split into 2-3 patches.

Vladimir Sementsov-Ogievskiy (2):
   qapi: block-dirty-bitmap-merge: support allocation maps
   iotests: add allocation-map-to-bitmap

  qapi/block-core.json  | 31 -
  include/block/block_int.h |  4 ++
  block/dirty-bitmap.c  | 42 
  block/monitor/bitmap-qmp-cmds.c   | 55 +---
  .../tests/allocation-map-to-bitmap| 64 +++
  .../tests/allocation-map-to-bitmap.out|  9 +++
  6 files changed, 195 insertions(+), 10 deletions(-)
  create mode 100755 tests/qemu-iotests/tests/allocation-map-to-bitmap
  create mode 100644 tests/qemu-iotests/tests/allocation-map-to-bitmap.out






Re: [PATCH v2 2/7] virtiofds: Changed allocations of iovec to GLib's functions

2021-04-27 Thread Mahmoud Mandour
On Tue, Apr 27, 2021 at 1:33 PM Dr. David Alan Gilbert 
wrote:

> * Mahmoud Mandour (ma.mando...@gmail.com) wrote:
> > On Tue, Apr 27, 2021 at 1:01 PM Dr. David Alan Gilbert <
> dgilb...@redhat.com>
> > wrote:
> >
> > > * Mahmoud Mandour (ma.mando...@gmail.com) wrote:
> > > > On Tue, Apr 27, 2021 at 12:25 PM Dr. David Alan Gilbert <
> > > dgilb...@redhat.com>
> > > > wrote:
> > > >
> > > > > * Mahmoud Mandour (ma.mando...@gmail.com) wrote:
> > > > > > Replaced the calls to malloc()/calloc() and their respective
> > > > > > calls to free() of iovec structs with GLib's allocation and
> > > > > > deallocation functions.
> > > > > >
> > > > > > Also, in one instance, used g_new0() instead of a calloc() call
> plus
> > > > > > a null-checking assertion.
> > > > > >
> > > > > > iovec structs were created locally and freed as the function
> > > > > > ends. Hence, I used g_autofree and removed the respective calls
> to
> > > > > > free().
> > > > > >
> > > > > > In one instance, a struct fuse_ioctl_iovec pointer is returned
> from a
> > > > > > function, namely, fuse_ioctl_iovec_copy. There, I used
> > > g_steal_pointer()
> > > > > > in conjunction with g_autofree, this gives the ownership of the
> > > pointer
> > > > > > to the calling function and still auto-frees the memory when the
> > > calling
> > > > > > > function finishes (maintaining the semantics of previous code).
> > > > > >
> > > > > > Signed-off-by: Mahmoud Mandour 
> > > > > > Reviewed-by: Stefan Hajnoczi 
> > > > > > ---
> > > > > >  tools/virtiofsd/fuse_lowlevel.c | 19 +++
> > > > > >  tools/virtiofsd/fuse_virtio.c   |  6 +-
> > > > > >  2 files changed, 8 insertions(+), 17 deletions(-)
> > > > > >
> > > > > > diff --git a/tools/virtiofsd/fuse_lowlevel.c
> > > > > b/tools/virtiofsd/fuse_lowlevel.c
> > > > > > index 812cef6ef6..f965299ad9 100644
> > > > > > --- a/tools/virtiofsd/fuse_lowlevel.c
> > > > > > +++ b/tools/virtiofsd/fuse_lowlevel.c
> > > > > > @@ -217,9 +217,9 @@ static int send_reply(fuse_req_t req, int
> error,
> > > > > const void *arg,
> > > > > >  int fuse_reply_iov(fuse_req_t req, const struct iovec *iov, int
> > > count)
> > > > > >  {
> > > > > >  int res;
> > > > > > -struct iovec *padded_iov;
> > > > > > +g_autofree struct iovec *padded_iov;
> > > > > >
> > > > > > -padded_iov = malloc((count + 1) * sizeof(struct iovec));
> > > > > > +padded_iov = g_try_new(struct iovec, count + 1);
> > > > > >  if (padded_iov == NULL) {
> > > > > >  return fuse_reply_err(req, ENOMEM);
> > > > > >  }
> > > > > > @@ -228,7 +228,6 @@ int fuse_reply_iov(fuse_req_t req, const
> struct
> > > > > iovec *iov, int count)
> > > > > >  count++;
> > > > > >
> > > > > >  res = send_reply_iov(req, 0, padded_iov, count);
> > > > > > -free(padded_iov);
> > > > > >
> > > > > >  return res;
> > > > > >  }
> > > > >
> > > > > OK.
> > > > >
> > > > > > @@ -565,10 +564,10 @@ int fuse_reply_bmap(fuse_req_t req,
> uint64_t
> > > idx)
> > > > > >  static struct fuse_ioctl_iovec *fuse_ioctl_iovec_copy(const
> struct
> > > > > iovec *iov,
> > > > > >size_t
> count)
> > > > > >  {
> > > > > > -struct fuse_ioctl_iovec *fiov;
> > > > > > +g_autofree struct fuse_ioctl_iovec *fiov;
> > > > > >  size_t i;
> > > > > >
> > > > > > -fiov = malloc(sizeof(fiov[0]) * count);
> > > > > > +fiov = g_try_new(fuse_ioctl_iovec, count);
> > > > > >  if (!fiov) {
> > > > > >  return NULL;
> > > > > >  }
> > > > > > @@ -578,7 +577,7 @@ static struct fuse_ioctl_iovec
> > > > > *fuse_ioctl_iovec_copy(const struct iovec *iov,
> > > > > >  fiov[i].len = iov[i].iov_len;
> > > > > >  }
> > > > > >
> > > > > > -return fiov;
> > > > > > +return g_steal_pointer(&fiov);
> > > > > >  }
> > > > >
> > > > > > This is OK, but doesn't gain anything - marking it as
> > > > > > g_autofree'ing and always stealing is no benefit.
> > > > >
> > > > > >
> > > > > >  int fuse_reply_ioctl_retry(fuse_req_t req, const struct iovec
> > > *in_iov,
> > > > > > @@ -629,9 +628,6 @@ int fuse_reply_ioctl_retry(fuse_req_t req,
> const
> > > > > struct iovec *in_iov,
> > > > > >
> > > > > >  res = send_reply_iov(req, 0, iov, count);
> > > > > >  out:
> > > > > > -free(in_fiov);
> > > > > > -free(out_fiov);
> > > > > > -
> > > > >
> > > > > > I don't think you can do that - I think you're relying here on the
> > > > > > g_autofree from fuse_ioctl_iovec_copy - but my understanding is
> > > > > > that doesn't work; g_autofree is scoped, so it's designed to free
> > > > > > at the end of fuse_ioctl_iovec_copy. fuse_reply_ioctl_retry doesn't
> > > > > > know that in_fiov and out_fiov were allocated that way, so they
> > > > > > won't get autocleaned up.
> > > > >
> > > > >
> > > > > In GLib's documentation, it is clarified (w.r.t. g_autoptr but I think
> > > > > similar logic applies to g_autofree)
> > > > that g_steal_pointer() "This can be very useful when c

Re: [PATCH] make vfio and DAX cache work together

2021-04-27 Thread Alex Williamson
On Tue, 27 Apr 2021 17:29:37 +0100
Dev Audsin  wrote:

> Hi Alex
> 
> Based on your comments and after thinking a bit, I wonder if it makes
> sense to allow DMA mapping for the DAX cache but make unexpected mapping
> failures non-fatal. Please let me know your thoughts.

I think you're still working on the assumption that simply making the
VM boot is an improvement; it's not.  If there's a risk that a possible
DMA target for the device cannot be mapped, it's better that the VM
fail to boot than to expose that risk.  Performance cannot compromise
correctness.

We do allow DMA mappings to other device memory regions to fail
non-fatally with the logic that peer-to-peer DMA is often not trusted
to work by drivers and therefore support would be probed before
assuming that it works.  I don't think that same logic applies here.

Is there something about the definition of this particular region that
precludes it from being a DMA target for an assigned device?

Otherwise if it's initially unpopulated, maybe something like the
RamDiscardManager could be used to insert DMA mappings as the region
becomes populated.

Simply disabling the mapping in order to boot with both features together,
without analyzing how that missing mapping affects their interaction, is
not acceptable.  Thanks,

Alex

> On Mon, Apr 26, 2021 at 10:22 PM Alex Williamson
>  wrote:
> >
> > On Mon, 26 Apr 2021 21:50:38 +0100
> > Dev Audsin  wrote:
> >  
> > > Hi Alex and David
> > >
> > > @Alex:
> > >
> > > Justification for why this region cannot be a DMA target for the device:
> > >
> > > virtio-fs with DAX is currently not compatible with NIC passthrough.
> > > When an SR-IOV VF attaches to a QEMU process, vfio will try to pin the
> > > entire DAX window, but it is empty when the guest boots, so the pinning
> > > will fail. One way to make VFIO and DAX work together is to make vfio
> > > skip the DAX cache.
> > >
> > > Currently the DAX cache needs to be set to 0 for the SR-IOV VF to be
> > > attached to Kata containers. Enabling SR-IOV VF and DAX to work
> > > together will potentially improve performance for workloads which are
> > > I/O and network intensive.  
> >
> > Sorry, there's no actual justification described here.  You're enabling
> > a VM with both features, virtio-fs DAX and VFIO, but there's no
> > evidence that they "work together" or that your use case is simply
> > avoiding a scenario where the device might attempt to DMA into the area
> > with this designation.  With this change, if the device were to attempt
> > to DMA into this region, it would be blocked by the IOMMU, which might
> > result in a data loss within the VM.  Justification of this change
> > needs to prove that this region can never be a DMA target for the
> > device, not simply that both features can be enabled and we hope that
> > they don't interact.  Thanks,
> >
> > Alex
> >  
> 




[PATCH v3 2/7] virtiofsd: Changed allocations of iovec to GLib's functions

2021-04-27 Thread Mahmoud Mandour
Replaced the calls to malloc()/calloc() and their respective
calls to free() of iovec structs with GLib's allocation and
deallocation functions and used g_autofree when appropriate.

Changed the allocation of in_sg_cpy to use g_new() instead of a call
to calloc() and a null-checking assertion. Not g_new0(),
because the buffer is immediately overwritten using memcpy.

Signed-off-by: Mahmoud Mandour 
---
v2 -> v3:
* Removed a wrongful combination of g_autofree and g_steal_pointer().
* Removed some goto paths that IMHO were not so useful any more.
* In v2, I allocated in_sg_cpy through g_new0(). In this patch, I use
  g_new() because the buffer is memcpy'd into right away, so there is
  no need to zero-initialize.
* Moved the declaration of in_sg_cpy to the top of the function
  to match QEMU's style guidelines. 
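
For readers following the g_autofree discussion, here is a minimal sketch
of the scoping rule, assuming a GCC/Clang toolchain. It does not use GLib;
it uses the compiler's cleanup attribute directly, which is the mechanism
g_autofree is built on. The names auto_free, AUTOFREE, steal, make_buf and
use_buf are hypothetical and exist only for this illustration; steal()
stands in for g_steal_pointer().

```c
#include <stdlib.h>

static int frees; /* counts cleanup-triggered frees, for illustration only */

/* Cleanup handler: the compiler passes a pointer to the annotated variable. */
static void auto_free(void *p)
{
    void **pp = p;

    if (*pp) {
        free(*pp);
        frees++;
    }
}

/* Hypothetical stand-in for g_autofree. */
#define AUTOFREE __attribute__((cleanup(auto_free)))

/* Hypothetical stand-in for g_steal_pointer(): return the pointer, NULL the source. */
static void *steal(void *p)
{
    void **pp = p;
    void *ret = *pp;

    *pp = NULL;
    return ret;
}

static int *make_buf(size_t n)
{
    AUTOFREE int *buf = calloc(n, sizeof(*buf));

    if (!buf) {
        return NULL;
    }
    /* After steal(), buf is NULL, so the cleanup at scope exit frees nothing. */
    return steal(&buf);
}

static int use_buf(void)
{
    /* The caller has to opt in to cleanup itself; nothing carries over. */
    AUTOFREE int *buf = make_buf(4);

    if (!buf) {
        return -1;
    }
    buf[0] = 42;
    return buf[0]; /* value is read first, then buf is freed at scope exit */
}
```

The annotation belongs to one variable in one scope, so a pointer that has
been stolen and returned is freed only if the receiving variable is
annotated too (or freed by hand).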

 tools/virtiofsd/fuse_lowlevel.c | 31 ---
 tools/virtiofsd/fuse_virtio.c   |  8 +++-
 2 files changed, 15 insertions(+), 24 deletions(-)

diff --git a/tools/virtiofsd/fuse_lowlevel.c b/tools/virtiofsd/fuse_lowlevel.c
index c8bea246ab..7fe2cef1eb 100644
--- a/tools/virtiofsd/fuse_lowlevel.c
+++ b/tools/virtiofsd/fuse_lowlevel.c
@@ -217,9 +217,9 @@ static int send_reply(fuse_req_t req, int error, const void *arg,
 int fuse_reply_iov(fuse_req_t req, const struct iovec *iov, int count)
 {
 int res;
-struct iovec *padded_iov;
+g_autofree struct iovec *padded_iov = NULL;
 
-padded_iov = malloc((count + 1) * sizeof(struct iovec));
+padded_iov = g_try_new(struct iovec, count + 1);
 if (padded_iov == NULL) {
 return fuse_reply_err(req, ENOMEM);
 }
@@ -228,7 +228,6 @@ int fuse_reply_iov(fuse_req_t req, const struct iovec *iov, int count)
 count++;
 
 res = send_reply_iov(req, 0, padded_iov, count);
-free(padded_iov);
 
 return res;
 }
@@ -568,7 +567,7 @@ static struct fuse_ioctl_iovec *fuse_ioctl_iovec_copy(const struct iovec *iov,
 struct fuse_ioctl_iovec *fiov;
 size_t i;
 
-fiov = malloc(sizeof(fiov[0]) * count);
+fiov = g_try_new(struct fuse_ioctl_iovec, count);
 if (!fiov) {
 return NULL;
 }
@@ -586,8 +585,8 @@ int fuse_reply_ioctl_retry(fuse_req_t req, const struct iovec *in_iov,
size_t out_count)
 {
 struct fuse_ioctl_out arg;
-struct fuse_ioctl_iovec *in_fiov = NULL;
-struct fuse_ioctl_iovec *out_fiov = NULL;
+g_autofree struct fuse_ioctl_iovec *in_fiov = NULL;
+g_autofree struct fuse_ioctl_iovec *out_fiov = NULL;
 struct iovec iov[4];
 size_t count = 1;
 int res;
@@ -603,13 +602,14 @@ int fuse_reply_ioctl_retry(fuse_req_t req, const struct iovec *in_iov,
 /* Can't handle non-compat 64bit ioctls on 32bit */
 if (sizeof(void *) == 4 && req->ioctl_64bit) {
 res = fuse_reply_err(req, EINVAL);
-goto out;
+return res;
 }
 
 if (in_count) {
 in_fiov = fuse_ioctl_iovec_copy(in_iov, in_count);
 if (!in_fiov) {
-goto enomem;
+res = fuse_reply_err(req, ENOMEM);
+return res;
 }
 
 iov[count].iov_base = (void *)in_fiov;
@@ -619,7 +619,8 @@ int fuse_reply_ioctl_retry(fuse_req_t req, const struct iovec *in_iov,
 if (out_count) {
 out_fiov = fuse_ioctl_iovec_copy(out_iov, out_count);
 if (!out_fiov) {
-goto enomem;
+res = fuse_reply_err(req, ENOMEM);
+return res;
 }
 
 iov[count].iov_base = (void *)out_fiov;
@@ -628,15 +629,8 @@ int fuse_reply_ioctl_retry(fuse_req_t req, const struct iovec *in_iov,
 }
 
 res = send_reply_iov(req, 0, iov, count);
-out:
-free(in_fiov);
-free(out_fiov);
 
 return res;
-
-enomem:
-res = fuse_reply_err(req, ENOMEM);
-goto out;
 }
 
 int fuse_reply_ioctl(fuse_req_t req, int result, const void *buf, size_t size)
@@ -663,11 +657,11 @@ int fuse_reply_ioctl(fuse_req_t req, int result, const void *buf, size_t size)
 int fuse_reply_ioctl_iov(fuse_req_t req, int result, const struct iovec *iov,
  int count)
 {
-struct iovec *padded_iov;
+g_autofree struct iovec *padded_iov = NULL;
 struct fuse_ioctl_out arg;
 int res;
 
-padded_iov = malloc((count + 2) * sizeof(struct iovec));
+padded_iov = g_try_new(struct iovec, count + 2);
 if (padded_iov == NULL) {
 return fuse_reply_err(req, ENOMEM);
 }
@@ -680,7 +674,6 @@ int fuse_reply_ioctl_iov(fuse_req_t req, int result, const struct iovec *iov,
 memcpy(&padded_iov[2], iov, count * sizeof(struct iovec));
 
 res = send_reply_iov(req, 0, padded_iov, count + 2);
-free(padded_iov);
 
 return res;
 }
diff --git a/tools/virtiofsd/fuse_virtio.c b/tools/virtiofsd/fuse_virtio.c
index 9e437618fb..9b00687cb0 100644
--- a/tools/virtiofsd/fuse_virtio.c
+++ b/tools/virtiofsd/fuse_virtio.c
@@ -295,6 +295,8 @@ int virtio_send_data_iov(struct fuse_session *se, struct fuse_chan *ch,
 VuVirtq

Re: [PATCH] floppy: remove unused function fdctrl_format_sector

2021-04-27 Thread John Snow

On 3/14/21 3:53 AM, Hervé Poussineau wrote:

Le 12/03/2021 à 07:45, John Snow a écrit :

On 1/8/21 6:01 PM, Alexander Bulekov wrote:

fdctrl_format_sector was added in
baca51faff ("updated floppy driver: formatting code, disk geometry 
auto detect (Jocelyn Mayer)")


The single callsite is guarded by a check:
fdctrl->data_state & FD_STATE_FORMAT

However, the only place where the FD_STATE_FORMAT flag is set (in
fdctrl_handle_format_track) is closely followed by the same flag being
unset, with no possibility to call fdctrl_format_sector in between.



Hm, was this code *ever* used? It's hard to tell when we go back into 
the old SVN history.


Does this mean that fdctrl_handle_format_track is also basically an 
incomplete stub method?


I'm in favor of deleting bitrotted code, but I wonder if we should 
take a bigger bite.


--js


The fdctrl_format_sector has been added in SVN revision 671 
(baca51faff03df59386c95d9478ede18b5be5ec8), along with 
FD_STATE_FORMAT/FD_FORMAT_CMD.
As with current code, the only place where the FD_STATE_FORMAT flag was 
set (in fdctrl_handle_format_track) is closely followed by the same flag 
being unset, with no possibility to call fdctrl_format_sector in between.
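
The set-then-clear pattern described above can be sketched as follows. This
is an illustration only, not the code in hw/block/fdc.c: struct ctrl, the
bit value of FD_STATE_FORMAT and all function names here are hypothetical
stand-ins.

```c
#include <stdint.h>

#define FD_STATE_FORMAT 0x02 /* hypothetical bit value, for illustration */

struct ctrl {
    uint8_t data_state;
    int sectors_formatted;
};

/* Stand-in for the per-sector format routine. */
static void format_sector(struct ctrl *c)
{
    c->sectors_formatted++;
}

/* Stand-in for the command handler: sets the flag, then clears it again. */
static void handle_format_track(struct ctrl *c)
{
    c->data_state |= FD_STATE_FORMAT;
    /* ... command parameters would be parsed here ... */
    c->data_state &= ~FD_STATE_FORMAT;
}

/* Stand-in for the single call site guarded by the flag. */
static void handle_data_write(struct ctrl *c)
{
    if (c->data_state & FD_STATE_FORMAT) {
        format_sector(c); /* never reached: the flag is already clear */
    }
}

static int run_format_sequence(void)
{
    struct ctrl c = { 0, 0 };

    handle_format_track(&c);
    handle_data_write(&c);
    return c.sectors_formatted;
}
```

Because no other code path sets the flag, the guarded call is dead code in
this sketch, which is the situation the commit message describes.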


I can however see the following comment:
    /* Bochs BIOS is buggy and don't send format informations
     * for each sector. So, pretend all's done right now...
     */
    fdctrl->data_state &= ~FD_STATE_FORMAT;

which was changed in SVN revision 2295 
(b92090309e5ff7154e4c131438ee2d540e233955) to:

    /* TODO: implement format using DMA expected by the Bochs BIOS
     * and Linux fdformat (read 3 bytes per sector via DMA and fill
     * the sector with the specified fill byte
     */

This probably means that code may have worked without DMA (to be 
confirmed), but was disabled since its introduction due to a problem 
with Bochs BIOS.

Later, fdformat was also tested and not working.

Since then, lots of work has also been done in DMA handling. I am
thinking especially of bb8f32c0318cb6c6e13e09ec0f35e21eff246413, which
fixed a similar problem with floppy drives on the IBM 40p machine.

What happens when this flag unsetting is removed? Does fdformat work now?

I think that we should either fix the code, or remove more code 
(everything related to fdctrl_format_sector, FD_STATE_FORMAT, 
FD_FORMAT_CMD, maybe even fdctrl_handle_format_track).


Regards,

Hervé



Alex, do you want to respin this following Hervé's suggestion for 
additional deletions?


I doubt anyone has the time or interest to actually FIX this code, so we 
may as well remove misleading code.


--js




[PATCH v2] fdc: fix floppy boot for Red Hat Linux 5.2

2021-04-27 Thread John Snow
The image size indicates it's an 81 track floppy disk image, which we
don't have a listing for in the geometry table. When you force the drive
type to 1.44MB, it guesses the reasonably close 18/80. When the drive
type is allowed to auto-detect or set to 2.88, it guesses a very
incorrect geometry.

auto, 144 and 288 drive types get the right geometry with the new entry
in the table.

Reported-by: Michael Tokarev 
Signed-off-by: John Snow 
Reviewed-by: Thomas Huth 

---

V2: I didn't actually stage this, so this is just a re-send to get a
fresh Message-ID to reference in the PR. Added Thomas's R-B.
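
As a rough illustration of why an 81-track image misses the 18/80 entry,
here is a minimal sketch of a size-based geometry lookup. It assumes the
three numeric columns of the table mean sectors per track, track count and
highest head index; that reading is inferred from the "2880" and "3200"
comments in the table and is an assumption. The struct and function names
are hypothetical, and the real lookup in hw/block/fdc.c also takes the
drive type into account, which this sketch does not attempt to reproduce.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified geometry entry. */
struct fd_geometry {
    uint8_t last_sect; /* sectors per track */
    uint8_t max_track; /* number of tracks */
    uint8_t max_head;  /* highest head index: 1 means two sides */
};

static const struct fd_geometry geometries[] = {
    { 18, 80, 1 }, /* 2880 sectors */
    { 18, 81, 1 }, /* 2916 sectors: the kind of entry the patch adds */
    { 20, 80, 1 }, /* 3200 sectors */
};

static uint32_t geometry_sectors(const struct fd_geometry *g)
{
    return (uint32_t)(g->max_head + 1) * g->max_track * g->last_sect;
}

/* Return the entry whose total sector count matches exactly, or NULL. */
static const struct fd_geometry *find_geometry(uint32_t nb_sectors)
{
    size_t i;

    for (i = 0; i < sizeof(geometries) / sizeof(geometries[0]); i++) {
        if (geometry_sectors(&geometries[i]) == nb_sectors) {
            return &geometries[i];
        }
    }
    return NULL;
}
```

With 512-byte sectors, an 18-sector, 81-track, two-sided image holds
2 * 81 * 18 = 2916 sectors, so without the new entry an exact match on size
finds nothing and the code has to guess.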

 hw/block/fdc.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/block/fdc.c b/hw/block/fdc.c
index a825c2acbae..0f0c716d878 100644
--- a/hw/block/fdc.c
+++ b/hw/block/fdc.c
@@ -122,6 +122,7 @@ static const FDFormat fd_formats[] = {
 /* First entry is default format */
 /* 1.44 MB 3"1/2 floppy disks */
 { FLOPPY_DRIVE_TYPE_144, 18, 80, 1, FDRIVE_RATE_500K, }, /* 3.5" 2880 */
+{ FLOPPY_DRIVE_TYPE_144, 18, 81, 1, FDRIVE_RATE_500K, },
 { FLOPPY_DRIVE_TYPE_144, 20, 80, 1, FDRIVE_RATE_500K, }, /* 3.5" 3200 */
 { FLOPPY_DRIVE_TYPE_144, 21, 80, 1, FDRIVE_RATE_500K, },
 { FLOPPY_DRIVE_TYPE_144, 21, 82, 1, FDRIVE_RATE_500K, },
-- 
2.30.2




Re: [PATCH 2/4] hw/block/fdc: Declare shared prototypes in fdc-internal.h

2021-04-27 Thread Philippe Mathieu-Daudé
On 4/27/21 6:53 PM, John Snow wrote:
> On 4/15/21 6:23 AM, Philippe Mathieu-Daudé wrote:
>> We want to extract ISA/SysBus code from the generic fdc.c file.
>> First, declare the prototypes we will access from the new units
>> into a new local header: "fdc-internal.h".
>>
>> Signed-off-by: Philippe Mathieu-Daudé 
>> ---
>>   hw/block/fdc-internal.h | 155 
>>   hw/block/fdc.c  | 126 +++-
>>   MAINTAINERS |   1 +
>>   3 files changed, 164 insertions(+), 118 deletions(-)
>>   create mode 100644 hw/block/fdc-internal.h
>>
> 
> With our policy of not including osdep.h in headers, it's hard to verify
> that this header is otherwise self-sufficient.
> 
> 
> I think the only thing it needs (not in osdep.h) happens to be MAX_FD. I
> added osdep.h just to test:
> 
> jsnow@scv ~/s/q/h/block (review)> gcc -I../../include/ -I../../bin/git
> -I/usr/lib64/glib-2.0/include -I/usr/include/glib-2.0 -c -o
> test_header.bin fdc-internal.h
> fdc-internal.h:134:19: error: ‘MAX_FD’ undeclared here (not in a function)
>   134 | FDrive drives[MAX_FD];
>   |   ^~
> 
> 
> Should we include the fdc header from the internal one?

Yes, good catch, will do.




Re: [PATCH] hw/ide: Fix crash when plugging a piix3-ide device into the x-remote machine

2021-04-27 Thread John Snow

On 4/16/21 8:52 AM, Thomas Huth wrote:

QEMU currently crashes when the user tries to do something like:

  qemu-system-x86_64 -M x-remote -device piix3-ide

This happens because the "isabus" variable is not initialized with
the x-remote machine yet. Add a proper check for this condition
and propagate the error to the caller, so we can fail there gracefully.

Signed-off-by: Thomas Huth 


Seems OK to me for now. I will trust that this won't get in the way of 
any deeper refactors Phil has planned, since this just adds error 
pathways to avoid something already broken, and doesn't change anything 
else.


I'm OK with that.

Reviewed-by: John Snow 

Tentatively staged, pending further discussion.


---
  hw/ide/ioport.c   | 16 ++--
  hw/ide/piix.c | 22 +-
  hw/isa/isa-bus.c  | 14 ++
  include/hw/ide/internal.h |  2 +-
  include/hw/isa/isa.h  | 13 -
  5 files changed, 46 insertions(+), 21 deletions(-)





diff --git a/hw/ide/ioport.c b/hw/ide/ioport.c
index b613ff3bba..e6caa537fa 100644
--- a/hw/ide/ioport.c
+++ b/hw/ide/ioport.c
@@ -50,15 +50,19 @@ static const MemoryRegionPortio ide_portio2_list[] = {
  PORTIO_END_OF_LIST(),
  };
  
-void ide_init_ioport(IDEBus *bus, ISADevice *dev, int iobase, int iobase2)
+int ide_init_ioport(IDEBus *bus, ISADevice *dev, int iobase, int iobase2)
  {
+int ret;
+
  /* ??? Assume only ISA and PCI configurations, and that the PCI-ISA
 bridge has been setup properly to always register with ISA.  */
-isa_register_portio_list(dev, &bus->portio_list,
- iobase, ide_portio_list, bus, "ide");
+ret = isa_register_portio_list(dev, &bus->portio_list,
+   iobase, ide_portio_list, bus, "ide");
  
-if (iobase2) {
-isa_register_portio_list(dev, &bus->portio2_list,
- iobase2, ide_portio2_list, bus, "ide");
+if (ret == 0 && iobase2) {
+ret = isa_register_portio_list(dev, &bus->portio2_list,
+   iobase2, ide_portio2_list, bus, "ide");
  }
+
+return ret;
  }



A little funky: instead of this, you could just check and return after the
first portio list registration, which would leave a little more git blame
intact, but...


...Then again, I just acked a ton of stuff Phil moved around, so, 
whatever O:-)
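
The two shapes being compared can be sketched like this. register_ports()
is a hypothetical stand-in for isa_register_portio_list(), and both wrapper
names are made up for the comparison; only the control flow matters here.

```c
static int register_calls; /* counts registration attempts, for illustration */

/* Hypothetical stand-in: fails when the base address is negative. */
static int register_ports(int base)
{
    register_calls++;
    return base < 0 ? -1 : 0;
}

/* Shape used in the patch: chain the second call on the first result. */
static int init_chained(int iobase, int iobase2)
{
    int ret = register_ports(iobase);

    if (ret == 0 && iobase2) {
        ret = register_ports(iobase2);
    }
    return ret;
}

/* Alternative shape: return early, leaving the second block untouched. */
static int init_early_return(int iobase, int iobase2)
{
    int ret = register_ports(iobase);

    if (ret) {
        return ret;
    }
    if (iobase2) {
        return register_ports(iobase2);
    }
    return 0;
}
```

Both variants skip the second registration when the first one fails, so the
difference is about diff size and blame history rather than behavior.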





Re: [PATCH 0/4] hw/block/fdc: Allow Kconfig-selecting ISA bus/SysBus floppy controllers

2021-04-27 Thread John Snow

On 4/15/21 6:23 AM, Philippe Mathieu-Daudé wrote:

Hi,

The floppy disc controller pulls in irrelevant devices (sysbus in
an ISA-only machine, ISA bus + ISA devices on a sysbus-only machine).

This series cleans that up by extracting each device into its own file,
adding the corresponding Kconfig symbols: FDC_ISA and FDC_SYSBUS.

Regards,

Phil.



Lightly reviewed and I'm fine with it overall; but want a quick ~5min 
up-or-down by Hervé and/or Mark (To make sure this doesn't break any 
non-x86 system they may care about), and a quick nod from Paolo for 
KConfig changes would be nice.


I'll be waiting on a reply to my question on patch 01 before staging.

--js


Philippe Mathieu-Daudé (4):
   hw/block/fdc: Replace disabled fprintf() by trace event
   hw/block/fdc: Declare shared prototypes in fdc-internal.h
   hw/block/fdc: Extract ISA floppy controllers to fdc-isa.c
   hw/block/fdc: Extract SysBus floppy controllers to fdc-sysbus.c

  hw/block/fdc-internal.h | 155 ++
  hw/block/fdc-isa.c  | 313 +
  hw/block/fdc-sysbus.c   | 252 +
  hw/block/fdc.c  | 608 +---
  MAINTAINERS |   3 +
  hw/block/Kconfig|   8 +
  hw/block/meson.build|   2 +
  hw/block/trace-events   |   3 +
  hw/i386/Kconfig |   2 +-
  hw/isa/Kconfig  |   6 +-
  hw/mips/Kconfig |   2 +-
  hw/sparc/Kconfig|   2 +-
  hw/sparc64/Kconfig  |   2 +-
  13 files changed, 751 insertions(+), 607 deletions(-)
  create mode 100644 hw/block/fdc-internal.h
  create mode 100644 hw/block/fdc-isa.c
  create mode 100644 hw/block/fdc-sysbus.c






Re: [PATCH] hw/ide: Fix crash when plugging a piix3-ide device into the x-remote machine

2021-04-27 Thread John Snow

On 4/27/21 1:54 PM, Philippe Mathieu-Daudé wrote:

On 4/27/21 7:16 PM, John Snow wrote:

On 4/27/21 9:54 AM, Stefan Hajnoczi wrote:

I suggest fixing this at the qdev level. Make piix3-ide have a
sub-device that inherits from ISA_DEVICE so it can only be instantiated
when there's an ISA bus.

Stefan


My qdev knowledge is shaky. Does this imply that you agree with the
direction of Thomas's patch, or do you just mean to disagree with Phil
on his preferred course of action?


My understanding is a disagreement with both, with a 3rd direction :)

I agree with Stefan's direction but I'm not sure (yet) that a sub-device
is the best (long-term) solution. I guess there is a design issue with
this device, and I would like to understand it first.

IIUC Stefan says the piix3-ide is both a PCI and an IDE device, but QOM
only allows a single parent. Multiple QOM inheritance is resolved with
interfaces, but PCI/IDE qdevs aren't interfaces, rather abstract objects.
So he suggests embedding an IDE device within the PCI piix3-ide device.

My view is that the PIIX is a chipset that shares things between
components, and the IDE bus belongs to the chipset PCI root (or eventually
the PCI-ISA bridge, function #0). The IDE function would use the IDE bus
from its root parent as a linked property.
My problem is that currently this device is user-creatable as a
Frankenstein single PCI function, outside of its chipset. I'm not sure yet
whether this is a dead end or whether I could work something out.

Regards,

Phil.



It sounds complicated. In the meantime, I think I am in favor of taking 
Thomas's patch because it merely adds some error routing to allow us to 
avoid a crash. The core organizational issues of the IDE device(s) will 
remain and can be solved later as needed.


Do you agree?

--js




Re: [PATCH 4/4] hw/block/fdc: Extract SysBus floppy controllers to fdc-sysbus.c

2021-04-27 Thread John Snow

On 4/15/21 6:23 AM, Philippe Mathieu-Daudé wrote:

Some machines use floppy controllers via the SysBus interface,
and don't need to pull in all the SysBus code.
Extract the SysBus specific code to a new unit: fdc-sysbus.c,
and add a new Kconfig symbol: "FDC_SYSBUS".

Signed-off-by: Philippe Mathieu-Daudé 


LGTM, but again you'll want someone to review the Kconfig changes. It 
looks reasonable to me at a glance, but I just don't know what I don't 
know there.


I'm trusting you somewhat that you've audited the proper dependencies 
for each subsystem and that these have been accurately described via 
Kconfig -- my knowledge of non-x86 platforms is a bit meager, so I am 
relying on CI to tell me if this breaks anything I care about.


Would love to get an ACK from Mark Cave-Ayland and Hervé Poussineau if 
possible, but if they're not available to take a quick peek, we'll try 
to get this in closer to the beginning of the dev window to maximize 
problem-finding time.


Reviewed-by: John Snow 


---
  hw/block/fdc-sysbus.c | 252 ++
  hw/block/fdc.c| 220 
  MAINTAINERS   |   1 +
  hw/block/Kconfig  |   4 +
  hw/block/meson.build  |   1 +
  hw/block/trace-events |   2 +
  hw/mips/Kconfig   |   2 +-
  hw/sparc/Kconfig  |   2 +-
  8 files changed, 262 insertions(+), 222 deletions(-)
  create mode 100644 hw/block/fdc-sysbus.c

diff --git a/hw/block/fdc-sysbus.c b/hw/block/fdc-sysbus.c
new file mode 100644
index 000..71755fd6ae4
--- /dev/null
+++ b/hw/block/fdc-sysbus.c
@@ -0,0 +1,252 @@
+/*
+ * QEMU Floppy disk emulator (Intel 82078)
+ *
+ * Copyright (c) 2003, 2007 Jocelyn Mayer
+ * Copyright (c) 2008 Hervé Poussineau
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "qom/object.h"
+#include "hw/sysbus.h"
+#include "hw/block/fdc.h"
+#include "migration/vmstate.h"
+#include "fdc-internal.h"
+#include "trace.h"
+
+#define TYPE_SYSBUS_FDC "base-sysbus-fdc"
+typedef struct FDCtrlSysBusClass FDCtrlSysBusClass;
+typedef struct FDCtrlSysBus FDCtrlSysBus;
+DECLARE_OBJ_CHECKERS(FDCtrlSysBus, FDCtrlSysBusClass,
+ SYSBUS_FDC, TYPE_SYSBUS_FDC)
+
+struct FDCtrlSysBusClass {
+/*< private >*/
+SysBusDeviceClass parent_class;
+/*< public >*/
+
+bool use_strict_io;
+};
+
+struct FDCtrlSysBus {
+/*< private >*/
+SysBusDevice parent_obj;
+/*< public >*/
+
+struct FDCtrl state;
+};
+
+static uint64_t fdctrl_read_mem(void *opaque, hwaddr reg, unsigned ize)
+{
+return fdctrl_read(opaque, (uint32_t)reg);
+}
+
+static void fdctrl_write_mem(void *opaque, hwaddr reg,
+ uint64_t value, unsigned size)
+{
+fdctrl_write(opaque, (uint32_t)reg, value);
+}
+
+static const MemoryRegionOps fdctrl_mem_ops = {
+.read = fdctrl_read_mem,
+.write = fdctrl_write_mem,
+.endianness = DEVICE_NATIVE_ENDIAN,
+};
+
+static const MemoryRegionOps fdctrl_mem_strict_ops = {
+.read = fdctrl_read_mem,
+.write = fdctrl_write_mem,
+.endianness = DEVICE_NATIVE_ENDIAN,
+.valid = {
+.min_access_size = 1,
+.max_access_size = 1,
+},
+};
+
+static void fdctrl_external_reset_sysbus(DeviceState *d)
+{
+FDCtrlSysBus *sys = SYSBUS_FDC(d);
+FDCtrl *s = &sys->state;
+
+fdctrl_reset(s, 0);
+}
+
+static void fdctrl_handle_tc(void *opaque, int irq, int level)
+{
+trace_fdctrl_tc_pulse(level);
+}
+
+void fdctrl_init_sysbus(qemu_irq irq, int dma_chann,
+hwaddr mmio_base, DriveInfo **fds)
+{
+FDCtrl *fdctrl;
+DeviceState *dev;
+SysBusDevice *sbd;
+FDCtrlSysBus *sys;
+
+dev = qdev_new("sysbus-fdc");
+sys = SYSBUS_FDC(dev);
+fdctrl = &sys->state;
+fdctrl->dma_chann = dma_chann; /* FIXME */
+sbd = SYS_BUS_DEVICE(dev);
+sysbus_realize_and_unref(sbd, &error_fatal);
+sysbus_

Re: [PATCH v11 5/6] KVM: arm64: ioctl to fetch/store tags in a guest

2021-04-27 Thread Catalin Marinas
On Fri, Apr 16, 2021 at 04:43:08PM +0100, Steven Price wrote:
> diff --git a/arch/arm64/include/uapi/asm/kvm.h 
> b/arch/arm64/include/uapi/asm/kvm.h
> index 24223adae150..2b85a047c37d 100644
> --- a/arch/arm64/include/uapi/asm/kvm.h
> +++ b/arch/arm64/include/uapi/asm/kvm.h
> @@ -184,6 +184,20 @@ struct kvm_vcpu_events {
>   __u32 reserved[12];
>  };
>  
> +struct kvm_arm_copy_mte_tags {
> + __u64 guest_ipa;
> + __u64 length;
> + union {
> + void __user *addr;
> + __u64 padding;
> + };
> + __u64 flags;
> + __u64 reserved[2];
> +};

I know Marc asked for some reserved space in here but I'm not sure it's
the right place. And what's with the union of a 64-bit pointer and
64-bit padding, it doesn't change any layout? Maybe add the two reserved
values to the union in case we want to store something else in the
future.

Or maybe I'm missing something, I haven't checked how other KVM ioctls
work.
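
For anyone who wants to check the layout question, here is a minimal
sketch. It mirrors the field order of the struct quoted above using
standard C types; the struct name and the helper function are hypothetical,
and the kernel's __user annotation is dropped. One property of such a
union, stated as a general C observation and not as the patch author's
intent, is that the 64-bit member keeps the slot 8 bytes wide even when
pointers are 32 bits.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the quoted struct, using standard C types. */
struct copy_tags_sketch {
    uint64_t guest_ipa;
    uint64_t length;
    union {
        void *addr;       /* 4 or 8 bytes, depending on the ABI */
        uint64_t padding; /* always 8 bytes, so the union is too */
    };
    uint64_t flags;
    uint64_t reserved[2];
};

/* These hold for both 32-bit and 64-bit pointer sizes. */
_Static_assert(sizeof(struct copy_tags_sketch) == 48,
               "struct size should not depend on pointer width");
_Static_assert(offsetof(struct copy_tags_sketch, flags) == 24,
               "flags offset should not depend on pointer width");

static size_t tags_struct_size(void)
{
    return sizeof(struct copy_tags_sketch);
}
```

On a 64-bit-only ABI the union indeed changes nothing about the layout,
which matches the observation above.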

-- 
Catalin



Re: [PATCH 01/22] qapi/parser: Don't try to handle file errors

2021-04-27 Thread John Snow

On 4/27/21 9:47 AM, Markus Armbruster wrote:

John Snow  writes:


On 4/23/21 11:46 AM, Markus Armbruster wrote:

John Snow  writes:


The short-ish version of what motivates this patch is:

- The parser initializer does not possess adequate context to write a
good error message -- It tries to determine the caller's semantic
context.


I'm not sure I get what you're trying to say here.



I mean: this __init__ method does not *know* who is calling it or why.
Of course, *we* do, because the code base is finite and nobody else but
us is calling into it.

I mean to point out that the initializer has to do extra work (Just a
little) to determine what the calling context is and raise an error
accordingly.

Example: If we have a parent info context, we raise an error in the
context of the caller. If we don't, we have to create a new presumed
context (using the weird None SourceInfo object).


I guess you mean

 raise QAPISemError(incl_info or QAPISourceInfo(None, None, None),

I can't see other instances of messing with context.



Yes, and the string construction that follows, too. It's all about 
trying to understand who our caller is and raising an error appropriate 
for them on their behalf.



So I just mean to say:

"Let the caller, who unambiguously always has the exactly correct
context worry about what the error message ought to be."


- We don't want to allow QAPISourceInfo(None, None, None) to exist.
- Errors made using such an object are currently incorrect.
- It's not technically a semantic error if we cannot open the schema
- There are various typing constraints that make mixing these two cases
undesirable for a single special case.


These I understand.


- The current open block in parser's initializer will leak file
pointers, because it isn't using a with statement.


Uh, isn't the value returned by open() reference-counted?  @fp is the
only reference...



Yeah, eventually. O:-)

Whenever the GC runs. OK, it's not really an apocalypse error, but it
felt strange to rewrite a try/except and then write it using bad hygiene
on purpose in the name of a more isolated commit.


I agree use of with is an improvement (it's idiomatic).  We shouldn't
call it a leak fix, though.



OK. I'll reword it.


Here's the details in why this got written the way it did, and why a few
disparate issues are rolled into one commit. (They're hard to fix
separately without writing really weird stuff that'd be harder to
review.)

The error message string here is incorrect:


python3 qapi-gen.py 'fake.json'

qapi-gen.py: qapi-gen.py: can't read schema file 'fake.json': No such file or directory


Regressed in commit 52a474180a "qapi-gen: Separate arg-parsing from
generation" (v5.2.0).



Mea Culpa. Didn't realize it wasn't tested, and I didn't realize at the
time that the two kinds of errors here were treated differently.


Our tests cover the schema language, not qapi-gen's CLI language.  The
gap feels tolerable.


Before commit c615550df3 "qapi: Improve source file read error handling"
(v4.2.0), it was differently bad (uncaught exception).

Commit c615550df3 explains why the funny QAPISourceInfo exists:

  Reporting open or read failure for the main schema file needs a
  QAPISourceInfo representing "no source".  Make QAPISourceInfo cope
  with fname=None.



I am apparently not the first or the last person to dream of wanting a
QAPISourceInfo that represents "Actually, there's no source location!"


The commit turned QAPISourceInfo into the equivalent of a disjoint union
of

1. A position in a source file (.fname is a str)

2. "Not in any source file" (.fname is None)

This is somewhat similar to struct Location in C, which has

1. LOC_FILE: a position in a source file

2. LOC_CMDLINE: a range of command line arguments

3. LOC_NONE: no location information

Abstracting locations this way lets error_report() do the right thing
whether it's complaining about the command line, a monitor command, or a
configuration file read with -readconfig.
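
To make the "disjoint union" idea concrete, here is a minimal sketch of a
tagged location type with the three kinds listed above. It is not QEMU's
actual struct Location; the field names, the loc_format() helper and the
output format are assumptions made up for this illustration.

```c
#include <stddef.h>
#include <stdio.h>

/* The three kinds named above; the layout around them is hypothetical. */
enum loc_kind { LOC_NONE, LOC_CMDLINE, LOC_FILE };

struct location {
    enum loc_kind kind;
    const char *fname; /* LOC_FILE only */
    int line;          /* LOC_FILE only */
    int first_arg;     /* LOC_CMDLINE only */
    int nargs;         /* LOC_CMDLINE only */
};

/* Format an error-message prefix appropriate for the kind of location. */
static int loc_format(const struct location *loc, char *buf, size_t len)
{
    switch (loc->kind) {
    case LOC_FILE:
        return snprintf(buf, len, "%s:%d: ", loc->fname, loc->line);
    case LOC_CMDLINE:
        return snprintf(buf, len, "args %d..%d: ", loc->first_arg,
                        loc->first_arg + loc->nargs - 1);
    case LOC_NONE:
    default:
        if (len) {
            buf[0] = '\0'; /* no location: no prefix at all */
        }
        return 0;
    }
}
```

The reporting code switches on the kind once, so callers never need to
special-case a missing file name themselves.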

Your patch demonstrates that qapi-gen has much less need for abstracting
sources: we use 2. "Not in any source file" only for reading the main
schema file.



Yes. I got the impression that you didn't want to pursue more abstract
QSI constructs based on earlier work, so going the other way and
*removing* them seemed like the faster way to achieve a clean type
system here.


In pursuing it, we find that QAPISourceInfo has a special accommodation
for when there's no filename.


Yes:

      def loc(self) -> str:
  -->     if self.fname is None:
  -->         return sys.argv[0]
          ret = self.fname
          if self.line is not None:
              ret += ':%d' % self.line
          return ret


Meanwhile, we intend to type info.fname as
str; something we always have.


Do you mean "as non-optional str"?



Yeah. I typed it originally as `str`, but the analyzer missed that we
check the field to see if it's None, whic

Re: [PATCH] hw/ide: Fix crash when plugging a piix3-ide device into the x-remote machine

2021-04-27 Thread Philippe Mathieu-Daudé
On 4/27/21 7:16 PM, John Snow wrote:
> On 4/27/21 9:54 AM, Stefan Hajnoczi wrote:
>> I suggest fixing this at the qdev level. Make piix3-ide have a
>> sub-device that inherits from ISA_DEVICE so it can only be instantiated
>> when there's an ISA bus.
>>
>> Stefan
> 
> My qdev knowledge is shaky. Does this imply that you agree with the
> direction of Thomas's patch, or do you just mean to disagree with Phil
> on his preferred course of action?

My understanding is that he disagrees with both, and suggests a 3rd
direction :)

I agree with Stefan's direction, but I'm not sure (yet) that a sub-device
is the best (long-term) solution. I guess there is a design issue with
this device, and I would like to understand it first.

IIUC Stefan says the piix3-ide is both a PCI and an IDE device, but QOM
only allows a single parent. Multiple QOM inheritance is resolved with
interfaces, but the PCI/IDE qdevs aren't interfaces, rather abstract
objects. So he suggests embedding an IDE device within the PCI piix3-ide
device.

My view is that the PIIX is a chipset that shares things between its
components, and the IDE bus belongs to the chipset PCI root (or
eventually the PCI-ISA bridge, function #0). The IDE function would use
the IDE bus from its root parent as a linked property.
My problem is that this device is currently user-creatable as a
Frankenstein single PCI function, outside of its chipset. I'm not sure
yet whether this is a dead end or whether I can work something out.

Regards,

Phil.




Re: [PATCH 1/1] Set TARGET_PAGE_BITS to be 10 instead of 8 bits

2021-04-27 Thread Dr. David Alan Gilbert
* Peter Maydell (peter.mayd...@linaro.org) wrote:
> On Sun, 11 Apr 2021 at 16:15, Richard Henderson
>  wrote:
> >
> > On 4/10/21 10:24 AM, Michael Rolnik wrote:
> > > Please review.
> >
> >
> > The first 256b is i/o, the next 768b are ram.  But having changed the page
> > size, it should mean that the first 1k are now treated as i/o.
> >
> > We do have a path by which instructions in i/o pages can be executed.  This
> > happens on some ARM board setups during cold boot.  But we do not save those
> > translations, so they run much much slower than they should.
> >
> > But perhaps in the case of AVR, "much much slower" really isn't visible?
> >
> > In general, I think changing the page size is wrong.  I also assume that
> > migration is largely irrelevant to this target.
> 
> Migration is irrelevant, but every target benefits from snapshot
> save-and-restore, and I think that uses the same codepaths ?

Yes it does.

My main reason for wanting this fixed is that I really wanted to add an
assert to stop us tripping over the page size/migration bits clash.

Dave

> -- PMM
> 
-- 
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK




Re: [PATCH] Set the correct env->fpip for x86 float instructions [cleaned]

2021-04-27 Thread Richard Henderson

On 4/16/21 8:34 AM, Ziqiao Kong wrote:

+++ b/target/i386/tcg/translate.c
@@ -6337,7 +6337,10 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
  goto unknown_op;
  }
  }
+tcg_gen_movi_tl(s->tmp0, pc_start - s->cs_base);
+tcg_gen_st_tl(s->tmp0, cpu_env, offsetof(CPUX86State, fpip));


This placement is wrong because it catches instructions that should not modify 
FIP, like FINIT.


It might be best to set a flag around this case like

  bool update_fip;

  case 0xd8 ... 0xdf:
      ...
      update_fip = true;
      if (mod != 3) {
          ...
      } else {
          ...
      }
      if (update_fip) {
          ...
      }
      break;

and set update_fip to false for the set of insns that either do not update FIP
or clear it (8.1.8 "x87 FPU Instruction and Data (Operand) Pointers").
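
As a rough illustration of the flag idea above, here is a small model
written in Python for brevity (it is not QEMU code). The function name
and the NO_FIP_UPDATE table are hypothetical, and the table is
deliberately NOT exhaustive: it lists only three register-form control
instructions, and the memory-form ones (FLDCW, FNSTCW, FLDENV, ...) are
left out.

```python
# Illustrative, NOT exhaustive: (opcode byte, modrm byte) pairs for a few
# register-form control insns that must skip the generic FIP store.
NO_FIP_UPDATE = {
    (0xDB, 0xE2),  # FNCLEX
    (0xDB, 0xE3),  # FNINIT (clears FIP itself; that part is not modelled)
    (0xDF, 0xE0),  # FNSTSW AX
}


def wants_fip_update(opcode: int, modrm: int) -> bool:
    """Model of the update_fip flag for one x87 escape instruction."""
    if not 0xD8 <= opcode <= 0xDF:
        raise ValueError('not an x87 escape opcode')
    update_fip = True          # default for non-control instructions
    mod = (modrm >> 6) & 3     # top two bits of the ModRM byte
    if mod == 3 and (opcode, modrm) in NO_FIP_UPDATE:
        update_fip = False     # control insn: leave FIP to its own handling
    return update_fip
```

In the translator, the store of pc_start - s->cs_base into fpip would then
be emitted only when the flag is still true at the end of the case.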


I notice you're not saving FCS to go along with this, at least for 
CPUID.(EAX=07H,ECX=0H):EBX[bit 13] = 0.


And if you're going to this trouble, you might want to think about FDP+FDS as 
well.  It should be about the same amount of effort.



r~



[PATCH v2 15/15] target/ppc: Check cpu flags for prefixed insn support

2021-04-27 Thread Luis Pires
Prefixed instructions were introduced in Power ISA 3.1

Signed-off-by: Luis Pires 
---
 target/ppc/translate.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index 7422ea4e13..f4802a4f08 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -7837,7 +7837,11 @@ static bool ppc_tr_breakpoint_check(DisasContextBase *dcbase, CPUState *cs,
 
 static bool is_prefix_insn(DisasContext *ctx, uint32_t insn)
 {
-/* TODO: Check ctx->insns_flags* for whether prefixes are supported. */
+if (!(ctx->insns_flags2 & PPC2_ISA310)) {
+/* Prefixed instructions are not supported */
+return false;
+}
+
 return opc1(insn) == 1;
 }
 
-- 
2.25.1
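
The check added to is_prefix_insn() in the patch above can be modelled as
follows, in Python for brevity. This is a sketch under assumptions: the
numeric value given to PPC2_ISA310 here is made up (QEMU defines its own),
and opc1() is written out as the top 6 bits of the 32-bit instruction word.

```python
PPC2_ISA310 = 1 << 0  # hypothetical bit value, for illustration only


def opc1(insn: int) -> int:
    """Primary opcode: the most significant 6 bits of a 32-bit word."""
    return (insn >> 26) & 0x3F


def is_prefix_insn(insns_flags2: int, insn: int) -> bool:
    """Model of the patched check: prefixes exist only with ISA 3.1."""
    if not (insns_flags2 & PPC2_ISA310):
        # Prefixed instructions are not supported on this CPU model.
        return False
    return opc1(insn) == 1
```

With the flag clear, a word whose primary opcode is 1 is no longer treated
as a prefix and goes through the normal 32-bit decode path instead.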




Re: [PATCH v11 1/6] arm64: mte: Sync tags for pages where PTE is untagged

2021-04-27 Thread Catalin Marinas
On Fri, Apr 16, 2021 at 04:43:04PM +0100, Steven Price wrote:
> A KVM guest could store tags in a page even if the VMM hasn't mapped
> the page with PROT_MTE. So when restoring pages from swap we will
> need to check to see if there are any saved tags even if !pte_tagged().
> 
> However don't check pages which are !pte_valid_user() as these will
> not have been swapped out.

You should remove the pte_valid_user() mention from the commit log as
well.

> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> index e17b96d0e4b5..cf4b52a33b3c 100644
> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -312,7 +312,7 @@ static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
>   __sync_icache_dcache(pte);
>  
>   if (system_supports_mte() &&
> - pte_present(pte) && pte_tagged(pte) && !pte_special(pte))
> + pte_present(pte) && (pte_val(pte) & PTE_USER) && !pte_special(pte))

I would add a pte_user() macro here or, if we restore the tags only when
the page is readable, use pte_access_permitted(pte, false). Also add a
comment explaining why we do this.

There's also the pte_user_exec() case which may not have the PTE_USER
set (exec-only permission) but I don't think it matters. We don't do tag
checking on instruction fetches, so if the user adds a PROT_READ to it,
it would go through set_pte_at() again. I'm not sure KVM does anything
special with exec-only mappings at stage 2, I suspect they won't be
accessible by the guest (but needs checking).

>   mte_sync_tags(ptep, pte);
>  
>   __check_racy_pte_update(mm, ptep, pte);
> diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
> index b3c70a612c7a..e016ab57ea36 100644
> --- a/arch/arm64/kernel/mte.c
> +++ b/arch/arm64/kernel/mte.c
> @@ -26,17 +26,23 @@ u64 gcr_kernel_excl __ro_after_init;
>  
>  static bool report_fault_once = true;
>  
> -static void mte_sync_page_tags(struct page *page, pte_t *ptep, bool check_swap)
> +static void mte_sync_page_tags(struct page *page, pte_t *ptep, bool check_swap,
> +bool pte_is_tagged)
>  {
>   pte_t old_pte = READ_ONCE(*ptep);
>  
>   if (check_swap && is_swap_pte(old_pte)) {
>   swp_entry_t entry = pte_to_swp_entry(old_pte);
>  
> - if (!non_swap_entry(entry) && mte_restore_tags(entry, page))
> + if (!non_swap_entry(entry) && mte_restore_tags(entry, page)) {
> + set_bit(PG_mte_tagged, &page->flags);
>   return;
> + }
>   }
>  
> + if (!pte_is_tagged || test_and_set_bit(PG_mte_tagged, &page->flags))
> + return;

I don't think we need another test_bit() here, it was done in the
caller (bar potential races which need more thought).

> +
>   page_kasan_tag_reset(page);
>   /*
>* We need smp_wmb() in between setting the flags and clearing the
> @@ -54,11 +60,13 @@ void mte_sync_tags(pte_t *ptep, pte_t pte)
>   struct page *page = pte_page(pte);
>   long i, nr_pages = compound_nr(page);
>   bool check_swap = nr_pages == 1;
> + bool pte_is_tagged = pte_tagged(pte);
>  
>   /* if PG_mte_tagged is set, tags have already been initialised */
>   for (i = 0; i < nr_pages; i++, page++) {
> - if (!test_and_set_bit(PG_mte_tagged, &page->flags))
> - mte_sync_page_tags(page, ptep, check_swap);
> + if (!test_bit(PG_mte_tagged, &page->flags))
> + mte_sync_page_tags(page, ptep, check_swap,
> +pte_is_tagged);
>   }
>  }

You were right in the previous thread that if we have a race, it's
already there even without your KVM patches.

If it's the same pte in a multithreaded app, we should be ok as the core
code holds the ptl (the arch code also holds the mmap_lock during
exception handling but only as a reader, so you can have multiple
holders).

If there are multiple ptes to the same page, for example mapped with
MAP_ANONYMOUS | MAP_SHARED, metadata recovery is done via
arch_swap_restore() before we even set the pte and with the page locked.
So calling lock_page() again in mte_restore_tags() would deadlock.

I can see that do_swap_page() also holds the page lock around
set_pte_at(), so I think we are covered.

Any other scenario I may have missed? My understanding is that if the
pte is the same, we have the ptl. Otherwise we have the page lock for
shared pages.

-- 
Catalin



[PATCH v2 13/15] target/ppc: Move D/DS/X-form integer stores to decodetree

2021-04-27 Thread Luis Pires
From: Richard Henderson 

These are all connected by macros in the legacy decoding.

Signed-off-by: Richard Henderson 
---
 target/ppc/insn32.decode   | 22 ++
 target/ppc/translate.c | 85 +-
 target/ppc/translate/fixedpoint-impl.c.inc | 84 +
 3 files changed, 109 insertions(+), 82 deletions(-)

diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index bf39ce5c15..df92f11558 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -57,6 +57,28 @@ LDU 111010 . . ..01 @DS
 LDX 01 . . . 010101 -   @X
 LDUX01 . . . 110101 -   @X
 
+### Fixed-Point Store Instructions
+
+STB 100110 . .  @D
+STBU100111 . .  @D
+STBX01 . . . 0011010111 -   @X
+STBUX   01 . . . 000111 -   @X
+
+STH 101100 . .  @D
+STHU101101 . .  @D
+STHX01 . . . 0110110111 -   @X
+STHUX   01 . . . 0110010111 -   @X
+
+STW 100100 . .  @D
+STWU100101 . .  @D
+STWX01 . . . 0010010111 -   @X
+STWUX   01 . . . 0010110111 -   @X
+
+STD 10 . . ..00 @DS
+STDU10 . . ..01 @DS
+STDX01 . . . 0010010101 -   @X
+STDUX   01 . . . 0010110101 -   @X
+
 ### Fixed-Point Arithmetic Instructions
 
 ADDI001110 . .  @D
diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index a1f0e59afd..7422ea4e13 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -2481,7 +2481,9 @@ static void glue(gen_qemu_, stop)(DisasContext *ctx,  
  \
 tcg_gen_qemu_st_tl(val, addr, ctx->mem_idx, op);\
 }
 
+#if defined(TARGET_PPC64) || !defined(CONFIG_USER_ONLY)
 GEN_QEMU_STORE_TL(st8,  DEF_MEMOP(MO_UB))
+#endif
 GEN_QEMU_STORE_TL(st16, DEF_MEMOP(MO_UW))
 GEN_QEMU_STORE_TL(st32, DEF_MEMOP(MO_UL))
 
@@ -2614,52 +2616,6 @@ static void gen_lq(DisasContext *ctx)
 #endif
 
 /***  Integer store***/
-#define GEN_ST(name, stop, opc, type) \
-static void glue(gen_, name)(DisasContext *ctx)   \
-{ \
-TCGv EA;  \
-gen_set_access_type(ctx, ACCESS_INT); \
-EA = tcg_temp_new();  \
-gen_addr_imm_index(ctx, EA, 0);   \
-gen_qemu_##stop(ctx, cpu_gpr[rS(ctx->opcode)], EA);   \
-tcg_temp_free(EA);\
-}
-
-#define GEN_STU(name, stop, opc, type)\
-static void glue(gen_, stop##u)(DisasContext *ctx)\
-{ \
-TCGv EA;  \
-if (unlikely(rA(ctx->opcode) == 0)) { \
-gen_inval_exception(ctx, POWERPC_EXCP_INVAL_INVAL);   \
-return;   \
-} \
-gen_set_access_type(ctx, ACCESS_INT); \
-EA = tcg_temp_new();  \
-if (type == PPC_64B)  \
-gen_addr_imm_index(ctx, EA, 0x03);\
-else  \
-gen_addr_imm_index(ctx, EA, 0);   \
-gen_qemu_##stop(ctx, cpu_gpr[rS(ctx->opcode)], EA);   \
-tcg_gen_mov_tl(cpu_gpr[rA(ctx->opcode)], EA); \
-tcg_temp_free(EA);\
-}
-
-#define GEN_STUX(name, stop, opc2, opc3, type)\
-static void glue(gen_, name##ux)(DisasContext *ctx)   \
-{ \
-TCGv EA;  \
-if (unlikely(rA(ctx->op

[PATCH v2 12/15] target/ppc: Implement prefixed integer load instructions

2021-04-27 Thread Luis Pires
From: Richard Henderson 

Signed-off-by: Richard Henderson 
---
 target/ppc/insn64.decode   | 15 ++
 target/ppc/translate/fixedpoint-impl.c.inc | 60 ++
 2 files changed, 75 insertions(+)

diff --git a/target/ppc/insn64.decode b/target/ppc/insn64.decode
index 9bef32a845..2e08d89e62 100644
--- a/target/ppc/insn64.decode
+++ b/target/ppc/insn64.decode
@@ -26,6 +26,21 @@
 .. rt:5 ra:5    \
 &PLS_D si=%pls_si
 
+### Fixed-Point Load Instructions
+
+PLBZ01 10 0--.-- .. \
+100010 . .  @PLS_D
+PLHZ01 10 0--.-- .. \
+101000 . .  @PLS_D
+PLHA01 10 0--.-- .. \
+101010 . .  @PLS_D
+PLWZ01 10 0--.-- .. \
+10 . .  @PLS_D
+PLWA01 00 0--.-- .. \
+101001 . .  @PLS_D
+PLD 01 00 0--.-- .. \
+111001 . .  @PLS_D
+
 ### Fixed-Point Arithmetic Instructions
 
 PADDI   01 10 0--.-- .. \
diff --git a/target/ppc/translate/fixedpoint-impl.c.inc b/target/ppc/translate/fixedpoint-impl.c.inc
index e15e379931..80f849fc4a 100644
--- a/target/ppc/translate/fixedpoint-impl.c.inc
+++ b/target/ppc/translate/fixedpoint-impl.c.inc
@@ -218,6 +218,66 @@ static bool trans_LDUX(DisasContext *ctx, arg_X *a)
 return do_ldst_X(ctx, a, true, false, MO_Q);
 }
 
+static bool do_ldst_PLS_D(DisasContext *ctx, arg_PLS_D *a,
+  bool store, MemOp mop)
+{
+TCGv ea;
+
+if (!resolve_PLS_D(ctx, a)) {
+return false;
+}
+gen_set_access_type(ctx, ACCESS_INT);
+
+ea = tcg_temp_new();
+if (a->ra) {
+tcg_gen_addi_tl(ea, cpu_gpr[a->ra], a->si);
+} else {
+tcg_gen_movi_tl(ea, a->si);
+}
+if (NARROW_MODE(ctx)) {
+tcg_gen_ext32u_tl(ea, ea);
+}
+mop ^= ctx->default_tcg_memop_mask;
+if (store) {
+tcg_gen_qemu_st_tl(cpu_gpr[a->rt], ea, ctx->mem_idx, mop);
+} else {
+tcg_gen_qemu_ld_tl(cpu_gpr[a->rt], ea, ctx->mem_idx, mop);
+}
+tcg_temp_free(ea);
+
+return true;
+}
+
+static bool trans_PLBZ(DisasContext *ctx, arg_PLS_D *a)
+{
+return do_ldst_PLS_D(ctx, a, false, MO_UB);
+}
+
+static bool trans_PLHZ(DisasContext *ctx, arg_PLS_D *a)
+{
+return do_ldst_PLS_D(ctx, a, false, MO_UW);
+}
+
+static bool trans_PLHA(DisasContext *ctx, arg_PLS_D *a)
+{
+return do_ldst_PLS_D(ctx, a, false, MO_SW);
+}
+
+static bool trans_PLWZ(DisasContext *ctx, arg_PLS_D *a)
+{
+return do_ldst_PLS_D(ctx, a, false, MO_UL);
+}
+
+static bool trans_PLWA(DisasContext *ctx, arg_PLS_D *a)
+{
+return do_ldst_PLS_D(ctx, a, false, MO_SL);
+}
+
+static bool trans_PLD(DisasContext *ctx, arg_PLS_D *a)
+{
+return do_ldst_PLS_D(ctx, a, false, MO_Q);
+}
+
 static bool trans_ADDI(DisasContext *ctx, arg_D *a)
 {
 if (a->ra) {
-- 
2.25.1
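
The effective-address computation in do_ldst_PLS_D() above can be modelled
like this, in Python for brevity. It is only a sketch: a 64-bit target is
assumed, the function name pls_d_ea is made up, and resolve_PLS_D() (not
shown in this patch) is assumed to have already produced the final ra and
si values.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1


def pls_d_ea(gpr, ra, si, narrow_mode=False):
    """Model the EA for a prefixed D-form load/store (64-bit target)."""
    if ra:
        # Base register plus displacement, wrapping at 64 bits.
        ea = (gpr[ra] + si) & MASK64
    else:
        # ra == 0 means "no base register": the EA is just the displacement.
        ea = si & MASK64
    if narrow_mode:
        # 32-bit mode: keep only the low 32 bits, zero-extended.
        ea &= MASK32
    return ea
```

For example, with gpr[3] = 0x1000, pls_d_ea(gpr, 3, 0x20) gives 0x1020, and
a negative displacement with ra == 0 wraps to the top of the address space.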




[PATCH v2 14/15] target/ppc: Implement prefixed integer store instructions

2021-04-27 Thread Luis Pires
From: Richard Henderson 

Signed-off-by: Richard Henderson 
---
 target/ppc/insn64.decode   | 12 
 target/ppc/translate/fixedpoint-impl.c.inc | 20 
 2 files changed, 32 insertions(+)

diff --git a/target/ppc/insn64.decode b/target/ppc/insn64.decode
index 2e08d89e62..0f3b0b2725 100644
--- a/target/ppc/insn64.decode
+++ b/target/ppc/insn64.decode
@@ -41,6 +41,18 @@ PLWA01 00 0--.-- .. \
 PLD 01 00 0--.-- .. \
 111001 . .  @PLS_D
 
+### Fixed-Point Store Instructions
+
+PSTW01 10 0--.-- .. \
+100100 . .  @PLS_D
+PSTB01 10 0--.-- .. \
+100110 . .  @PLS_D
+PSTH01 10 0--.-- .. \
+101100 . .  @PLS_D
+
+PSTD01 00 0--.-- .. \
+01 . .  @PLS_D
+
 ### Fixed-Point Arithmetic Instructions
 
 PADDI   01 10 0--.-- .. \
diff --git a/target/ppc/translate/fixedpoint-impl.c.inc b/target/ppc/translate/fixedpoint-impl.c.inc
index b36011a539..4ba477eb93 100644
--- a/target/ppc/translate/fixedpoint-impl.c.inc
+++ b/target/ppc/translate/fixedpoint-impl.c.inc
@@ -362,6 +362,26 @@ static bool trans_PLD(DisasContext *ctx, arg_PLS_D *a)
 return do_ldst_PLS_D(ctx, a, false, MO_Q);
 }
 
+static bool trans_PSTB(DisasContext *ctx, arg_PLS_D *a)
+{
+return do_ldst_PLS_D(ctx, a, true, MO_UB);
+}
+
+static bool trans_PSTH(DisasContext *ctx, arg_PLS_D *a)
+{
+return do_ldst_PLS_D(ctx, a, true, MO_UW);
+}
+
+static bool trans_PSTW(DisasContext *ctx, arg_PLS_D *a)
+{
+return do_ldst_PLS_D(ctx, a, true, MO_UL);
+}
+
+static bool trans_PSTD(DisasContext *ctx, arg_PLS_D *a)
+{
+return do_ldst_PLS_D(ctx, a, true, MO_Q);
+}
+
 static bool trans_ADDI(DisasContext *ctx, arg_D *a)
 {
 if (a->ra) {
-- 
2.25.1



