Re: [PATCH v5 8/8] hw/mem/cxl_type3: Add CXL RAS Error Injection Support.

2023-02-22 Thread Markus Armbruster
Thomas Huth  writes:

> On 22/02/2023 19.16, Philippe Mathieu-Daudé wrote:
>> +Thomas (meson) & Marc-André (conditional QAPI)
>
> + Markus
>
>> On 22/2/23 17:49, Jonathan Cameron wrote:

[...]

>> Don't these need
>>
>>     'if': 'CONFIG_CXL_MEM_DEVICE',
>>
>> ?
>
> If I make this change I get a bunch of
>
> ./qapi/qapi-types-cxl.h:18:13: error: attempt to use poisoned "CONFIG_CXL_MEM_DEVICE"
>  18 | #if defined(CONFIG_CXL_MEM_DEVICE)

 Err, I meant the generic CONFIG_CXL, not CONFIG_CXL_MEM_DEVICE.

> It's a target specific define (I think) as it's built alongside PCI_EXPRESS.
> Only CXL_ACPI is specifically included by x86 and arm64 (out of tree)
>
> To be honest though I don't fully understand the QEMU build system, so my
> explanation of the error might be wrong.

 You need to restrict to system emulation (the 'have_system' check):
>>>
>>> This doesn't help - still have
>>> attempt to use poisoned "CONFIG_CXL"
>
> Not sure how the QAPI generator works, but target specific config switches 
> can only be used in target specific json files there, so that's 
> machine-target.json and misc-target.json currently, as far as I know. Not 
> sure how the QAPI generator distinguishes between common and target specific 
> code, though ... just by the "-target" suffix? Maybe Markus or Marc-André can 
> comment on that.

Whenever you use a poisoned macro in a conditional, all the code
generated for this .json file (we call it a "QAPI schema module")
becomes target-dependent.  The QAPI code generator itself is blissfully
unaware of this.

Since target-dependent code needs to be compiled differently, the build
process needs to know which modules are target-dependent.  We do this
in one of the stupidest ways that could possibly work: a module is
target-dependent if its name ends with "-target".  There are just two
right now: qapi/machine-target.json and qapi/misc-target.json.

The logic resides in qapi/meson.build.  Look for

if module.endswith('-target')

Questions?
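Markus's naming rule can be illustrated with a small sketch (the module
list here is illustrative, not QEMU's actual list; the real logic lives in
qapi/meson.build):

```python
# Sketch of the module-classification rule described above: a QAPI schema
# module is treated as target-dependent iff its name ends in '-target'.
def is_target_specific(module: str) -> bool:
    return module.endswith('-target')

modules = ['machine', 'machine-target', 'misc', 'misc-target', 'cxl']
target_specific = [m for m in modules if is_target_specific(m)]
print(target_specific)  # ['machine-target', 'misc-target']
```

So a conditional on a poisoned, target-specific macro is only usable from a
module whose name carries the '-target' suffix.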

[...]




Re: [PATCH v3 1/1] vhost-user-fs: add migration type property

2023-02-22 Thread Michael S. Tsirkin
On Wed, Feb 22, 2023 at 03:21:42PM -0500, Michael S. Tsirkin wrote:
> On Wed, Feb 22, 2023 at 08:25:19PM +0200, Anton Kuchin wrote:
> > On 22/02/2023 19:12, Michael S. Tsirkin wrote:
> > > On Wed, Feb 22, 2023 at 07:05:47PM +0200, Anton Kuchin wrote:
> > > > On 22/02/2023 18:51, Michael S. Tsirkin wrote:
> > > > > On Wed, Feb 22, 2023 at 06:49:10PM +0200, Anton Kuchin wrote:
> > > > > > On 22/02/2023 17:14, Vladimir Sementsov-Ogievskiy wrote:
> > > > > > > On 22.02.23 17:25, Anton Kuchin wrote:
> > > > > > > > > > > +static int vhost_user_fs_pre_save(void *opaque)
> > > > > > > > > > > +{
> > > > > > > > > > > +    VHostUserFS *fs = opaque;
> > > > > > > > > > > +    g_autofree char *path = object_get_canonical_path(OBJECT(fs));
> > > > > > > > > > > +
> > > > > > > > > > > +    switch (fs->migration_type) {
> > > > > > > > > > > +    case VHOST_USER_MIGRATION_TYPE_NONE:
> > > > > > > > > > > +        error_report("Migration is blocked by device %s", path);
> > > > > > > > > > > +        break;
> > > > > > > > > > > +    case VHOST_USER_MIGRATION_TYPE_EXTERNAL:
> > > > > > > > > > > +        return 0;
> > > > > > > > > > > +    default:
> > > > > > > > > > > +        error_report("Migration type '%s' is not supported by device %s",
> > > > > > > > > > > +                     VhostUserMigrationType_str(fs->migration_type), path);
> > > > > > > > > > > +        break;
> > > > > > > > > > > +    }
> > > > > > > > > > > +
> > > > > > > > > > > +    return -1;
> > > > > > > > > > > +}
> > > > > > > > > > Should we also add this as .pre_load, to force the user to
> > > > > > > > > > select the correct migration_type on the target too?
> > > > > > > > > In fact, I would claim we only want pre_load.
> > > > > > > > > When qemu is started on the destination we know where it's
> > > > > > > > > migrated from, so this flag can be set.
> > > > > > > > > When qemu is started on the source we generally do not yet
> > > > > > > > > know, so we don't know whether it's safe to set this flag.
> > > > > > > But the destination is a "source" for the next migration, so there
> > > > > > > shouldn't be a real difference.
> > > > > > > The new property has ".realized_set_allowed = true", so, as I
> > > > > > > understand it, it may be changed at any time, so that's not a
> > > > > > > problem.
> > > > > > Yes, exactly. So the destination's property sets not how it will
> > > > > > handle this incoming migration but how it will handle the future
> > > > > > outgoing one.
> > > > > How do you know where you are going to migrate though?
> > > > > I think you don't.
> > > > > Setting it on source is better since we know where we
> > > > > are migrating from.
> > > > Yes, I don't know where I'm going to migrate to. This is why property
> > > > affects only how source saves state on outgoing migration.
> > > Um. I don't get the logic.
> > 
> > For this feature to work we need an orchestrator to manage the migration,
> > and we generally assume that it is the orchestrator's responsibility to
> > ensure matching properties on source and destination.
> > As the orchestrator manages both sides of the migration, it can set the
> > option (and we can check it) on either source or destination. Right now it
> > doesn't matter which side we select, because the option is essentially a
> > binary allow/deny (though IMHO it is much better to refuse to migrate on
> > the source than to find out later that the state can't be loaded by the
> > destination; in the case of file migration this becomes especially
> > painful).
> > 
> > But there are plans to add an internal migration option (extract the FUSE
> > state from the backend and transfer it in the QEMU migration stream), and
> > that's where setting/checking on the source becomes important, because it
> > will rely on this property to decide whether extra state from the backend
> > needs to be put in the migration stream subsection.
> 
> 
> If we do internal migration that will be a different property
> which has to match on source *and* destination.
> 
> 
> > If you are concerned about the orchestrator breaking the assumption of
> > matching properties on source and destination: this is not really
> > supported AFAIK, but I don't think we need to punish it for this; maybe
> > it has its reasons. I can imagine a scenario where the orchestrator wants
> > to migrate from a source with 'migration=external' to a destination with
> > 'migration=none' to ensure that the destination can't be migrated
> > further.
> 
> No. I am concerned about a simple practical matter:
> - I decide to restart qemu on the same host - so I need to enable
>   migration
> - Later I decide to migrate qemu to another host - this should be
>   blocked
> 
> 
> Property on source does not satisfy both at the same time.
> Property on destination does.


Stefan, what's your take on this? Should we move this from the
save hook to the load hook?

> 
> 
> > > 
> > > 
> > > > > > > > This property selects if VM can migrate and if it can what 
> > > > > > > > should
> > > > > > 

Re: Questions about how block devices use snapshots

2023-02-22 Thread Zhiyong Ye

Hi Kevin,

Thank you for your reply and this method works.

May I ask if this 'image-end-offset' field could be shown in the qemu-img
info output too? It is very useful information whether the qcow2 image is
placed on a file or a block device.


Regards

Zhiyong

On 2/21/23 11:58 PM, Kevin Wolf wrote:

Am 21.02.2023 um 14:27 hat Zhiyong Ye geschrieben:


Hi Kevin,

Sorry to bother you again.

I intend to use this approach for snapshots of block devices, which, as you
say, requires a lot of disk space to store snapshot data. So, to save disk
space, after each successful external snapshot creation, I want to shrink
the block device that stores the backing_file image to the size that qcow2
data actually occupies, since it has become read-only. But there is no way
to get the actual size of qcow2 when it is stored in a block device.

Qemu-img info can easily get the actual size of qcow2 when it is stored in a
file using the fstat function, but this will fail and return 0 for block
devices. Therefore, it is necessary to implement the method of getting data
occupancy inside qcow2. I think there may be two possible ways to do this:

- Add a cluster count field @nb_clusters in the BDRVQcow2State for each new
cluster allocated and the actual size occupied by qcow2 is: nb_clusters *
cluster_size.
- Iterate through the refcount block to find the value with the largest host
offset, and this is the actual size occupied by qcow2.

Since I'm not very familiar with qcow2, may I ask if you have any advice on
getting the actual size when using qcow2?


I think what you need is the 'image-end-offset' field from 'qemu-img
check --output=json'.

Kevin
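For reference, the field Kevin points to can be consumed like this (a
sketch: the 'image-end-offset' key comes from the JSON that `qemu-img
check --output=json` prints; the sample values below are made up):

```python
import json

# Extract 'image-end-offset' -- the highest allocated offset in the
# image -- from `qemu-img check --output=json IMG` output. This works for
# qcow2 on a block device, where fstat()-based size reporting returns 0.
def image_end_offset(check_output: str) -> int:
    return json.loads(check_output)['image-end-offset']

# Hypothetical sample of the check output (values for illustration only).
sample = '{"format": "qcow2", "check-errors": 0, "image-end-offset": 262144}'
print(image_end_offset(sample))  # 262144
```

The returned offset is the size the block device could be shrunk to once
the backing file has become read-only.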





Re: [PATCH] qapi: allow unions to contain further unions

2023-02-22 Thread Markus Armbruster
Daniel P. Berrangé  writes:

> This extends the QAPI schema validation to permit unions inside unions,
> provided the checks for clashing fields pass.
>
> Signed-off-by: Daniel P. Berrangé 
> ---
>
> This patch comes out of the discussion on Het's migration series
> starting at this patch:
>
>   https://lists.gnu.org/archive/html/qemu-devel/2023-02/msg02111.html
>
> Markus had described his desired improved architecture
>
>   https://lists.gnu.org/archive/html/qemu-devel/2023-02/msg02719.html
>
> but I don't think I have enough knowledge of the QAPI code to attempt
> to fuse the handling of structs/unions as mentioned. This patch does
> what looks to be the bare minimum to permit unions in unions, while
> keeping validation checks for clashing fields.
>
> I've not tested beyond the unit tests, but if this is acceptable
> from Markus' POV, I'd expect Het to insert this patch at the
> start of his migration series and thus test it more fully.
>
>  scripts/qapi/schema.py|  6 +--
>  .../union-invalid-union-subfield.err  |  2 +
>  .../union-invalid-union-subfield.json | 27 +
>  .../union-invalid-union-subfield.out  |  0
>  .../union-invalid-union-subtype.err   |  2 +
>  .../union-invalid-union-subtype.json  | 26 +
>  .../union-invalid-union-subtype.out   |  0
>  tests/qapi-schema/union-union-branch.err  |  0
>  tests/qapi-schema/union-union-branch.json | 26 +
>  tests/qapi-schema/union-union-branch.out  | 38 +++
>  10 files changed, 124 insertions(+), 3 deletions(-)
>  create mode 100644 tests/qapi-schema/union-invalid-union-subfield.err
>  create mode 100644 tests/qapi-schema/union-invalid-union-subfield.json
>  create mode 100644 tests/qapi-schema/union-invalid-union-subfield.out
>  create mode 100644 tests/qapi-schema/union-invalid-union-subtype.err
>  create mode 100644 tests/qapi-schema/union-invalid-union-subtype.json
>  create mode 100644 tests/qapi-schema/union-invalid-union-subtype.out
>  create mode 100644 tests/qapi-schema/union-union-branch.err
>  create mode 100644 tests/qapi-schema/union-union-branch.json
>  create mode 100644 tests/qapi-schema/union-union-branch.out
>
> diff --git a/scripts/qapi/schema.py b/scripts/qapi/schema.py
> index cd8661125c..062c6bbb00 100644
> --- a/scripts/qapi/schema.py
> +++ b/scripts/qapi/schema.py
> @@ -465,9 +465,10 @@ def check(self, schema):
>  # on behalf of info, which is not necessarily self.info
>  def check_clash(self, info, seen):
>  assert self._checked
> -assert not self.variants   # not implemented
>  for m in self.members:
>  m.check_clash(info, seen)
> +if self.variants:
> +self.variants.check_clash(info, seen)

Note for later: the .check_clash() methods are responsible for rejecting
clashing members, with an error message of the form "X collides with Y".

Fine print 1: members clash when their names both map to the same C
name.  For instance, 'a-b' collides with 'a_b'.

Fine print 2: the special case of identical keys in a single JSON-ish
object is already rejected by the parser, with an error message of the
form "duplicate key 'KEY'".
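The C-name clash in fine print 1 can be sketched like this (a simplified
stand-in for the real name mangling in scripts/qapi; only the '-' to '_'
mapping is shown):

```python
# Sketch of why 'a-b' collides with 'a_b': QAPI member names are mangled
# into C identifiers, and '-' becomes '_', so both map to the C name 'a_b'.
def c_name_sketch(name: str) -> str:
    return name.replace('-', '_')

seen = {}
for member in ['a-b', 'a_b']:
    key = c_name_sketch(member)
    if key in seen:
        print(f"'{member}' collides with '{seen[key]}'")  # 'a_b' collides with 'a-b'
    else:
        seen[key] = member
```

check_clash() walks members in this fashion, accumulating seen C names and
rejecting any member whose C name is already taken.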

>  
>  def connect_doc(self, doc=None):
>  super().connect_doc(doc)
> @@ -652,8 +653,7 @@ def check(self, schema, seen):
>  self.info,
>  "branch '%s' is not a value of %s"
>  % (v.name, self.tag_member.type.describe()))
> -if (not isinstance(v.type, QAPISchemaObjectType)
> -or v.type.variants):
> +if not isinstance(v.type, QAPISchemaObjectType):
>  raise QAPISemError(
>  self.info,
>  "%s cannot use %s"

This lifts the restriction; an object type's variant type may now have
variants.  Could affect any code that deals with object type members.

Best case: the code just works.

Okay case: the code asserts there are no variants.  This patch needs to
make it work instead.  One known instance: check_clash() above.  I
looked for more, and there are a few "no variants" assertions, but they
are all unrelated.

Worst case: the code doesn't work.  This patch needs to make it work.
No known instances.

Two complementary ways to convince ourselves everything works:
systematic code inspection, systematic tests.

The former looks at every place where we do something with object type
members.  I may try that later.

For systematic tests, we need to understand what can go wrong, and what
needs to work.  I tried to work out a detailed argument, but it didn't
come together.  Best I can do is to simply propose that the additional
variant members of a union's branch may clash with the union's common
members, but not with any other branch's members, and that's all.

We need to test the clash is rejected (negative test), and we need to

Re: [PATCH] hw/smbios: fix field corruption in type 4 table

2023-02-22 Thread Michael S. Tsirkin
On Wed, Feb 22, 2023 at 10:00:49PM +0100, Julia Suvorova wrote:
> Since table type 4 of SMBIOS version 2.6 is shorter than 3.0, the
> strings which follow immediately after the struct fields have been
> overwritten by unconditional filling of later fields such as core_count2.
> Make these fields dependent on the SMBIOS version.
> 
> Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2169904

Could you also add a Fixes tag with the commit that introduces the bug?

> 
> Signed-off-by: Julia Suvorova 
> ---
>  hw/smbios/smbios.c | 8 +---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/smbios/smbios.c b/hw/smbios/smbios.c
> index b4243de735..903fd22350 100644
> --- a/hw/smbios/smbios.c
> +++ b/hw/smbios/smbios.c
> @@ -749,14 +749,16 @@ static void smbios_build_type_4_table(MachineState *ms, 
> unsigned instance)
>  t->core_count = (ms->smp.cores > 255) ? 0xFF : ms->smp.cores;
>  t->core_enabled = t->core_count;
>  
> -t->core_count2 = t->core_enabled2 = cpu_to_le16(ms->smp.cores);
> -
>  t->thread_count = (ms->smp.threads > 255) ? 0xFF : ms->smp.threads;
> -t->thread_count2 = cpu_to_le16(ms->smp.threads);
>  
>  t->processor_characteristics = cpu_to_le16(0x02); /* Unknown */
>  t->processor_family2 = cpu_to_le16(0x01); /* Other */
>  
> +if (smbios_ep_type == SMBIOS_ENTRY_POINT_TYPE_64) {
> +t->core_count2 = t->core_enabled2 = cpu_to_le16(ms->smp.cores);
> +t->thread_count2 = cpu_to_le16(ms->smp.threads);
> +}
> +
>  SMBIOS_BUILD_TABLE_POST;
>  smbios_type4_count++;
>  }
> -- 
> 2.38.1
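The corruption mechanism the commit message describes can be illustrated
with a byte-level sketch (the struct length and field offset below are
made up for illustration; the real values come from the SMBIOS type-4
layouts):

```python
# Sketch: an SMBIOS 2.6 type-4 table is a fixed struct followed directly
# by its string-set. If a 3.0-only field (e.g. core_count2) is written at
# its 3.0 offset unconditionally, it lands on top of the 2.6 strings.
STRUCT_LEN_26 = 40        # hypothetical 2.6 struct length
CORE_COUNT2_OFF = 42      # hypothetical 3.0 offset of core_count2

table = bytearray(b'\x00' * STRUCT_LEN_26 + b'CPU0\x00\x00')  # struct + strings
table[CORE_COUNT2_OFF:CORE_COUNT2_OFF + 2] = (8).to_bytes(2, 'little')
print(table[STRUCT_LEN_26:])  # bytearray(b'CP\x08\x00\x00\x00') -- strings clobbered
```

Guarding the write on the entry point type, as the patch does, leaves the
2.6 string-set untouched.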




Re: [PATCH] hw/i386: fix microvm segfault with virtio cmdline

2023-02-22 Thread Michael S. Tsirkin
didn't read the patch yet but just formatting comments:

On Wed, Feb 22, 2023 at 10:39:10PM -0800, Daniel Hoffman wrote:
> The 'microvm' machine type allows for disabling ACPI, in which case
> the VirtIO device configuration is passed via appending it to the
> kernel cmdline.
> 
> If no cmdline parameter was passed, then a null pointer is dereferenced when
> the new cmdline is copied back. A solution is to always define the cmdline
> in the fw_cfg so the read to append happens before the first write in the
> multiboot case, and to explcitly re-write the value to update the length.

explicitly

> 
> Fixes: eac7a7791b

format is:

Fixes: hash ("subject")

> 
> Signed-off-by: Daniel Hoffman 
> ---
>  hw/i386/microvm.c | 3 ++-
>  hw/i386/x86.c | 4 
>  2 files changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/i386/microvm.c b/hw/i386/microvm.c
> index 29f30dd6d3..be64280530 100644
> --- a/hw/i386/microvm.c
> +++ b/hw/i386/microvm.c
> @@ -417,7 +417,8 @@ static void microvm_fix_kernel_cmdline(MachineState 
> *machine)
>  if (len > VIRTIO_CMDLINE_TOTAL_MAX_LEN + strlen(existing_cmdline)) {
>  fprintf(stderr, "qemu: virtio mmio cmdline too large, skipping\n");
>  } else {
> -memcpy(existing_cmdline, cmdline, len + 1);
> + fw_cfg_modify_i32(x86ms->fw_cfg, FW_CFG_CMDLINE_SIZE, len + 1);
> + fw_cfg_modify_string(x86ms->fw_cfg, FW_CFG_CMDLINE_DATA, cmdline);

Pls use spaces not tabs same as surrounding code.

>  }
>  g_free(cmdline);
>  }
> diff --git a/hw/i386/x86.c b/hw/i386/x86.c
> index eaff4227bd..7dd02b7409 100644
> --- a/hw/i386/x86.c
> +++ b/hw/i386/x86.c
> @@ -827,6 +827,10 @@ void x86_load_linux(X86MachineState *x86ms,
>  /* Make a copy, since we might append arbitrary bytes to it later. */
>  kernel_cmdline = g_strndup(machine->kernel_cmdline, cmdline_size);
>  
> +/* If the cmdline is undefined, set it as an empty allocated value */
> +fw_cfg_add_i32(fw_cfg, FW_CFG_CMDLINE_SIZE, cmdline_size);
> +fw_cfg_add_bytes(fw_cfg, FW_CFG_CMDLINE_DATA, kernel_cmdline, 
> cmdline_size);
> +
>  /* load the kernel header */
>  f = fopen(kernel_filename, "rb");
>  if (!f) {
> -- 
> 2.37.2




[PATCH V2 0/5] Fix UNMAP notifier for intel-iommu

2023-02-22 Thread Jason Wang
Hi All:

According to ATS, the device should work even if ATS is disabled. This is
not correctly implemented in the current intel-iommu since it doesn't
handle the UNMAP notifier correctly. This breaks vhost-net +
vIOMMU without dt.

The root cause is that when there's a device IOTLB miss (note that
it's not specific to PCI so it can work without ATS), QEMU doesn't
build the IOVA tree, so when the guest starts an IOTLB invalidation,
QEMU won't trigger the UNMAP notifier.

Fixing this by triggering UNMAP notifier in those cases.

Thanks

Changes since V1:

- Do not depend on the iova tree for this kind of invalidation but
  simply try to do UNMAP for all attached IOMMU notifiers

Jason Wang (4):
  intel-iommu: fail MAP notifier without caching mode
  intel-iommu: fail DEVIOTLB_UNMAP without dt mode
  memory: introduce memory_region_unmap_iommu_notifier_range()
  smmu: switch to use memory_region_unmap_iommu_notifier_range()

Peter Xu (1):
  intel-iommu: send UNMAP notifications for domain or global inv desc

 hw/arm/smmu-common.c  | 16 +---
 hw/i386/intel_iommu.c | 29 -
 include/exec/memory.h | 10 ++
 softmmu/memory.c  | 13 +
 4 files changed, 48 insertions(+), 20 deletions(-)

-- 
2.25.1




[PATCH V2 1/5] intel-iommu: fail MAP notifier without caching mode

2023-02-22 Thread Jason Wang
Without caching mode, the MAP notifier won't work correctly since the
guest won't send IOTLB update events when it establishes new mappings in
the I/O page tables. Let's fail the IOMMU notifiers early instead of
misbehaving silently.

Reviewed-by: Eric Auger 
Tested-by: Viktor Prutyanov 
Signed-off-by: Jason Wang 
---
 hw/i386/intel_iommu.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 98a5c304a7..0de3e31577 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -3186,6 +3186,13 @@ static int 
vtd_iommu_notify_flag_changed(IOMMUMemoryRegion *iommu,
  "Snoop Control with vhost or VFIO is not supported");
 return -ENOTSUP;
 }
+if (!s->caching_mode && (new & IOMMU_NOTIFIER_MAP)) {
+error_setg_errno(errp, ENOTSUP,
+ "device %02x.%02x.%x requires caching mode",
+ pci_bus_num(vtd_as->bus), PCI_SLOT(vtd_as->devfn),
+ PCI_FUNC(vtd_as->devfn));
+return -ENOTSUP;
+}
 
 /* Update per-address-space notifier flags */
 vtd_as->notifier_flags = new;
-- 
2.25.1




[PATCH V2 3/5] memory: introduce memory_region_unmap_iommu_notifier_range()

2023-02-22 Thread Jason Wang
This patch introduces a new helper to unmap the range of a specific
IOMMU notifier.

Signed-off-by: Jason Wang 
---
 include/exec/memory.h | 10 ++
 softmmu/memory.c  | 13 +
 2 files changed, 23 insertions(+)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 2e602a2fad..6fa0b071f0 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -1731,6 +1731,16 @@ void memory_region_notify_iommu(IOMMUMemoryRegion 
*iommu_mr,
 void memory_region_notify_iommu_one(IOMMUNotifier *notifier,
 IOMMUTLBEvent *event);
 
+/**
+ * memory_region_unmap_iommu_notifier_range: notify an unmap for an IOMMU
+ *   translation that covers the
+ *   range of a notifier
+ *
+ * @notifier: the notifier to be notified
+ */
+void memory_region_unmap_iommu_notifier_range(IOMMUNotifier *n);
+
+
 /**
  * memory_region_register_iommu_notifier: register a notifier for changes to
  * IOMMU translation entries.
diff --git a/softmmu/memory.c b/softmmu/memory.c
index 9d64efca26..ba43b4474e 100644
--- a/softmmu/memory.c
+++ b/softmmu/memory.c
@@ -1996,6 +1996,19 @@ void memory_region_notify_iommu_one(IOMMUNotifier 
*notifier,
 }
 }
 
+void memory_region_unmap_iommu_notifier_range(IOMMUNotifier *n)
+{
+IOMMUTLBEvent event;
+
+event.type = IOMMU_NOTIFIER_UNMAP;
+event.entry.target_as = &address_space_memory;
+event.entry.iova = n->start;
+event.entry.perm = IOMMU_NONE;
+event.entry.addr_mask = n->end - n->start;
+
+memory_region_notify_iommu_one(n, &event);
+}
+
 void memory_region_notify_iommu(IOMMUMemoryRegion *iommu_mr,
 int iommu_idx,
 IOMMUTLBEvent event)
-- 
2.25.1




[PATCH V2 5/5] intel-iommu: send UNMAP notifications for domain or global inv desc

2023-02-22 Thread Jason Wang
From: Peter Xu 

We don't send an UNMAP notification upon domain or global invalidation,
which means the notifier can't work correctly. One example is using
vhost remote IOTLB without enabling device IOTLB.

Fixing this by sending UNMAP notification.

Signed-off-by: Peter Xu 
Signed-off-by: Jason Wang 
---
 hw/i386/intel_iommu.c | 14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index f006fa6031..a62896759c 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -1530,13 +1530,17 @@ static int 
vtd_sync_shadow_page_table_range(VTDAddressSpace *vtd_as,
 return vtd_page_walk(s, ce, addr, addr + size, &info, vtd_as->pasid);
 }
 
-static int vtd_sync_shadow_page_table(VTDAddressSpace *vtd_as)
+static int vtd_address_space_sync(VTDAddressSpace *vtd_as)
 {
 int ret;
 VTDContextEntry ce;
 IOMMUNotifier *n;
 
-if (!(vtd_as->iommu.iommu_notify_flags & IOMMU_NOTIFIER_IOTLB_EVENTS)) {
+/* If no MAP notifier registered, we simply invalidate all the cache */
+if (!vtd_as_has_map_notifier(vtd_as)) {
+IOMMU_NOTIFIER_FOREACH(n, &vtd_as->iommu) {
+memory_region_unmap_iommu_notifier_range(n);
+}
 return 0;
 }
 
@@ -2000,7 +2004,7 @@ static void vtd_iommu_replay_all(IntelIOMMUState *s)
 VTDAddressSpace *vtd_as;
 
 QLIST_FOREACH(vtd_as, &s->vtd_as_with_notifiers, next) {
-vtd_sync_shadow_page_table(vtd_as);
+vtd_address_space_sync(vtd_as);
 }
 }
 
@@ -2082,7 +2086,7 @@ static void vtd_context_device_invalidate(IntelIOMMUState 
*s,
  * framework will skip MAP notifications if that
  * happened.
  */
-vtd_sync_shadow_page_table(vtd_as);
+vtd_address_space_sync(vtd_as);
 }
 }
 }
@@ -2140,7 +2144,7 @@ static void vtd_iotlb_domain_invalidate(IntelIOMMUState 
*s, uint16_t domain_id)
 if (!vtd_dev_to_context_entry(s, pci_bus_num(vtd_as->bus),
   vtd_as->devfn, &ce) &&
 domain_id == vtd_get_domain_id(s, &ce, vtd_as->pasid)) {
-vtd_sync_shadow_page_table(vtd_as);
+vtd_address_space_sync(vtd_as);
 }
 }
 }
-- 
2.25.1




[PATCH V2 4/5] smmu: switch to use memory_region_unmap_iommu_notifier_range()

2023-02-22 Thread Jason Wang
Signed-off-by: Jason Wang 
---
 hw/arm/smmu-common.c | 16 +---
 1 file changed, 1 insertion(+), 15 deletions(-)

diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
index 733c964778..5e2847d511 100644
--- a/hw/arm/smmu-common.c
+++ b/hw/arm/smmu-common.c
@@ -467,20 +467,6 @@ IOMMUMemoryRegion *smmu_iommu_mr(SMMUState *s, uint32_t 
sid)
 return NULL;
 }
 
-/* Unmap the whole notifier's range */
-static void smmu_unmap_notifier_range(IOMMUNotifier *n)
-{
-IOMMUTLBEvent event;
-
-event.type = IOMMU_NOTIFIER_UNMAP;
-event.entry.target_as = &address_space_memory;
-event.entry.iova = n->start;
-event.entry.perm = IOMMU_NONE;
-event.entry.addr_mask = n->end - n->start;
-
-memory_region_notify_iommu_one(n, &event);
-}
-
 /* Unmap all notifiers attached to @mr */
 static void smmu_inv_notifiers_mr(IOMMUMemoryRegion *mr)
 {
@@ -488,7 +474,7 @@ static void smmu_inv_notifiers_mr(IOMMUMemoryRegion *mr)
 
 trace_smmu_inv_notifiers_mr(mr->parent_obj.name);
 IOMMU_NOTIFIER_FOREACH(n, mr) {
-smmu_unmap_notifier_range(n);
+memory_region_unmap_iommu_notifier_range(n);
 }
 }
 
-- 
2.25.1




[PATCH V2 2/5] intel-iommu: fail DEVIOTLB_UNMAP without dt mode

2023-02-22 Thread Jason Wang
Without dt mode, the device IOTLB notifier won't work since the guest
won't send device IOTLB invalidation descriptors in this case. Let's fail
early instead of misbehaving silently.

Reviewed-by: Laurent Vivier 
Tested-by: Laurent Vivier 
Tested-by: Viktor Prutyanov 
Buglink: https://bugzilla.redhat.com/2156876
Signed-off-by: Jason Wang 
---
 hw/i386/intel_iommu.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 0de3e31577..f006fa6031 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -3179,6 +3179,7 @@ static int 
vtd_iommu_notify_flag_changed(IOMMUMemoryRegion *iommu,
 {
 VTDAddressSpace *vtd_as = container_of(iommu, VTDAddressSpace, iommu);
 IntelIOMMUState *s = vtd_as->iommu_state;
+X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s);
 
 /* TODO: add support for VFIO and vhost users */
 if (s->snoop_control) {
@@ -3193,6 +3194,13 @@ static int 
vtd_iommu_notify_flag_changed(IOMMUMemoryRegion *iommu,
  PCI_FUNC(vtd_as->devfn));
 return -ENOTSUP;
 }
+if (!x86_iommu->dt_supported && (new & IOMMU_NOTIFIER_DEVIOTLB_UNMAP)) {
+error_setg_errno(errp, ENOTSUP,
+ "device %02x.%02x.%x requires device IOTLB mode",
+ pci_bus_num(vtd_as->bus), PCI_SLOT(vtd_as->devfn),
+ PCI_FUNC(vtd_as->devfn));
+return -ENOTSUP;
+}
 
 /* Update per-address-space notifier flags */
 vtd_as->notifier_flags = new;
-- 
2.25.1




Re: [PATCH v5 8/8] hw/mem/cxl_type3: Add CXL RAS Error Injection Support.

2023-02-22 Thread Thomas Huth

On 22/02/2023 19.16, Philippe Mathieu-Daudé wrote:

+Thomas (meson) & Marc-André (conditional QAPI)


+ Markus


On 22/2/23 17:49, Jonathan Cameron wrote:
+# Type of uncorrectable CXL error to inject. These errors are reported via
+# an AER uncorrectable internal error with additional information logged at
+# the CXL device.
+#
+# @cache-data-parity: Data error such as data parity or data ECC error CXL.cache
+# @cache-address-parity: Address parity or other errors associated with the
+#    address field on CXL.cache
+# @cache-be-parity: Byte enable parity or other byte enable errors on CXL.cache
+# @cache-data-ecc: ECC error on CXL.cache
+# @mem-data-parity: Data error such as data parity or data ECC error on CXL.mem
+# @mem-address-parity: Address parity or other errors associated with the
+#  address field on CXL.mem
+# @mem-be-parity: Byte enable parity or other byte enable errors on CXL.mem.
+# @mem-data-ecc: Data ECC error on CXL.mem.
+# @reinit-threshold: REINIT threshold hit.
+# @rsvd-encoding: Received unrecognized encoding.
+# @poison-received: Received poison from the peer.
+# @receiver-overflow: Buffer overflows (first 3 bits of header log indicate which)
+# @internal: Component specific error
+# @cxl-ide-tx: Integrity and data encryption tx error.
+# @cxl-ide-rx: Integrity and data encryption rx error.
+##
+
+{ 'enum': 'CxlUncorErrorType',


Don't these need

    'if': 'CONFIG_CXL_MEM_DEVICE',

?


If I make this change I get a bunch of

./qapi/qapi-types-cxl.h:18:13: error: attempt to use poisoned "CONFIG_CXL_MEM_DEVICE"

 18 | #if defined(CONFIG_CXL_MEM_DEVICE)


Err, I meant the generic CONFIG_CXL, not CONFIG_CXL_MEM_DEVICE.


It's a target specific define (I think) as it's built alongside PCI_EXPRESS.
Only CXL_ACPI is specifically included by x86 and arm64 (out of tree)

To be honest though I don't fully understand the QEMU build system, so my
explanation of the error might be wrong.


You need to restrict to system emulation (the 'have_system' check):


This doesn't help - still have
attempt to use poisoned "CONFIG_CXL"


Not sure how the QAPI generator works, but target specific config switches 
can only be used in target specific json files there, so that's 
machine-target.json and misc-target.json currently, as far as I know. Not 
sure how the QAPI generator distinguishes between common and target specific 
code, though ... just by the "-target" suffix? Maybe Markus or Marc-André 
can comment on that.


See also:

 https://lists.gnu.org/archive/html/qemu-devel/2023-02/msg01885.html
 https://lists.gnu.org/archive/html/qemu-devel/2023-02/msg02001.html

 Thomas




Re: [PATCH] hw/virtio: added virtio-serial test cases

2023-02-22 Thread Dan Hoffman
Is there interest in this?


On Fri, Nov 11, 2022 at 10:33 PM Daniel Hoffman  wrote:
>
> The previous test cases for virtio-serial only tested initialization of
> the device. I've included four new test cases: rx for virtconsole, tx
> for virtconsole, rx for virtserialport, tx for virtserialport. It
> follows the general pattern of virtio-net (i.e. chardev file descriptor
> backend with a socketpair connected via fork-exec).
>
> Signed-off-by: Daniel Hoffman 
> ---
>  tests/qtest/libqos/virtio-serial.c |  51 +
>  tests/qtest/libqos/virtio-serial.h |   2 +
>  tests/qtest/virtio-serial-test.c   | 177 -
>  3 files changed, 228 insertions(+), 2 deletions(-)
>
> diff --git a/tests/qtest/libqos/virtio-serial.c 
> b/tests/qtest/libqos/virtio-serial.c
> index 1d689c3e38..8723bffe1b 100644
> --- a/tests/qtest/libqos/virtio-serial.c
> +++ b/tests/qtest/libqos/virtio-serial.c
> @@ -22,6 +22,10 @@
>  #include "qgraph.h"
>  #include "virtio-serial.h"
>
> +#include "qemu/iov.h"
> +
> +static QGuestAllocator *alloc;
> +
>  static void *qvirtio_serial_get_driver(QVirtioSerial *v_serial,
> const char *interface)
>  {
> @@ -43,6 +47,33 @@ static void *qvirtio_serial_device_get_driver(void *object,
>  return qvirtio_serial_get_driver(&v_serial->serial, interface);
>  }
>
> +static void virtio_serial_setup(QVirtioSerial *interface)
> +{
> +QVirtioDevice *vdev = interface->vdev;
> +qvirtio_set_features(vdev, (1ULL << 1) | (1ULL << 32));
> +
> +interface->n_queues = 6;
> +interface->queues = g_new(QVirtQueue*, interface->n_queues);
> +
> +for (int i = 0; i < interface->n_queues; i++) {
> +interface->queues[i] = qvirtqueue_setup(interface->vdev, alloc, i);
> +}
> +
> +qvirtio_set_driver_ok(vdev);
> +}
> +
> +static void qvirtio_serial_device_destructor(QOSGraphObject *obj)
> +{
> +}
> +
> +static void qvirtio_serial_device_start_hw(QOSGraphObject *obj)
> +{
> +QVirtioSerialDevice *v_serial = (QVirtioSerialDevice *)obj;
> +QVirtioSerial *interface = &v_serial->serial;
> +
> +virtio_serial_setup(interface);
> +}
> +
>  static void *virtio_serial_device_create(void *virtio_dev,
>   QGuestAllocator *t_alloc,
>   void *addr)
> @@ -51,13 +82,30 @@ static void *virtio_serial_device_create(void *virtio_dev,
>  QVirtioSerial *interface = &virtio_device->serial;
>
>  interface->vdev = virtio_dev;
> +alloc = t_alloc;
>
> +virtio_device->obj.destructor = qvirtio_serial_device_destructor;
> +virtio_device->obj.start_hw = qvirtio_serial_device_start_hw;
>  virtio_device->obj.get_driver = qvirtio_serial_device_get_driver;
>
>  return &virtio_device->obj;
>  }
>
>  /* virtio-serial-pci */
> +static void qvirtio_serial_pci_destructor(QOSGraphObject *obj)
> +{
> +}
> +
> +static void qvirtio_serial_pci_start_hw(QOSGraphObject *obj)
> +{
> +QVirtioSerialPCI *v_serial = (QVirtioSerialPCI *) obj;
> +QVirtioSerial *interface = &v_serial->serial;
> +QOSGraphObject *pci_vobj = &v_serial->pci_vdev.obj;
> +
> +qvirtio_pci_start_hw(pci_vobj);
> +virtio_serial_setup(interface);
> +}
> +
>  static void *qvirtio_serial_pci_get_driver(void *object, const char 
> *interface)
>  {
>  QVirtioSerialPCI *v_serial = object;
> @@ -76,7 +124,10 @@ static void *virtio_serial_pci_create(void *pci_bus, 
> QGuestAllocator *t_alloc,
>
>  virtio_pci_init(&virtio_spci->pci_vdev, pci_bus, addr);
>  interface->vdev = &virtio_spci->pci_vdev.vdev;
> +alloc = t_alloc;
>
> +obj->destructor = qvirtio_serial_pci_destructor;
> +obj->start_hw = qvirtio_serial_pci_start_hw;
>  obj->get_driver = qvirtio_serial_pci_get_driver;
>
>  return obj;
> diff --git a/tests/qtest/libqos/virtio-serial.h 
> b/tests/qtest/libqos/virtio-serial.h
> index 3db43b2bb8..ce6ae164cb 100644
> --- a/tests/qtest/libqos/virtio-serial.h
> +++ b/tests/qtest/libqos/virtio-serial.h
> @@ -29,6 +29,8 @@ typedef struct QVirtioSerialDevice QVirtioSerialDevice;
>
>  struct QVirtioSerial {
>  QVirtioDevice *vdev;
> +int n_queues;
> +QVirtQueue **queues;
>  };
>
>  struct QVirtioSerialPCI {
> diff --git a/tests/qtest/virtio-serial-test.c 
> b/tests/qtest/virtio-serial-test.c
> index 2541034822..190075d6f5 100644
> --- a/tests/qtest/virtio-serial-test.c
> +++ b/tests/qtest/virtio-serial-test.c
> @@ -11,6 +11,36 @@
>  #include "libqtest-single.h"
>  #include "qemu/module.h"
>  #include "libqos/virtio-serial.h"
> +#include "standard-headers/linux/virtio_console.h"
> +#include "qemu/iov.h"
> +
> +static void virtio_serial_test_cleanup(void *sockets)
> +{
> +int *sv = sockets;
> +
> +close(sv[0]);
> +qos_invalidate_command_line();
> +close(sv[1]);
> +g_free(sv);
> +}
> +
> +static void *virtio_serial_test_setup(GString *cmd_line, void *arg)
> +{
> +int ret;
> +int *sv = g_new(int, 3);
> +
> +ret = soc

[PATCH] hw/i386: fix microvm segfault with virtio cmdline

2023-02-22 Thread Daniel Hoffman
The 'microvm' machine type allows for disabling ACPI, in which case
the VirtIO device configuration is passed via appending it to the
kernel cmdline.

If no cmdline parameter was passed, then a null pointer is dereferenced when
the new cmdline is copied back. A solution is to always define the cmdline
in the fw_cfg so the read to append happens before the first write in the
multiboot case, and to explicitly re-write the value to update the length.

Fixes: eac7a7791b

Signed-off-by: Daniel Hoffman 
---
 hw/i386/microvm.c | 3 ++-
 hw/i386/x86.c | 4 
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/hw/i386/microvm.c b/hw/i386/microvm.c
index 29f30dd6d3..be64280530 100644
--- a/hw/i386/microvm.c
+++ b/hw/i386/microvm.c
@@ -417,7 +417,8 @@ static void microvm_fix_kernel_cmdline(MachineState 
*machine)
 if (len > VIRTIO_CMDLINE_TOTAL_MAX_LEN + strlen(existing_cmdline)) {
 fprintf(stderr, "qemu: virtio mmio cmdline too large, skipping\n");
 } else {
-memcpy(existing_cmdline, cmdline, len + 1);
+   fw_cfg_modify_i32(x86ms->fw_cfg, FW_CFG_CMDLINE_SIZE, len + 1);
+   fw_cfg_modify_string(x86ms->fw_cfg, FW_CFG_CMDLINE_DATA, cmdline);
 }
 g_free(cmdline);
 }
diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index eaff4227bd..7dd02b7409 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -827,6 +827,10 @@ void x86_load_linux(X86MachineState *x86ms,
 /* Make a copy, since we might append arbitrary bytes to it later. */
 kernel_cmdline = g_strndup(machine->kernel_cmdline, cmdline_size);
 
+/* If the cmdline is undefined, set it as an empty allocated value */
+fw_cfg_add_i32(fw_cfg, FW_CFG_CMDLINE_SIZE, cmdline_size);
+fw_cfg_add_bytes(fw_cfg, FW_CFG_CMDLINE_DATA, kernel_cmdline, 
cmdline_size);
+
 /* load the kernel header */
 f = fopen(kernel_filename, "rb");
 if (!f) {
-- 
2.37.2




Re: [PATCH v7 03/10] target/riscv: allow MISA writes as experimental

2023-02-22 Thread Andrew Jones
On Wed, Feb 22, 2023 at 03:51:58PM -0300, Daniel Henrique Barboza wrote:
> At this moment, and apparently since ever, we have no way of enabling
> RISCV_FEATURE_MISA. This means that all the code from write_misa(), all
> the nuts and bolts that handles how to properly write this CSR, has
> always been a no-op as well because write_misa() will always exit
> earlier.
> 
> This seems to be benign in the majority of cases. Booting an Ubuntu
> 'virt' guest and logging all the calls to 'write_misa' shows that no
> writes to MISA CSR was attempted. Writing MISA, i.e. enabling/disabling
> RISC-V extensions after the machine is powered on, seems to be a niche
> use.
> 
> After discussions in the mailing list, most notably in [1], we reached
> the consensus that this code is not suited to be exposed to users
> because it's not well tested, but at the same time removing it is a bit
> extreme because we would like to fix it, and it's easier to do so with
> the code available to use instead of fetching it from git log.
> 
> The approach taken here is to get rid of RISCV_FEATURE_MISA altogether
> and use a new experimental flag called x-misa-w. The default value is
> false, meaning that we're keeping the existing behavior of doing nothing
> if a write_misa() is attempted. As with any existing experimental flag,
> x-misa-w is also a temporary flag that we need to remove once we fix
> write_misa().
> 
> [1] https://lists.gnu.org/archive/html/qemu-devel/2023-02/msg05092.html
> 
> Signed-off-by: Daniel Henrique Barboza 
> ---
>  target/riscv/cpu.c | 6 ++
>  target/riscv/cpu.h | 2 +-
>  target/riscv/csr.c | 2 +-
>  3 files changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> index 93b52b826c..1d637b1acd 100644
> --- a/target/riscv/cpu.c
> +++ b/target/riscv/cpu.c
> @@ -1210,6 +1210,12 @@ static Property riscv_cpu_properties[] = {
>  
>  DEFINE_PROP_BOOL("rvv_ta_all_1s", RISCVCPU, cfg.rvv_ta_all_1s, false),
>  DEFINE_PROP_BOOL("rvv_ma_all_1s", RISCVCPU, cfg.rvv_ma_all_1s, false),
> +
> +/*
> + * write_misa() is marked as experimental for now so mark
> + * it with -x and default to 'false'.
> + */
> +DEFINE_PROP_BOOL("x-misa-w", RISCVCPU, cfg.misa_w, false),
>  DEFINE_PROP_END_OF_LIST(),
>  };
>  
> diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
> index 215423499e..9d3304bcda 100644
> --- a/target/riscv/cpu.h
> +++ b/target/riscv/cpu.h
> @@ -89,7 +89,6 @@ enum {
>  RISCV_FEATURE_MMU,
>  RISCV_FEATURE_PMP,
>  RISCV_FEATURE_EPMP,
> -RISCV_FEATURE_MISA,
>  RISCV_FEATURE_DEBUG
>  };
>  
> @@ -498,6 +497,7 @@ struct RISCVCPUConfig {
>  bool pmp;
>  bool epmp;
>  bool debug;
> +bool misa_w;
>  
>  bool short_isa_string;
>  };
> diff --git a/target/riscv/csr.c b/target/riscv/csr.c
> index e149b453da..3cb8d2ffad 100644
> --- a/target/riscv/csr.c
> +++ b/target/riscv/csr.c
> @@ -1329,7 +1329,7 @@ static RISCVException read_misa(CPURISCVState *env, int 
> csrno,
>  static RISCVException write_misa(CPURISCVState *env, int csrno,
>   target_ulong val)
>  {
> -if (!riscv_feature(env, RISCV_FEATURE_MISA)) {
> +if (!riscv_cpu_cfg(env)->misa_w) {
>  /* drop write to misa */
>  return RISCV_EXCP_NONE;
>  }
> -- 
> 2.39.2
> 
>

Reviewed-by: Andrew Jones 



Re: [PATCH v3 5/6] meson: prefer 'sphinx-build' to 'sphinx-build-3'

2023-02-22 Thread Markus Armbruster
John Snow  writes:

> On Wed, Feb 22, 2023 at 2:15 AM Markus Armbruster  wrote:
>>
>> John Snow  writes:
>>
>> > On Tue, Feb 21, 2023, 1:50 AM Markus Armbruster  wrote:
>> >
>> >> John Snow  writes:
>> >>
>> >> > Once upon a time, "sphinx-build" on certain RPM platforms invoked
>> >> > specifically a Python 2.x version, while "sphinx-build-3" was a distro
>> >> > shim for the Python 3.x version.
>> >> >
>> >> > These days, none of our supported platforms utilize a 2.x version, so it
>> >> > should be safe to search for 'sphinx-build' prior to 'sphinx-build-3',
>> >> > which will prefer pip/venv installed versions of sphinx if they're
>> >> > available.
>> >> >
>> >> > This adds an extremely convenient ability to test document building
>> >> > ability in QEMU across multiple versions of Sphinx for the purposes of
>> >> > compatibility testing.
>> >> >
>> >> > Signed-off-by: John Snow 
>> >> > ---
>> >> >  docs/meson.build | 2 +-
>> >> >  1 file changed, 1 insertion(+), 1 deletion(-)
>> >> >
>> >> > diff --git a/docs/meson.build b/docs/meson.build
>> >> > index 9136fed3b73..906034f9a87 100644
>> >> > --- a/docs/meson.build
>> >> > +++ b/docs/meson.build
>> >> > @@ -1,5 +1,5 @@
>> >> >  if get_option('sphinx_build') == ''
>> >> > -  sphinx_build = find_program(['sphinx-build-3', 'sphinx-build'],
>> >> > +  sphinx_build = find_program(['sphinx-build', 'sphinx-build-3'],
>> >> >required: get_option('docs'))
>> >> >  else
>> >> >sphinx_build = find_program(get_option('sphinx_build'),
>> >>
>> >> Do we still need to check for sphinx-build-3?  Or asked differently, is
>> >> there any supported build host that provides only sphinx-build-3?
>> >>
>> >
>> > Yes, modern Fedora still uses "sphinx-build-3" as the name in /usr/bin for
>> > the rpm-packaged version of sphinx.
>>
>> For what it's worth, python3-sphinx-5.0.2-2.fc37.noarch provides
>>
>> /usr/bin/sphinx-build
>> /usr/bin/sphinx-build-3
>> /usr/bin/sphinx-build-3.11
>>
>> where the latter two are symbolic links to the first.  No need to check
>> for sphinx-build-3 here.
>
> Oh, I see. I guess it should be fine, but only if we explicitly drop
> support for the 3.6 version that comes with CentOS. I'm not entirely
> sure if "sphinx-build-3" is used anywhere else, I *think* it's just an
> rpm-ism.

I can see just two reasons for trying sphinx-build-3:

1. sphinx-build does not exist.

2. sphinx-build exists, but uses Python 2, which doesn't work with our
   Sphinx extension.

The commit message seems to claim it's not 2.

So, what is it?




Re: [PATCH v3 5/6] meson: prefer 'sphinx-build' to 'sphinx-build-3'

2023-02-22 Thread John Snow
On Wed, Feb 22, 2023 at 2:15 AM Markus Armbruster  wrote:
>
> John Snow  writes:
>
> > On Tue, Feb 21, 2023, 1:50 AM Markus Armbruster  wrote:
> >
> >> John Snow  writes:
> >>
> >> > Once upon a time, "sphinx-build" on certain RPM platforms invoked
> >> > specifically a Python 2.x version, while "sphinx-build-3" was a distro
> >> > shim for the Python 3.x version.
> >> >
> >> > These days, none of our supported platforms utilize a 2.x version, so it
> >> > should be safe to search for 'sphinx-build' prior to 'sphinx-build-3',
> >> > which will prefer pip/venv installed versions of sphinx if they're
> >> > available.
> >> >
> >> > This adds an extremely convenient ability to test document building
> >> > ability in QEMU across multiple versions of Sphinx for the purposes of
> >> > compatibility testing.
> >> >
> >> > Signed-off-by: John Snow 
> >> > ---
> >> >  docs/meson.build | 2 +-
> >> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >> >
> >> > diff --git a/docs/meson.build b/docs/meson.build
> >> > index 9136fed3b73..906034f9a87 100644
> >> > --- a/docs/meson.build
> >> > +++ b/docs/meson.build
> >> > @@ -1,5 +1,5 @@
> >> >  if get_option('sphinx_build') == ''
> >> > -  sphinx_build = find_program(['sphinx-build-3', 'sphinx-build'],
> >> > +  sphinx_build = find_program(['sphinx-build', 'sphinx-build-3'],
> >> >required: get_option('docs'))
> >> >  else
> >> >sphinx_build = find_program(get_option('sphinx_build'),
> >>
> >> Do we still need to check for sphinx-build-3?  Or asked differently, is
> >> there any supported build host that provides only sphinx-build-3?
> >>
> >
> > Yes, modern Fedora still uses "sphinx-build-3" as the name in /usr/bin for
> > the rpm-packaged version of sphinx.
>
> For what it's worth, python3-sphinx-5.0.2-2.fc37.noarch provides
>
> /usr/bin/sphinx-build
> /usr/bin/sphinx-build-3
> /usr/bin/sphinx-build-3.11
>
> where the latter two are symbolic links to the first.  No need to check
> for sphinx-build-3 here.

Oh, I see. I guess it should be fine, but only if we explicitly drop
support for the 3.6 version that comes with CentOS. I'm not entirely
sure if "sphinx-build-3" is used anywhere else, I *think* it's just an
rpm-ism.




[PULL 0/2] Python patches

2023-02-22 Thread John Snow
The following changes since commit 79b677d658d3d35e1e776826ac4abb28cdce69b8:

  Merge tag 'net-pull-request' of https://github.com/jasowang/qemu into staging 
(2023-02-21 11:28:31 +)

are available in the Git repository at:

  https://gitlab.com/jsnow/qemu.git tags/python-pull-request

for you to fetch changes up to 6832189fd791622c30e7bbe3a12b76be14dc1158:

  python: drop pipenv (2023-02-22 23:35:03 -0500)


Python

Only minor testing updates.



John Snow (2):
  python: support pylint 2.16
  python: drop pipenv

 python/README.rst |   3 -
 .gitlab-ci.d/static_checks.yml|   4 +-
 python/.gitignore |   4 +-
 python/Makefile   |  53 ++-
 python/Pipfile|  13 -
 python/Pipfile.lock   | 347 --
 python/qemu/qmp/protocol.py   |   2 +-
 python/qemu/qmp/qmp_client.py |   2 +-
 python/qemu/utils/qemu_ga_client.py   |   6 +-
 python/setup.cfg  |   4 +-
 python/tests/minreqs.txt  |  45 +++
 tests/docker/dockerfiles/python.docker|   1 -
 tests/qemu-iotests/iotests.py |   4 +-
 .../tests/migrate-bitmaps-postcopy-test   |   2 +-
 14 files changed, 94 insertions(+), 396 deletions(-)
 delete mode 100644 python/Pipfile
 delete mode 100644 python/Pipfile.lock
 create mode 100644 python/tests/minreqs.txt

-- 
2.39.0





[PULL 2/2] python: drop pipenv

2023-02-22 Thread John Snow
The pipenv tool was nice in theory, but in practice it's just too hard
to update selectively, and it makes using it a pain. The qemu.qmp repo
dropped pipenv support a while back and it's been functioning just fine,
so I'm backporting that change here to qemu.git.

Signed-off-by: John Snow 
Message-id: 20230210003147.1309376-3-js...@redhat.com
Signed-off-by: John Snow 
---
 python/README.rst  |   3 -
 .gitlab-ci.d/static_checks.yml |   4 +-
 python/.gitignore  |   4 +-
 python/Makefile|  53 ++--
 python/Pipfile |  13 -
 python/Pipfile.lock| 347 -
 python/setup.cfg   |   4 +-
 python/tests/minreqs.txt   |  45 
 tests/docker/dockerfiles/python.docker |   1 -
 9 files changed, 86 insertions(+), 388 deletions(-)
 delete mode 100644 python/Pipfile
 delete mode 100644 python/Pipfile.lock
 create mode 100644 python/tests/minreqs.txt

diff --git a/python/README.rst b/python/README.rst
index 9c1fceaee73..d62e71528d2 100644
--- a/python/README.rst
+++ b/python/README.rst
@@ -77,9 +77,6 @@ Files in this directory
 - ``MANIFEST.in`` is read by python setuptools, it specifies additional files
   that should be included by a source distribution.
 - ``PACKAGE.rst`` is used as the README file that is visible on PyPI.org.
-- ``Pipfile`` is used by Pipenv to generate ``Pipfile.lock``.
-- ``Pipfile.lock`` is a set of pinned package dependencies that this package
-  is tested under in our CI suite. It is used by ``make check-pipenv``.
 - ``README.rst`` you are here!
 - ``VERSION`` contains the PEP-440 compliant version used to describe
   this package; it is referenced by ``setup.cfg``.
diff --git a/.gitlab-ci.d/static_checks.yml b/.gitlab-ci.d/static_checks.yml
index 289ad1359e3..b4cbdbce2ab 100644
--- a/.gitlab-ci.d/static_checks.yml
+++ b/.gitlab-ci.d/static_checks.yml
@@ -23,12 +23,12 @@ check-dco:
   before_script:
 - apk -U add git
 
-check-python-pipenv:
+check-python-minreqs:
   extends: .base_job_template
   stage: test
   image: $CI_REGISTRY_IMAGE/qemu/python:latest
   script:
-- make -C python check-pipenv
+- make -C python check-minreqs
   variables:
 GIT_DEPTH: 1
   needs:
diff --git a/python/.gitignore b/python/.gitignore
index 904f324bb11..c3ceb1ca0ab 100644
--- a/python/.gitignore
+++ b/python/.gitignore
@@ -11,8 +11,8 @@ qemu.egg-info/
 .idea/
 .vscode/
 
-# virtual environments (pipenv et al)
-.venv/
+# virtual environments
+.min-venv/
 .tox/
 .dev-venv/
 
diff --git a/python/Makefile b/python/Makefile
index b170708398a..c5bd6ff83ac 100644
--- a/python/Makefile
+++ b/python/Makefile
@@ -1,15 +1,16 @@
 QEMU_VENV_DIR=.dev-venv
+QEMU_MINVENV_DIR=.min-venv
 QEMU_TOX_EXTRA_ARGS ?=
 
 .PHONY: help
 help:
@echo "python packaging help:"
@echo ""
-   @echo "make check-pipenv:"
-   @echo "Run tests in pipenv's virtual environment."
+   @echo "make check-minreqs:"
+   @echo "Run tests in the minreqs virtual environment."
@echo "These tests use the oldest dependencies."
-   @echo "Requires: Python 3.6 and pipenv."
-   @echo "Hint (Fedora): 'sudo dnf install python3.6 pipenv'"
+   @echo "Requires: Python 3.6"
+   @echo "Hint (Fedora): 'sudo dnf install python3.6'"
@echo ""
@echo "make check-tox:"
@echo "Run tests against multiple python versions."
@@ -33,8 +34,8 @@ help:
@echo "and install the qemu package in editable mode."
@echo "(Can be used in or outside of a venv.)"
@echo ""
-   @echo "make pipenv"
-   @echo "Creates pipenv's virtual environment (.venv)"
+   @echo "make min-venv"
+   @echo "Creates the minreqs virtual environment 
($(QEMU_MINVENV_DIR))"
@echo ""
@echo "make dev-venv"
@echo "Creates a simple venv for check-dev. ($(QEMU_VENV_DIR))"
@@ -43,21 +44,38 @@ help:
@echo "Remove package build output."
@echo ""
@echo "make distclean:"
-   @echo "remove pipenv/venv files, qemu package forwarder,"
+   @echo "remove venv files, qemu package forwarder,"
@echo "built distribution files, and everything from 'make clean'."
@echo ""
@echo -e "Have a nice day ^_^\n"
 
-.PHONY: pipenv
-pipenv: .venv
-.venv: Pipfile.lock
-   @PIPENV_VENV_IN_PROJECT=1 pipenv sync --dev --keep-outdated
-   rm -f pyproject.toml
-   @touch .venv
+.PHONY: pipenv check-pipenv
+pipenv check-pipenv:
+   @echo "pipenv was dropped; try 'make check-minreqs' or 'make min-venv'"
+   @exit 1
 
-.PHONY: check-pipenv
-check-pipenv: pipenv
-   @pipenv run make check
+.PHONY: min-venv
+min-venv: $(QEMU_MINVENV_DIR) $(QEMU_MINVENV_DIR)/bin/activate
+$(QEMU_MINVENV_DIR) $(QEMU_MINVENV_DIR)/bin/activate: setup.cfg 
tests/minreqs.txt
+   @echo "VENV $(QEMU_MINVENV_DIR)"
+   @py

[PULL 1/2] python: support pylint 2.16

2023-02-22 Thread John Snow
Pylint 2.16 adds a few new checks that cause the optional check-tox CI
job to fail.

1. The superfluous-parens check seems to be a bit more aggressive,
2. broad-exception-raised is new; it discourages "raise Exception".

Fix these minor issues and turn the lights green.

Signed-off-by: John Snow 
Reviewed-by: Philippe Mathieu-Daudé 
Reviewed-by: Beraldo Leal 
Message-id: 20230210003147.1309376-2-js...@redhat.com
Signed-off-by: John Snow 
---
 python/qemu/qmp/protocol.py| 2 +-
 python/qemu/qmp/qmp_client.py  | 2 +-
 python/qemu/utils/qemu_ga_client.py| 6 +++---
 tests/qemu-iotests/iotests.py  | 4 ++--
 tests/qemu-iotests/tests/migrate-bitmaps-postcopy-test | 2 +-
 5 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/python/qemu/qmp/protocol.py b/python/qemu/qmp/protocol.py
index 6d3d739daa7..22e60298d28 100644
--- a/python/qemu/qmp/protocol.py
+++ b/python/qemu/qmp/protocol.py
@@ -207,7 +207,7 @@ class AsyncProtocol(Generic[T]):
 logger = logging.getLogger(__name__)
 
 # Maximum allowable size of read buffer
-_limit = (64 * 1024)
+_limit = 64 * 1024
 
 # -
 # Section: Public interface
diff --git a/python/qemu/qmp/qmp_client.py b/python/qemu/qmp/qmp_client.py
index b5772e7f32b..9d73ae6e7ad 100644
--- a/python/qemu/qmp/qmp_client.py
+++ b/python/qemu/qmp/qmp_client.py
@@ -198,7 +198,7 @@ async def run(self, address='/tmp/qemu.socket'):
 logger = logging.getLogger(__name__)
 
 # Read buffer limit; 10MB like libvirt default
-_limit = (10 * 1024 * 1024)
+_limit = 10 * 1024 * 1024
 
 # Type alias for pending execute() result items
 _PendingT = Union[Message, ExecInterruptedError]
diff --git a/python/qemu/utils/qemu_ga_client.py 
b/python/qemu/utils/qemu_ga_client.py
index 8c38a7ac9c0..d8411bb2d0b 100644
--- a/python/qemu/utils/qemu_ga_client.py
+++ b/python/qemu/utils/qemu_ga_client.py
@@ -155,7 +155,7 @@ def ping(self, timeout: Optional[float]) -> bool:
 
 def fsfreeze(self, cmd: str) -> object:
 if cmd not in ['status', 'freeze', 'thaw']:
-raise Exception('Invalid command: ' + cmd)
+raise ValueError('Invalid command: ' + cmd)
 # Can be int (freeze, thaw) or GuestFsfreezeStatus (status)
 return getattr(self.qga, 'fsfreeze' + '_' + cmd)()
 
@@ -167,7 +167,7 @@ def fstrim(self, minimum: int) -> Dict[str, object]:
 
 def suspend(self, mode: str) -> None:
 if mode not in ['disk', 'ram', 'hybrid']:
-raise Exception('Invalid mode: ' + mode)
+raise ValueError('Invalid mode: ' + mode)
 
 try:
 getattr(self.qga, 'suspend' + '_' + mode)()
@@ -178,7 +178,7 @@ def suspend(self, mode: str) -> None:
 
 def shutdown(self, mode: str = 'powerdown') -> None:
 if mode not in ['powerdown', 'halt', 'reboot']:
-raise Exception('Invalid mode: ' + mode)
+raise ValueError('Invalid mode: ' + mode)
 
 try:
 self.qga.shutdown(mode=mode)
diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 94aeb3f3b20..3e82c634cfe 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -720,7 +720,7 @@ def __exit__(self, exc_type, value, traceback):
 signal.setitimer(signal.ITIMER_REAL, 0)
 return False
 def timeout(self, signum, frame):
-raise Exception(self.errmsg)
+raise TimeoutError(self.errmsg)
 
 def file_pattern(name):
 return "{0}-{1}".format(os.getpid(), name)
@@ -804,7 +804,7 @@ def remote_filename(path):
 elif imgproto == 'ssh':
 return "ssh://%s@127.0.0.1:22%s" % (os.environ.get('USER'), path)
 else:
-raise Exception("Protocol %s not supported" % (imgproto))
+raise ValueError("Protocol %s not supported" % (imgproto))
 
 class VM(qtest.QEMUQtestMachine):
 '''A QEMU VM'''
diff --git a/tests/qemu-iotests/tests/migrate-bitmaps-postcopy-test 
b/tests/qemu-iotests/tests/migrate-bitmaps-postcopy-test
index fc9c4b4ef41..dda55fad284 100755
--- a/tests/qemu-iotests/tests/migrate-bitmaps-postcopy-test
+++ b/tests/qemu-iotests/tests/migrate-bitmaps-postcopy-test
@@ -84,7 +84,7 @@ class TestDirtyBitmapPostcopyMigration(iotests.QMPTestCase):
 e['vm'] = 'SRC'
 for e in self.vm_b_events:
 e['vm'] = 'DST'
-events = (self.vm_a_events + self.vm_b_events)
+events = self.vm_a_events + self.vm_b_events
 events = [(e['timestamp']['seconds'],
e['timestamp']['microseconds'],
e['vm'],
-- 
2.39.0




Re: [PATCH RESEND 04/18] i386/cpu: Fix number of addressable IDs in CPUID.04H

2023-02-22 Thread Xiaoyao Li

On 2/22/2023 2:37 PM, Zhao Liu wrote:

Hi Xiaoyao,

Thanks, I've spent some time thinking about it here.

On Mon, Feb 20, 2023 at 02:59:20PM +0800, Xiaoyao Li wrote:

Date: Mon, 20 Feb 2023 14:59:20 +0800
From: Xiaoyao Li 
Subject: Re: [PATCH RESEND 04/18] i386/cpu: Fix number of addressable IDs
  in CPUID.04H

On 2/13/2023 5:36 PM, Zhao Liu wrote:

For i-cache and d-cache, the maximum IDs for CPUs sharing cache (
CPUID.04H.00H:EAX[bits 25:14] and CPUID.04H.01H:EAX[bits 25:14]) are
both 0, and this means i-cache and d-cache are shared in the SMT level.
This is correct if there's a single thread per core, but is wrong for the
hyper-threading case (one core contains multiple threads) since the
i-cache and d-cache are shared in the core level other than SMT level.

Therefore, in order to be compatible with both multi-threaded and
single-threaded situations, we should set i-cache and d-cache be shared
at the core level by default.


It's true for VM only when the exactly HW topology is configured to VM.
i.e., two virtual LPs of one virtual CORE are pinned to two physical LPs
that of one physical CORE.


Yes, in this case, host and guest has the same topology, and their
topology can match.


Otherwise it's incorrect for VM.


My understanding here is that what we do in QEMU is to create
self-consistent CPU topology and cache topology for the guest.

If the VM topology is self-consistent and emulated to be almost
identical to the real machine, then the emulation in QEMU is correct,
right? ;-)


A real machine tells, via CPUID, that two threads in the same CORE share
the L1 cache because that is the fact: it is exactly how the hardware
resources are laid out. However, for a VM, when you tell the same thing
(two threads share the L1 cache), is it true for the vcpus?


The target is to emulate things correctly, not to emulate them identically
to the real machine. In fact, for these shared resources, it's mostly
infeasible to emulate correctly without pinning vcpus to physical LPs.




For example, given a VM of 4 threads and 2 cores: if the 4 threads are not
pinned to 4 physical LPs of 2 CORES, it's likely each vcpu runs on an LP
of a different physical core.


Thanks for bringing this up, this is worth discussing.

I looked into it and found that the specific scheduling policy for the
vCPU actually depends on the host setting. For example, (IIUC) if host

enables core scheduling, then host will schedule the vCPU on the SMT
threads of same core.

Also, to explore the original purpose of the "per thread" i/d cache
topology, I have retraced its history.

The related commit should be in '09, which is 400281a (set CPUID bits
to present cores and threads topology). In this commit, the
multithreading cache topology is added in CPUID.04H. In particular, here
it set the L2 cache level to per core, but it did not change the level of
L1 (i/d cache), that is, L1 still remains per thread.

I think this is the problem: L1 should also be per core in the
multithreading case. (So the fix in this patch is worth it?)

Another thing we can refer to is that AMD's i/d cache topology is per
core rather than per thread (different CPUID leaf than intel): In
encode_cache_cpuid801d() (target/i386/cpu.c), i/d cache and L2
are encoded as core level in EAX. They set up the per core supposedly
to emulate the L1 topology of the real machine as well.

So, I guess this example is "unintentionally" benefiting from the
"per thread" level of i/d cache.

What do you think?


So no vcpu shares L1i/L1d cache at core level.


Yes. The scheduling of host is not guaranteed, and workload balance
policies in various scenarios and some security mitigation ways may
break the delicate balance we have carefully set up.

Perhaps another way is to also add a new command "x-l1-cache-topo" (like
[1] did for L2) that can adjust the i/d cache level from core to smt to
benefit cases like this.

[1]: https://lists.gnu.org/archive/html/qemu-devel/2023-02/msg03201.html

Thanks,
Zhao






Re: [RFC v5 0/3] migration: reduce time of loading non-iterable vmstate

2023-02-22 Thread Chuang Xu

Hi, Peter

On 2023/2/22 11:57 PM, Peter Xu wrote:

On Wed, Feb 22, 2023 at 02:27:55PM +0800, Chuang Xu wrote:

Hi, Peter

Hi, Chuang,


Note that as I mentioned in the comment, we temporarily replace this value
to prevent commit() and address_space_to_flatview() call each other recursively,
and eventually stack overflow.

Sorry to have overlooked that part.  IMHO here it's not about the depth,
but rather that we don't even need any RCU protection when updating
ioeventfds because we exclusively own the FlatView* too there.

I wanted to describe what I had in mind but instead I figured a patch may
be needed to be accurate (with some cleanups alongside), hence attached.
IIUC it can work with what I suggested before without fiddling with depth.
Please have a look.  The patch should apply cleanly to master branch so if
it works it can be your 1st patch too.

PS: Paolo - I know I asked this before, but it'll be good to have your
review comment on anything above.

Thanks,


Here are two problems I can see:

1. It is inappropriate to use assert(qemu_mutex_iothread_locked()
&& !memory_region_update_pending) in update_ioeventfds().

For example, when entering commit(), if memory_region_update_pending
is true, the assertion will be triggered immediately when update_ioeventfds
is called.

2. The problem of stack overflow has not been solved. There are
too many places where address_space_to_flatview() may be called.

Here is another coredump stack:

#8  0x55a3a769ed85 in memory_region_transaction_commit_force () at 
../softmmu/memory.c:1154
#9  0x55a3a769fd75 in address_space_to_flatview (as=0x55a3a7ede180 
) at 
/data00/migration/qemu-open/include/exec/memory.h:1118
#10 address_space_update_topology_pass (as=as@entry=0x55a3a7ede180 
, old_view=old_view@entry=0x55a3a9d44990, 
new_view=new_view@entry=0x55a3d6837390,
adding=adding@entry=false) at ../softmmu/memory.c:955
#11 0x55a3a76a007c in address_space_set_flatview (as=as@entry=0x55a3a7ede180 
) at ../softmmu/memory.c:1062
#12 0x55a3a769e870 in address_space_update_flatview_all () at 
../softmmu/memory.c:1107
#13 0x55a3a769ed85 in memory_region_transaction_commit_force () at 
../softmmu/memory.c:1154
#14 0x55a3a769fd75 in address_space_to_flatview (as=0x55a3a7ede180 
) at 
/data00/migration/qemu-open/include/exec/memory.h:1118
#15 address_space_update_topology_pass (as=as@entry=0x55a3a7ede180 
, old_view=old_view@entry=0x55a3a9d44990, 
new_view=new_view@entry=0x55a3d67f8d90,
adding=adding@entry=false) at ../softmmu/memory.c:955
#16 0x55a3a76a007c in address_space_set_flatview (as=as@entry=0x55a3a7ede180 
) at ../softmmu/memory.c:1062
#17 0x55a3a769e870 in address_space_update_flatview_all () at 
../softmmu/memory.c:1107
#18 0x55a3a769ed85 in memory_region_transaction_commit_force () at 
../softmmu/memory.c:1154
#19 0x55a3a769fd75 in address_space_to_flatview (as=0x55a3a7ede180 
) at 
/data00/migration/qemu-open/include/exec/memory.h:1118
#20 address_space_update_topology_pass (as=as@entry=0x55a3a7ede180 
, old_view=old_view@entry=0x55a3a9d44990, 
new_view=new_view@entry=0x55a3d67ba790,
adding=adding@entry=false) at ../softmmu/memory.c:955
#21 0x55a3a76a007c in address_space_set_flatview (as=as@entry=0x55a3a7ede180 
) at ../softmmu/memory.c:1062
#22 0x55a3a769e870 in address_space_update_flatview_all () at 
../softmmu/memory.c:1107
#23 0x55a3a769ed85 in memory_region_transaction_commit_force () at 
../softmmu/memory.c:1154

And this may not be the only case where stack overflow occurs.
Thus, changing the depth value is the safest way, I think.

Thanks!




Re: [PATCH 2/3] intel-iommu: fail DEVIOTLB_UNMAP without dt mode

2023-02-22 Thread Jason Wang
On Fri, Dec 2, 2022 at 12:03 AM Peter Xu  wrote:
>
> On Tue, Nov 29, 2022 at 04:10:36PM +0800, Jason Wang wrote:
> > Without dt mode, device IOTLB notifier won't work since guest won't
> > send device IOTLB invalidation descriptor in this case. Let's fail
> > early instead of misbehaving silently.
> >
> > Signed-off-by: Jason Wang 
> > ---
> >  hw/i386/intel_iommu.c | 8 
> >  1 file changed, 8 insertions(+)
> >
> > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> > index 9143376677..d025ef2873 100644
> > --- a/hw/i386/intel_iommu.c
> > +++ b/hw/i386/intel_iommu.c
> > @@ -3179,6 +3179,7 @@ static int 
> > vtd_iommu_notify_flag_changed(IOMMUMemoryRegion *iommu,
> >  {
> >  VTDAddressSpace *vtd_as = container_of(iommu, VTDAddressSpace, iommu);
> >  IntelIOMMUState *s = vtd_as->iommu_state;
> > +X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s);
> >
> >  /* TODO: add support for VFIO and vhost users */
> >  if (s->snoop_control) {
> > @@ -3193,6 +3194,13 @@ static int 
> > vtd_iommu_notify_flag_changed(IOMMUMemoryRegion *iommu,
> >   PCI_FUNC(vtd_as->devfn));
> >  return -ENOTSUP;
> >  }
> > +if (!x86_iommu->dt_supported && (new & IOMMU_NOTIFIER_DEVIOTLB_UNMAP)) 
> > {
> > +error_setg_errno(errp, ENOTSUP,
> > + "device %02x.%02x.%x requires device IOTLB mode",
> > + pci_bus_num(vtd_as->bus), PCI_SLOT(vtd_as->devfn),
> > + PCI_FUNC(vtd_as->devfn));
> > +return -ENOTSUP;
> > +}
>
> While my r-b holds.. let's also do this for amd-iommu in the same patch?
> dt never supported there, so we can fail as long as DEVIOTLB registered.

Looks like there's one implementation:

Per spec:

""
The INVALIDATE_IOTLB_PAGES command is only present in IOMMU
implementations that support remote IOTLB caching of translations (see
Capability Offset 00h[IotlbSup]). This command instructs the specified
device to invalidate the given range of addresses in its IOTLB. The
size of the invalidate command is determined by the S bit and the
address.
""

And it has one implementation (though buggy), iommu_inval_iotlb(), which
doesn't trigger any DEVIOTLB_UNMAP notifier.

We can leave this for the future.

(Last time I tried amd-iommu it didn't even boot).

Thanks

>
> >
> >  /* Update per-address-space notifier flags */
> >  vtd_as->notifier_flags = new;
> > --
> > 2.25.1
> >
>
> --
> Peter Xu
>




Re: [PATCH v2 09/13] vdpa net: block migration if the device has CVQ

2023-02-22 Thread Jason Wang



在 2023/2/22 15:28, Eugenio Perez Martin 写道:

On Wed, Feb 22, 2023 at 5:01 AM Jason Wang  wrote:


在 2023/2/8 17:42, Eugenio Pérez 写道:

Devices with CVQ need to migrate state beyond vq state.  Leaving this
to a future series.


I may be missing something, but what is missing to support CVQ/MQ?


To restore all the device state set by CVQ in the migration source
(MAC, MQ, ...) before data vqs start. We don't have a reliable way to
not start data vqs until the device [1].

Thanks!

[1] https://lists.gnu.org/archive/html/qemu-devel/2023-01/msg02652.html



Right. It might be worth mentioning this defect in either the changelog or 
somewhere in the code as a comment.


(Btw, I think we should fix those vDPA drivers).

Thanks





Thanks



Signed-off-by: Eugenio Pérez 
---
   net/vhost-vdpa.c | 6 ++
   1 file changed, 6 insertions(+)

diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
index bca13f97fd..309861e56c 100644
--- a/net/vhost-vdpa.c
+++ b/net/vhost-vdpa.c
@@ -955,11 +955,17 @@ int net_init_vhost_vdpa(const Netdev *netdev, const char 
*name,
   }

   if (has_cvq) {
+VhostVDPAState *s;
+
   nc = net_vhost_vdpa_init(peer, TYPE_VHOST_VDPA, name,
vdpa_device_fd, i, 1, false,
opts->x_svq, iova_range);
   if (!nc)
   goto err;
+
+s = DO_UPCAST(VhostVDPAState, nc, nc);
+error_setg(&s->vhost_vdpa.dev->migration_blocker,
+   "net vdpa cannot migrate with MQ feature");
   }

   return 0;





Re: [PATCH v2 11/13] vdpa: block migration if dev does not have _F_SUSPEND

2023-02-22 Thread Jason Wang



在 2023/2/22 22:25, Eugenio Perez Martin 写道:

On Wed, Feb 22, 2023 at 5:05 AM Jason Wang  wrote:


在 2023/2/8 17:42, Eugenio Pérez 写道:

Next patches enable devices to be migrated even if vdpa netdev has not
been started with x-svq. However, not all devices are migratable, so we
need to block migration if we detect that.

Block vhost-vdpa device migration if it does not offer _F_SUSPEND and it
has not been started with x-svq.

Signed-off-by: Eugenio Pérez 
---
   hw/virtio/vhost-vdpa.c | 21 +
   1 file changed, 21 insertions(+)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 84a6b9690b..9d30cf9b3c 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -442,6 +442,27 @@ static int vhost_vdpa_init(struct vhost_dev *dev, void 
*opaque, Error **errp)
   return 0;
   }

+/*
+ * If dev->shadow_vqs_enabled at initialization that means the device has
+ * been started with x-svq=on, so don't block migration
+ */
+if (dev->migration_blocker == NULL && !v->shadow_vqs_enabled) {
+uint64_t backend_features;
+
+/* We don't have dev->backend_features yet */
+ret = vhost_vdpa_call(dev, VHOST_GET_BACKEND_FEATURES,
+  &backend_features);
+if (unlikely(ret)) {
+error_setg_errno(errp, -ret, "Could not get backend features");
+return ret;
+}
+
+if (!(backend_features & BIT_ULL(VHOST_BACKEND_F_SUSPEND))) {
+error_setg(&dev->migration_blocker,
+"vhost-vdpa backend lacks VHOST_BACKEND_F_SUSPEND feature.");
+}


I wonder why not let the device decide? For a networking device, we can
probably live without suspend.


Right, but how can we know if this is a net device in init? I don't
think a switch (vhost_vdpa_get_device_id(dev)) is elegant.



I meant the caller of vhost_vdpa_init(), which is net_init_vhost_vdpa().

Thanks




If the parent device does not need to be suspended, I'd go with
exposing a suspend ioctl but doing nothing in the parent device. After
that, it could even choose to return an error for GET_VRING_BASE.

If we want to implement it as a fallback in qemu, I'd go for
implementing it on top of this series. There are a few operations we
could move to a device-kind specific ops.

Would it make sense to you?

Thanks!



Thanks



+}
+
   /*
* Similar to VFIO, we end up pinning all guest memory and have to
* disable discarding of RAM.





Re: [PATCH v7 03/10] target/riscv: allow MISA writes as experimental

2023-02-22 Thread liweiwei



On 2023/2/23 02:51, Daniel Henrique Barboza wrote:

At this moment, and apparently since forever, we have no way of enabling
RISCV_FEATURE_MISA. This means that all the code in write_misa(), all
the nuts and bolts that handle how to properly write this CSR, has
always been a no-op as well because write_misa() will always exit
early.

This seems to be benign in the majority of cases. Booting an Ubuntu
'virt' guest and logging all the calls to 'write_misa' shows that no
writes to the MISA CSR were attempted. Writing MISA, i.e. enabling/disabling
RISC-V extensions after the machine is powered on, seems to be a niche
use case.

After discussions on the mailing list, most notably in [1], we reached
the consensus that this code is not suited to be exposed to users
because it's not well tested, but at the same time removing it is a bit
extreme because we would like to fix it, and it's easier to do so with
the code available to us instead of fetching it from the git log.

The approach taken here is to get rid of RISCV_FEATURE_MISA altogether
and use a new experimental flag called x-misa-w. The default value is
false, meaning that we're keeping the existing behavior of doing nothing
if a write_misa() is attempted. As with any existing experimental flag,
x-misa-w is also a temporary flag that we need to remove once we fix
write_misa().

[1] https://lists.gnu.org/archive/html/qemu-devel/2023-02/msg05092.html

Signed-off-by: Daniel Henrique Barboza 


Acceptable to me.

Reviewed-by: Weiwei Li 

Weiwei Li


---
  target/riscv/cpu.c | 6 ++
  target/riscv/cpu.h | 2 +-
  target/riscv/csr.c | 2 +-
  3 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 93b52b826c..1d637b1acd 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -1210,6 +1210,12 @@ static Property riscv_cpu_properties[] = {
  
  DEFINE_PROP_BOOL("rvv_ta_all_1s", RISCVCPU, cfg.rvv_ta_all_1s, false),

  DEFINE_PROP_BOOL("rvv_ma_all_1s", RISCVCPU, cfg.rvv_ma_all_1s, false),
+
+/*
+ * write_misa() is marked as experimental for now so mark
+ * it with -x and default to 'false'.
+ */
+DEFINE_PROP_BOOL("x-misa-w", RISCVCPU, cfg.misa_w, false),
  DEFINE_PROP_END_OF_LIST(),
  };
  
diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h

index 215423499e..9d3304bcda 100644
--- a/target/riscv/cpu.h
+++ b/target/riscv/cpu.h
@@ -89,7 +89,6 @@ enum {
  RISCV_FEATURE_MMU,
  RISCV_FEATURE_PMP,
  RISCV_FEATURE_EPMP,
-RISCV_FEATURE_MISA,
  RISCV_FEATURE_DEBUG
  };
  
@@ -498,6 +497,7 @@ struct RISCVCPUConfig {

  bool pmp;
  bool epmp;
  bool debug;
+bool misa_w;
  
  bool short_isa_string;

  };
diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index e149b453da..3cb8d2ffad 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -1329,7 +1329,7 @@ static RISCVException read_misa(CPURISCVState *env, int 
csrno,
  static RISCVException write_misa(CPURISCVState *env, int csrno,
   target_ulong val)
  {
-if (!riscv_feature(env, RISCV_FEATURE_MISA)) {
+if (!riscv_cpu_cfg(env)->misa_w) {
  /* drop write to misa */
  return RISCV_EXCP_NONE;
  }





Re: [PATCH v2 17/20] vfio/common: Support device dirty page tracking with vIOMMU

2023-02-22 Thread Jason Gunthorpe
On Wed, Feb 22, 2023 at 04:34:39PM -0700, Alex Williamson wrote:
> > +/*
> > + * With vIOMMU we try to track the entire IOVA space. As the IOVA 
> > space can
> > + * be rather big, devices might not be able to track it due to HW
> > + * limitations. In that case:
> > + * (1) Retry tracking a smaller part of the IOVA space.
> > + * (2) Retry tracking a range in the size of the physical memory.
> 
> This looks really sketchy, why do we think there's a "good enough"
> value here?  If we get it wrong, the device potentially has access to
> IOVA space that we're not tracking, right?

The idea was that the untracked range becomes permanently dirty, so at
worst this means the migration never converges.
 
#2 is the presumption that the guest is using an identity map.

> I'd think the only viable fallback if the vIOMMU doesn't report its max
> IOVA is the full 64-bit address space, otherwise it seems like we need
> to add a migration blocker.

This is basically saying vIOMMU doesn't work with migration, and we've
heard that this isn't OK. There are cases where vIOMMU is on but the
guest always uses identity maps, e.g. for virtual interrupt remapping.

We also have future problems that nested translation is incompatible
with device dirty tracking..

Jason



Re: [PATCH v2 11/20] vfio/common: Add device dirty page tracking start/stop

2023-02-22 Thread Jason Gunthorpe
On Wed, Feb 22, 2023 at 03:40:43PM -0700, Alex Williamson wrote:
> > +/*
> > + * DMA logging uAPI guarantees to support at least num_ranges that 
> > fits into
> > + * a single host kernel page. To be on the safe side, use this as a 
> > limit
> > + * from which to merge to a single range.
> > + */
> > +max_ranges = qemu_real_host_page_size() / sizeof(*ranges);
> > +cur_ranges = iova_tree_nnodes(container->mappings);
> > +control->num_ranges = (cur_ranges <= max_ranges) ? cur_ranges : 1;
> 
> This makes me suspicious that we're implementing to the characteristics
> of a specific device rather than strictly to the vfio migration API.
> Are we just trying to avoid the error handling to support the try and
> fall back to a single range behavior?

This was what we agreed to when making the kernel patches. Userspace
is restricted to sending one page worth of range list to the kernel, and
the kernel will always adjust that to whatever smaller list the device needs.

We added this limit only because we don't want to have a way for
userspace to consume a lot of kernel memory.

See LOG_MAX_RANGES in vfio_main.c

If qemu is in viommu mode and it has a huge number of ranges then it must
cut the list down before passing things to the kernel.

Jason



Re: [RFC PATCH 0/2] Add flag as THP allocation hint for memfd_restricted() syscall

2023-02-22 Thread Ackerley Tng

Yuan Yao  writes:


On Sat, Feb 18, 2023 at 12:43:00AM +, Ackerley Tng wrote:

Hello,



This patchset builds upon the memfd_restricted() system call that has
been discussed in the ‘KVM: mm: fd-based approach for supporting KVM’
patch series, at
https://lore.kernel.org/lkml/20221202061347.1070246-1-chao.p.p...@linux.intel.com/T/#m7e944d7892afdd1d62a03a287bd488c56e377b0c



The tree can be found at:
https://github.com/googleprodkernel/linux-cc/tree/restrictedmem-rmfd-hugepage



Following the RFC to provide mount for memfd_restricted() syscall at
https://lore.kernel.org/lkml/cover.1676507663.git.ackerley...@google.com/T/#u,
this patchset adds the RMFD_HUGEPAGE flag to the memfd_restricted()
syscall, which will hint the kernel to use Transparent HugePages to
back restrictedmem pages.



This supplements the interface proposed earlier, which requires the
creation of a tmpfs mount to be passed to memfd_restricted(), with a
more direct per-file hint.



Dependencies:



+ Sean’s iteration of the ‘KVM: mm: fd-based approach for supporting
   KVM’ patch series at
   https://github.com/sean-jc/linux/tree/x86/upm_base_support
+ Proposed fix for restrictedmem_getattr() as mentioned on the mailing
   list at

https://lore.kernel.org/lkml/diqzzga0fv96@ackerleytng-cloudtop-sg.c.googlers.com/

+ Hugh’s patch:

https://lore.kernel.org/lkml/c140f56a-1aa3-f7ae-b7d1-93da7d5a3...@google.com/,

   which provides functionality in shmem that reads the VM_HUGEPAGE
   flag in key functions shmem_is_huge() and shmem_get_inode()



Will Hugh's patch be merged into 6.3? I didn't find it in 6.2-rc8.
IMHO this patch won't work without Hugh's patch, or at least needs
another way, e.g. SHMEM_SB(inode->i_sb)->huge.



Hugh's patch is still pending discussion and may not be merged so
soon. These patches will not work without Hugh's patch.

I would like to understand what the community thinks of the proposed
interface (RMFD_HUGEPAGE flag, passed to the memfd_restricted()
syscall). If this interface is favorably received, we can definitely
find another way for shmem to support this interface.

If I understand correctly, SHMEM_SB(inode->i_sb)->huge checks the state
of hugepage-ness for the superblock. Since the proposed interface will
only affect a single file, we will need something closer to

bool shmem_is_huge(struct vm_area_struct *vma, struct inode *inode,
   pgoff_t index, bool shmem_huge_force)
{
...

if (SHMEM_I(inode)->flags & VM_HUGEPAGE)
return true;

...
}

from Hugh's patch.


Re: [RFC PATCH 1/2] mm: restrictedmem: Allow userspace to specify mount_path for memfd_restricted

2023-02-22 Thread Ackerley Tng



"Kirill A. Shutemov"  writes:


On Thu, Feb 16, 2023 at 12:41:16AM +, Ackerley Tng wrote:

By default, the backing shmem file for a restrictedmem fd is created
on shmem's kernel space mount.



With this patch, an optional tmpfs mount can be specified, which will
be used as the mountpoint for backing the shmem file associated with a
restrictedmem fd.



This change is modeled after how sys_open() can create an unnamed
temporary file in a given directory with O_TMPFILE.



This will help restrictedmem fds inherit the properties of the
provided tmpfs mounts, for example, hugepage allocation hints, NUMA
binding hints, etc.



Signed-off-by: Ackerley Tng 
---
  include/linux/syscalls.h   |  2 +-
  include/uapi/linux/restrictedmem.h |  8 
  mm/restrictedmem.c | 63 +++---
  3 files changed, 66 insertions(+), 7 deletions(-)
  create mode 100644 include/uapi/linux/restrictedmem.h



diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index f9e9e0c820c5..4b8efe9a8680 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -1056,7 +1056,7 @@ asmlinkage long sys_memfd_secret(unsigned int  
flags);
  asmlinkage long sys_set_mempolicy_home_node(unsigned long start,  
unsigned long len,

unsigned long home_node,
unsigned long flags);
-asmlinkage long sys_memfd_restricted(unsigned int flags);
+asmlinkage long sys_memfd_restricted(unsigned int flags, const char  
__user *mount_path);



  /*
   * Architecture-specific system calls



I'm not sure what the right practice now: do we provide string that
contains mount path or fd that represents the filesystem (returned from
fsmount(2) or open_tree(2)).



fd seems more flexible: it allows specifying unattached mounts.


I tried out the suggestion of passing fds to memfd_restricted() instead
of strings.

One benefit I see of using fds is interface uniformity: it feels more
aligned with other syscalls like fsopen(), fsconfig(), and fsmount() in
terms of using and passing around fds.

Other than being able to use a mount without a path attached to the
mount, are there any other benefits of using fds over using the path string?

Should I post the patches that allows specifying a mount using fds?
Should I post them as a separate RFC, or as a new revision to this RFC?



Re: [PATCH v1 5/6] hw/arm/virt: Enable backup bitmap for dirty ring

2023-02-22 Thread Gavin Shan

On 2/23/23 2:54 AM, Peter Maydell wrote:

On Wed, 22 Feb 2023 at 04:36, Gavin Shan  wrote:


On 2/22/23 3:27 AM, Peter Maydell wrote:

Why does this need to be board-specific code? Is there
some way we can just do the right thing automatically?
Why does the GIC/ITS matter?

The kernel should already know whether we have asked it
to do something that needs this extra extension, so
I think we ought to be able in the generic "enable the
dirty ring" code say "if the kernel says we need this
extra thing, also enable this extra thing". Or if that's
too early, we can do the extra part in a generic hook a
bit later.

In the future there might be other things, presumably,
that need the backup bitmap, so it would be more future
proof not to need to also change QEMU to add extra
logic checks that duplicate the logic the kernel already has.



When the dirty ring is enabled, a per-vcpu buffer is used to track the dirty 
pages.
The prerequisite for using the per-vcpu buffer is an existing running VCPU 
context.
There are two cases where no running VCPU context exists and the backup bitmap 
extension is needed, as far as we know until now: (a) save/restore of GICv3 
tables; (b) save/restore of ITS tables. These two cases are related to the KVM 
devices "kvm-arm-gicv3" and "arm-its-kvm", which are only needed by the virt 
machine at present. So we don't need the backup bitmap extension for other 
boards.


But we might have to for other boards we add later. We shouldn't
put code in per-board if it's not really board specific.

Moreover, I think "we need the backup bitmap if the kernel is
using its GICv3 or ITS implementation" is a kernel implementation
detail. It seems to me that it would be cleaner if QEMU didn't
have to hardcode "we happen to know that these are the situations
when we need to do that". A better API would be "ask the kernel
'do we need this?' and enable it if it says 'yes'". The kernel
knows what its implementations of ITS and GICv3 (and perhaps
future in-kernel memory-using devices) require, after all.



Well, as we know so far, the backup bitmap extension is only required by the 
'kvm-arm-gicv3' and 'arm-its-kvm' devices. Those two devices are only used by 
the virt machine at present, so it's a board-specific requirement. I'm not 
sure about the future. We may need to enable the extension for other devices 
and other boards; at that point the requirement isn't board specific any more. 
However, we're uncertain about the future.

In order to cover the future case where the extension is needed by other 
boards, the best way I can figure out is to enable the extension in the 
generic path in kvm_init(), if the extension is supported by the host kernel. 
In this way, unnecessary overhead is introduced for those boards where 
'kvm-arm-gicv3' and 'arm-its-kvm' aren't used, but the overhead should be very 
small and acceptable. Note that the host kernel doesn't know whether a 
'kvm-arm-gicv3' or 'arm-its-kvm' device is needed by the board in kvm_init(), 
which is the generic path.

The 'kvm-arm-gicv3' and 'arm-its-kvm' devices are created in machvirt_init(), 
where the memory slots are also added. Prior to that function, the host kernel 
doesn't know whether the extension is needed by QEMU. That means we have to 
enable the extension in machvirt_init(), which is exactly what we're doing. 
The difference is that QEMU decides to enable the extension instead of being 
told to enable it by the host kernel. The host kernel doesn't have the answer 
to "Hey host kernel, do we need to enable the extension?" until 
machvirt_init(), where the devices are created. Besides, machvirt_init() isn't 
the generic path if we want to enable the extension for all possible boards. 
Furthermore, the extension can't be enabled once memory slots have been added.

In summary, the best way I can figure out to cover all possible boards for 
future cases is to enable the extension in kvm_init() if it is supported by 
the host kernel. Otherwise, we keep what we're doing and enable the extension 
in machvirt_init(). Please let me know your thoughts, Peter :)

Thanks,
Gavin




Re: [PATCH 0/5] Pegasos2 fixes and audio output support

2023-02-22 Thread BALATON Zoltan
On Wed, 22 Feb 2023, BALATON Zoltan wrote:
> On Wed, 22 Feb 2023, Bernhard Beschow wrote:
>> Am 22. Februar 2023 19:25:16 UTC schrieb BALATON Zoltan 
>> :
>>> On Wed, 22 Feb 2023, Bernhard Beschow wrote:
 On Wed, Feb 22, 2023 at 4:38 PM Bernhard Beschow  
 wrote:
> On Tue, Feb 21, 2023 at 7:44 PM BALATON Zoltan  
> wrote:
>> This series fixes PCI interrupts on the ppc/pegasos2 machine and adds
>> partial implementation of the via-ac97 sound part enough to get audio
>> output. I'd like this to be merged for QEMU 8.0.
>> 
>> Regards,
>> BALATON Zoltan
>> 
>> BALATON Zoltan (5):
>>   hw/isa/vt82c686: Implement interrupt routing in via_isa_set_irq
>>   hw/isa/vt82c686: Implement PIRQ pins
>>   hw/ppc/pegasos2: Fix PCI interrupt routing
>>   hw/audio/ac97: Split off some definitions to a header
>>   hw/audio/via-ac97: Basic implementation of audio playback
>>
>>  hw/audio/ac97.c|  43 +---
>>  hw/audio/ac97.h|  65 ++
>>  hw/audio/trace-events  |   6 +
>>  hw/audio/via-ac97.c| 436 -
>>  hw/ide/via.c   |   2 +-
>>  hw/isa/vt82c686.c  |  61 +-
>>  hw/pci-host/mv64361.c  |   4 -
>>  hw/ppc/pegasos2.c  |  26 ++-
>>  hw/usb/vt82c686-uhci-pci.c |   5 +-
>>  include/hw/isa/vt82c686.h  |  39 +++-
>>  10 files changed, 626 insertions(+), 61 deletions(-)
>>  create mode 100644 hw/audio/ac97.h
>> 
>> --
>> 2.30.7
>> 
>> 
> Wow, the MorphOS people paid attention to sound design. Thanks for
> presenting it to us, Zoltan!
> 
> I've had a closer look at your series and I think it can be simplified:
> Patch 2 can be implemented quite straight-forward like I proposed in a
> private mail: https://github.com/shentok/qemu/commit/via-priq-routing.
> Then, in order to make patch 3 "hw/ppc/pegasos2: Fix PCI interrupt 
> routing"
> working, one can expose the PCI interrupts with a single line like you 
> do
> in patch 2. With this, patch 1 "hw/isa/vt82c686: Implement interrupt
> routing in via_isa_set_irq" isn't needed any longer and can be omitted.
> 
> In via-ac97, rather than using via_isa_set_irq(), pci_set_irq() can be
> used instead. pci_set_irq() internally takes care of all the ISA 
> interrupt
> level tracking patch 1 attempted to address.
> 
 
 Here is a proof of concept branch to demonstrate that the simplification
 actually works: https://github.com/shentok/qemu/commits/pegasos2 (Tested
 with MorphOS with and without pegasos2.rom).
>>> 
>>> Does this only work because both the via-ac97 and the PCI interrupts are 
>>> mapped to the same ISA IRQ and you've only tested sound? The guest could 
>>> configure each device to use a different IRQ, also mapping them so they 
>>> share one ISA interrupt. What happens if multiple devices are mapped to 
>>> IRQ 9 (which is the case on pegasos2 where PCI cards, ac97 and USB all 
>>> share this IRQ) and more than one such device wants to raise an interrupt 
>>> at the same time? If you ack the ac97 interrupt but a PCI network card or 
>>> the USB part still wants to get the CPUs attention the ISA IRQ should 
>>> remain raised until all devices are serviced.
>> 
>> pci_bus_get_irq_level(), used in via_isa_set_pci_irq(), should handle
>> exactly that case very well.
>> 
>>> I don't see a way to track the status of all devices in a single qemu_irq 
>>> which can only be up or down so we need something to store the state of 
>>> each source.
>> 
>> pci_set_irq() causes pci_bus_change_irq_level() to be called.
>> pci_bus_change_irq_level() tracks the sum of all irq levels of all
>> devices attached to a particular pin in irq_count. Have a look at
>> pci_bus_change_irq_level() and you will understand better.
>> 
>>> My patch adds a state register to each ISA IRQ line for all possible 
>>> sources which could probably be stored once but then for each change of 
>>> ISA IRQ status all the mapped devices should be checked and combined so 
>>> it's easier to store them for each IRQ. Does your approach still work if 
>>> you play sound, and copy something from network to a USB device at the 
>>> same time? (I'm not sure mine does not have remaining bugs but I don't 
>>> think this can be simplified that way but if you can prove it would work I 
>>> don't mind taking an alternative version but I'm not convinced yet.)
>> 
>> Well, I can't prove that my approach works but unfortunately I can
>> prove that both our approaches cause a freeze :/ Try:
>> 1. Start `qemu-system-ppc -M pegasos2 -bios pegasos2.rom -rtc
>> base=localtime -device ati-vga,guest_hwcursor=true,romfile="" -cdrom
>> morphos-3.17.iso -device usb-mouse -device usb-kbd`
>> 2. Move the mouse while sound is playing
>> -> Observe the VM to freeze
>
> Not quite sure why but it seems to happen when both the a

Re: [PATCH v2 17/20] vfio/common: Support device dirty page tracking with vIOMMU

2023-02-22 Thread Alex Williamson
On Wed, 22 Feb 2023 19:49:12 +0200
Avihai Horon  wrote:

> Currently, device dirty page tracking with vIOMMU is not supported - RAM
> pages are perpetually marked dirty in this case.
> 
> When vIOMMU is used, IOVA ranges are DMA mapped/unmapped on the fly as
> the vIOMMU maps/unmaps them. These IOVA ranges can potentially be mapped
> anywhere in the vIOMMU IOVA space.
> 
> Due to this dynamic nature of vIOMMU mapping/unmapping, tracking only
> the currently mapped IOVA ranges, as done in the non-vIOMMU case,
> doesn't work very well.
> 
> Instead, to support device dirty tracking when vIOMMU is enabled, track
> the entire vIOMMU IOVA space. If that fails (IOVA space can be rather
> big and we might hit HW limitation), try tracking smaller range while
> marking untracked ranges dirty.
> 
> Signed-off-by: Avihai Horon 
> ---
>  include/hw/vfio/vfio-common.h |   2 +
>  hw/vfio/common.c  | 196 +++---
>  2 files changed, 181 insertions(+), 17 deletions(-)
> 
> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
> index 1f21e1fa43..1dc00cabcd 100644
> --- a/include/hw/vfio/vfio-common.h
> +++ b/include/hw/vfio/vfio-common.h
> @@ -95,6 +95,8 @@ typedef struct VFIOContainer {
>  unsigned int dma_max_mappings;
>  IOVATree *mappings;
>  QemuMutex mappings_mutex;
> +/* Represents the range [0, giommu_tracked_range) not inclusive */
> +hwaddr giommu_tracked_range;
>  QLIST_HEAD(, VFIOGuestIOMMU) giommu_list;
>  QLIST_HEAD(, VFIOHostDMAWindow) hostwin_list;
>  QLIST_HEAD(, VFIOGroup) group_list;
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index 4a7fff6eeb..1024788bcc 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -45,6 +45,8 @@
>  #include "migration/qemu-file.h"
>  #include "sysemu/tpm.h"
>  #include "qemu/iova-tree.h"
> +#include "hw/boards.h"
> +#include "hw/mem/memory-device.h"
>  
>  VFIOGroupList vfio_group_list =
>  QLIST_HEAD_INITIALIZER(vfio_group_list);
> @@ -430,6 +432,38 @@ void vfio_unblock_multiple_devices_migration(void)
>  multiple_devices_migration_blocker = NULL;
>  }
>  
> +static uint64_t vfio_get_ram_size(void)
> +{
> +MachineState *ms = MACHINE(qdev_get_machine());
> +uint64_t plugged_size;
> +
> +plugged_size = get_plugged_memory_size();
> +if (plugged_size == (uint64_t)-1) {
> +plugged_size = 0;
> +}
> +
> +return ms->ram_size + plugged_size;
> +}
> +
> +static int vfio_iommu_get_max_iova(VFIOContainer *container, hwaddr 
> *max_iova)
> +{
> +VFIOGuestIOMMU *giommu;
> +int ret;
> +
> +giommu = QLIST_FIRST(&container->giommu_list);
> +if (!giommu) {
> +return -ENOENT;
> +}
> +
> +ret = memory_region_iommu_get_attr(giommu->iommu_mr, IOMMU_ATTR_MAX_IOVA,
> +   max_iova);
> +if (ret) {
> +return ret;
> +}
> +
> +return 0;
> +}
> +
>  static bool vfio_have_giommu(VFIOContainer *container)
>  {
>  return !QLIST_EMPTY(&container->giommu_list);
> @@ -1510,7 +1544,8 @@ static gboolean vfio_iova_tree_get_last(DMAMap *map, 
> gpointer data)
>  }
>  
>  static struct vfio_device_feature *
> -vfio_device_feature_dma_logging_start_create(VFIOContainer *container)
> +vfio_device_feature_dma_logging_start_create(VFIOContainer *container,
> + bool giommu)
>  {
>  struct vfio_device_feature *feature;
>  size_t feature_size;
> @@ -1529,6 +1564,16 @@ 
> vfio_device_feature_dma_logging_start_create(VFIOContainer *container)
>  control = (struct vfio_device_feature_dma_logging_control 
> *)feature->data;
>  control->page_size = qemu_real_host_page_size();
>  
> +if (giommu) {
> +ranges = g_malloc0(sizeof(*ranges));
> +ranges->iova = 0;
> +ranges->length = container->giommu_tracked_range;
> +control->num_ranges = 1;
> +control->ranges = (uint64_t)ranges;
> +
> +return feature;
> +}
> +
>  QEMU_LOCK_GUARD(&container->mappings_mutex);
>  
>  /*
> @@ -1578,12 +1623,12 @@ static void 
> vfio_device_feature_dma_logging_start_destroy(
>  g_free(feature);
>  }
>  
> -static int vfio_devices_dma_logging_start(VFIOContainer *container)
> +static int vfio_devices_dma_logging_start(VFIOContainer *container, bool 
> giommu)
>  {
>  struct vfio_device_feature *feature;
>  int ret;
>  
> -feature = vfio_device_feature_dma_logging_start_create(container);
> +feature = vfio_device_feature_dma_logging_start_create(container, 
> giommu);
>  if (!feature) {
>  return -errno;
>  }
> @@ -1598,18 +1643,128 @@ static int 
> vfio_devices_dma_logging_start(VFIOContainer *container)
>  return ret;
>  }
>  
> +typedef struct {
> +hwaddr *ranges;
> +unsigned int ranges_num;
> +} VFIOGIOMMUDeviceDTRanges;
> +
> +/*
> + * This value is used in the second attempt to start device dirty tracking 
> with
> + * vIOMMU, or if the giommu f

[PATCH v2 20/28] target/hppa: Don't use tcg_temp_local_new

2023-02-22 Thread Richard Henderson
This wasn't actually used at all, just some unused
macro re-definitions.

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 target/hppa/translate.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/target/hppa/translate.c b/target/hppa/translate.c
index 0102cf451b..cee960949f 100644
--- a/target/hppa/translate.c
+++ b/target/hppa/translate.c
@@ -35,7 +35,6 @@
 #undef TCGv
 #undef tcg_temp_new
 #undef tcg_global_mem_new
-#undef tcg_temp_local_new
 #undef tcg_temp_free
 
 #if TARGET_LONG_BITS == 64
@@ -59,7 +58,6 @@
 
 #define tcg_temp_new tcg_temp_new_i64
 #define tcg_global_mem_new   tcg_global_mem_new_i64
-#define tcg_temp_local_new   tcg_temp_local_new_i64
 #define tcg_temp_freetcg_temp_free_i64
 
 #define tcg_gen_movi_reg tcg_gen_movi_i64
@@ -155,7 +153,6 @@
 #define TCGv_reg TCGv_i32
 #define tcg_temp_new tcg_temp_new_i32
 #define tcg_global_mem_new   tcg_global_mem_new_i32
-#define tcg_temp_local_new   tcg_temp_local_new_i32
 #define tcg_temp_freetcg_temp_free_i32
 
 #define tcg_gen_movi_reg tcg_gen_movi_i32
-- 
2.34.1




Re: [PATCH] target/i386: Fix BZHI instruction

2023-02-22 Thread Richard Henderson

Ping 2.

r~

On 2/15/23 20:50, Richard Henderson wrote:

Ping.

r~

On 1/14/23 13:32, Richard Henderson wrote:

We did not correctly handle N >= operand size.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1374
Signed-off-by: Richard Henderson 
---
  tests/tcg/i386/test-i386-bmi2.c |  3 +++
  target/i386/tcg/emit.c.inc  | 14 +++---
  2 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/tests/tcg/i386/test-i386-bmi2.c b/tests/tcg/i386/test-i386-bmi2.c
index 982d4abda4..0244df7987 100644
--- a/tests/tcg/i386/test-i386-bmi2.c
+++ b/tests/tcg/i386/test-i386-bmi2.c
@@ -123,6 +123,9 @@ int main(int argc, char *argv[]) {
  result = bzhiq(mask, 0x1f);
  assert(result == (mask & ~(-1 << 30)));
+    result = bzhiq(mask, 0x40);
+    assert(result == mask);
+
  result = rorxq(0x2132435465768798, 8);
  assert(result == 0x9821324354657687);
diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc
index 4d7702c106..1eace1231a 100644
--- a/target/i386/tcg/emit.c.inc
+++ b/target/i386/tcg/emit.c.inc
@@ -1143,20 +1143,20 @@ static void gen_BLSR(DisasContext *s, CPUX86State *env, 
X86DecodedInsn *decode)

  static void gen_BZHI(DisasContext *s, CPUX86State *env, X86DecodedInsn 
*decode)
  {
  MemOp ot = decode->op[0].ot;
-    TCGv bound;
+    TCGv bound = tcg_constant_tl(ot == MO_64 ? 63 : 31);
+    TCGv zero = tcg_constant_tl(0);
+    TCGv mone = tcg_constant_tl(-1);
-    tcg_gen_ext8u_tl(s->T1, cpu_regs[s->vex_v]);
-    bound = tcg_constant_tl(ot == MO_64 ? 63 : 31);
+    tcg_gen_ext8u_tl(s->T1, s->T1);
  /*
   * Note that since we're using BMILG (in order to get O
   * cleared) we need to store the inverse into C.
   */
-    tcg_gen_setcond_tl(TCG_COND_LT, cpu_cc_src, s->T1, bound);
-    tcg_gen_movcond_tl(TCG_COND_GT, s->T1, s->T1, bound, bound, s->T1);
+    tcg_gen_setcond_tl(TCG_COND_LEU, cpu_cc_src, s->T1, bound);
-    tcg_gen_movi_tl(s->A0, -1);
-    tcg_gen_shl_tl(s->A0, s->A0, s->T1);
+    tcg_gen_shl_tl(s->A0, mone, s->T1);
+    tcg_gen_movcond_tl(TCG_COND_LEU, s->A0, s->T1, bound, s->A0, zero);
  tcg_gen_andc_tl(s->T0, s->T0, s->A0);
  gen_op_update1_cc(s);
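For reference, the architectural behaviour the patch restores can be modelled in plain C. `bzhi64` below is a hypothetical helper written for illustration (following the Intel SDM pseudocode), not QEMU code; the case fixed above is the `n >= 64` branch, which must pass the source through unchanged rather than shift by an out-of-range amount:

```c
#include <assert.h>
#include <stdint.h>

/* Reference model of 64-bit BZHI (zero high bits from index).
 * N is the low 8 bits of the second source; when N >= the operand
 * width, the destination is the unmodified source and CF is set.
 * Otherwise bits [63:N] are cleared and CF is cleared. */
static uint64_t bzhi64(uint64_t src, uint64_t index, int *cf)
{
    unsigned n = index & 0xff;

    if (n >= 64) {
        /* The case the patch fixes: never shift by >= operand size. */
        *cf = 1;
        return src;
    }
    *cf = 0;
    return src & ~(~0ULL << n);
}
```

With this model, `bzhi64(mask, 0x40, &cf)` returns `mask` untouched, which is exactly what the new test case in test-i386-bmi2.c asserts.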







[PATCH v2 18/28] target/cris: Don't use tcg_temp_local_new

2023-02-22 Thread Richard Henderson
Since tcg_temp_new is now identical, use that.

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 target/cris/translate.c |  6 +++---
 target/cris/translate_v10.c.inc | 10 +++++-----
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/target/cris/translate.c b/target/cris/translate.c
index 905d01288e..a959b27373 100644
--- a/target/cris/translate.c
+++ b/target/cris/translate.c
@@ -1621,7 +1621,7 @@ static int dec_bound_r(CPUCRISState *env, DisasContext 
*dc)
 LOG_DIS("bound.%c $r%u, $r%u\n",
 memsize_char(size), dc->op1, dc->op2);
 cris_cc_mask(dc, CC_MASK_NZ);
-l0 = tcg_temp_local_new();
+l0 = tcg_temp_new();
 dec_prep_move_r(dc, dc->op1, dc->op2, size, 0, l0);
 cris_alu(dc, CC_OP_BOUND, cpu_R[dc->op2], cpu_R[dc->op2], l0, 4);
 tcg_temp_free(l0);
@@ -2404,8 +2404,8 @@ static int dec_bound_m(CPUCRISState *env, DisasContext 
*dc)
 dc->op1, dc->postinc ? "+]" : "]",
 dc->op2);
 
-l[0] = tcg_temp_local_new();
-l[1] = tcg_temp_local_new();
+l[0] = tcg_temp_new();
+l[1] = tcg_temp_new();
 insn_len = dec_prep_alu_m(env, dc, 0, memsize, l[0], l[1]);
 cris_cc_mask(dc, CC_MASK_NZ);
 cris_alu(dc, CC_OP_BOUND, cpu_R[dc->op2], l[0], l[1], 4);
diff --git a/target/cris/translate_v10.c.inc b/target/cris/translate_v10.c.inc
index f500e93447..9660f28584 100644
--- a/target/cris/translate_v10.c.inc
+++ b/target/cris/translate_v10.c.inc
@@ -68,9 +68,9 @@ static void gen_store_v10_conditional(DisasContext *dc, TCGv 
addr, TCGv val,
unsigned int size, int mem_index)
 {
 TCGLabel *l1 = gen_new_label();
-TCGv taddr = tcg_temp_local_new();
-TCGv tval = tcg_temp_local_new();
-TCGv t1 = tcg_temp_local_new();
+TCGv taddr = tcg_temp_new();
+TCGv tval = tcg_temp_new();
+TCGv t1 = tcg_temp_new();
 dc->postinc = 0;
 cris_evaluate_flags(dc);
 
@@ -434,7 +434,7 @@ static void dec10_reg_bound(DisasContext *dc, int size)
 {
 TCGv t;
 
-t = tcg_temp_local_new();
+t = tcg_temp_new();
 t_gen_zext(t, cpu_R[dc->src], size);
 cris_alu(dc, CC_OP_BOUND, cpu_R[dc->dst], cpu_R[dc->dst], t, 4);
 tcg_temp_free(t);
@@ -935,7 +935,7 @@ static int dec10_ind_bound(CPUCRISState *env, DisasContext 
*dc,
 int rd = dc->dst;
 TCGv t;
 
-t = tcg_temp_local_new();
+t = tcg_temp_new();
 insn_len += dec10_prep_move_m(env, dc, 0, size, t);
 cris_alu(dc, CC_OP_BOUND, cpu_R[dc->dst], cpu_R[rd], t, 4);
 if (dc->dst == 15) {
-- 
2.34.1




[PATCH v2 21/28] target/i386: Don't use tcg_temp_local_new

2023-02-22 Thread Richard Henderson
Since tcg_temp_new is now identical, use that.
In some cases we can avoid a copy from A0 or T0.

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 target/i386/tcg/translate.c | 27 +++++++++------------------
 1 file changed, 9 insertions(+), 18 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index a47d60f057..baf1cfc2bc 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -3426,13 +3426,10 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 if (mod == 3) {
 goto illegal_op;
 }
-a0 = tcg_temp_local_new();
-t0 = tcg_temp_local_new();
+a0 = s->A0;
+t0 = s->T0;
 label1 = gen_new_label();
 
-tcg_gen_mov_tl(a0, s->A0);
-tcg_gen_mov_tl(t0, s->T0);
-
 gen_set_label(label1);
 t1 = tcg_temp_new();
 t2 = tcg_temp_new();
@@ -3444,9 +3441,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 tcg_gen_brcond_tl(TCG_COND_NE, t0, t2, label1);
 
 tcg_temp_free(t2);
-tcg_temp_free(a0);
 tcg_gen_neg_tl(s->T0, t0);
-tcg_temp_free(t0);
 } else {
 tcg_gen_neg_tl(s->T0, s->T0);
 if (mod != 3) {
@@ -6248,13 +6243,13 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 #endif
 {
 TCGLabel *label1;
-TCGv t0, t1, t2, a0;
+TCGv t0, t1, t2;
 
 if (!PE(s) || VM86(s))
 goto illegal_op;
-t0 = tcg_temp_local_new();
-t1 = tcg_temp_local_new();
-t2 = tcg_temp_local_new();
+t0 = tcg_temp_new();
+t1 = tcg_temp_new();
+t2 = tcg_temp_new();
 ot = MO_16;
 modrm = x86_ldub_code(env, s);
 reg = (modrm >> 3) & 7;
@@ -6263,11 +6258,8 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 if (mod != 3) {
 gen_lea_modrm(env, s, modrm);
 gen_op_ld_v(s, ot, t0, s->A0);
-a0 = tcg_temp_local_new();
-tcg_gen_mov_tl(a0, s->A0);
 } else {
 gen_op_mov_v_reg(s, ot, t0, rm);
-a0 = NULL;
 }
 gen_op_mov_v_reg(s, ot, t1, reg);
 tcg_gen_andi_tl(s->tmp0, t0, 3);
@@ -6280,8 +6272,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 tcg_gen_movi_tl(t2, CC_Z);
 gen_set_label(label1);
 if (mod != 3) {
-gen_op_st_v(s, ot, t0, a0);
-tcg_temp_free(a0);
+gen_op_st_v(s, ot, t0, s->A0);
} else {
 gen_op_mov_reg_v(s, ot, rm, t0);
 }
@@ -6304,7 +6295,7 @@ static bool disas_insn(DisasContext *s, CPUState *cpu)
 modrm = x86_ldub_code(env, s);
 reg = ((modrm >> 3) & 7) | REX_R(s);
 gen_ldst_modrm(env, s, modrm, MO_16, OR_TMP0, 0);
-t0 = tcg_temp_local_new();
+t0 = tcg_temp_new();
 gen_update_cc_op(s);
 if (b == 0x102) {
 gen_helper_lar(t0, cpu_env, s->T0);
@@ -7052,7 +7043,7 @@ static void i386_tr_init_disas_context(DisasContextBase 
*dcbase, CPUState *cpu)
 dc->tmp2_i32 = tcg_temp_new_i32();
 dc->tmp3_i32 = tcg_temp_new_i32();
 dc->tmp4 = tcg_temp_new();
-dc->cc_srcT = tcg_temp_local_new();
+dc->cc_srcT = tcg_temp_new();
 }
 
 static void i386_tr_tb_start(DisasContextBase *db, CPUState *cpu)
-- 
2.34.1




[PATCH v2 12/28] accel/tcg/plugin: Use tcg_temp_ebb_*

2023-02-22 Thread Richard Henderson
All of these uses have quite local scope.
Avoid tcg_const_*, because we haven't added a corresponding
interface for TEMP_EBB.  Use explicit tcg_gen_movi_* instead.

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 accel/tcg/plugin-gen.c | 24 ++++++++++++++----------
 1 file changed, 14 insertions(+), 10 deletions(-)

diff --git a/accel/tcg/plugin-gen.c b/accel/tcg/plugin-gen.c
index 17a686bd9e..9b793ac62c 100644
--- a/accel/tcg/plugin-gen.c
+++ b/accel/tcg/plugin-gen.c
@@ -93,11 +93,13 @@ void HELPER(plugin_vcpu_mem_cb)(unsigned int vcpu_index,
 
 static void do_gen_mem_cb(TCGv vaddr, uint32_t info)
 {
-TCGv_i32 cpu_index = tcg_temp_new_i32();
-TCGv_i32 meminfo = tcg_const_i32(info);
-TCGv_i64 vaddr64 = tcg_temp_new_i64();
-TCGv_ptr udata = tcg_const_ptr(NULL);
+TCGv_i32 cpu_index = tcg_temp_ebb_new_i32();
+TCGv_i32 meminfo = tcg_temp_ebb_new_i32();
+TCGv_i64 vaddr64 = tcg_temp_ebb_new_i64();
+TCGv_ptr udata = tcg_temp_ebb_new_ptr();
 
+tcg_gen_movi_i32(meminfo, info);
+tcg_gen_movi_ptr(udata, 0);
 tcg_gen_ld_i32(cpu_index, cpu_env,
-offsetof(ArchCPU, env) + offsetof(CPUState, cpu_index));
 tcg_gen_extu_tl_i64(vaddr64, vaddr);
@@ -112,9 +114,10 @@ static void do_gen_mem_cb(TCGv vaddr, uint32_t info)
 
 static void gen_empty_udata_cb(void)
 {
-TCGv_i32 cpu_index = tcg_temp_new_i32();
-TCGv_ptr udata = tcg_const_ptr(NULL); /* will be overwritten later */
+TCGv_i32 cpu_index = tcg_temp_ebb_new_i32();
+TCGv_ptr udata = tcg_temp_ebb_new_ptr();
 
+tcg_gen_movi_ptr(udata, 0);
 tcg_gen_ld_i32(cpu_index, cpu_env,
-offsetof(ArchCPU, env) + offsetof(CPUState, cpu_index));
 gen_helper_plugin_vcpu_udata_cb(cpu_index, udata);
@@ -129,9 +132,10 @@ static void gen_empty_udata_cb(void)
  */
 static void gen_empty_inline_cb(void)
 {
-TCGv_i64 val = tcg_temp_new_i64();
-TCGv_ptr ptr = tcg_const_ptr(NULL); /* overwritten later */
+TCGv_i64 val = tcg_temp_ebb_new_i64();
+TCGv_ptr ptr = tcg_temp_ebb_new_ptr();
 
+tcg_gen_movi_ptr(ptr, 0);
 tcg_gen_ld_i64(val, ptr, 0);
 /* pass an immediate != 0 so that it doesn't get optimized away */
 tcg_gen_addi_i64(val, val, 0xdeadface);
@@ -151,9 +155,9 @@ static void gen_empty_mem_cb(TCGv addr, uint32_t info)
  */
 static void gen_empty_mem_helper(void)
 {
-TCGv_ptr ptr;
+TCGv_ptr ptr = tcg_temp_ebb_new_ptr();
 
-ptr = tcg_const_ptr(NULL);
+tcg_gen_movi_ptr(ptr, 0);
 tcg_gen_st_ptr(ptr, cpu_env, offsetof(CPUState, plugin_mem_cbs) -
  offsetof(ArchCPU, env));
 tcg_temp_free_ptr(ptr);
-- 
2.34.1




[PATCH v2 27/28] tcg: Update docs/devel/tcg-ops.rst for temporary changes

2023-02-22 Thread Richard Henderson
Rewrite the sections which talked about 'local temporaries'.
Remove some assumptions which no longer hold.

Signed-off-by: Richard Henderson 
---
 docs/devel/tcg-ops.rst | 103 +
 1 file changed, 54 insertions(+), 49 deletions(-)

diff --git a/docs/devel/tcg-ops.rst b/docs/devel/tcg-ops.rst
index 9adc0c9b6c..53b7b6c93b 100644
--- a/docs/devel/tcg-ops.rst
+++ b/docs/devel/tcg-ops.rst
@@ -29,21 +29,42 @@ In this document, we use *guest* to specify what 
architecture we are
 emulating; *target* always means the TCG target, the machine on which
 we are running QEMU.
 
-A TCG *function* corresponds to a QEMU Translated Block (TB).
+A TCG *basic block* is a single entry, multiple exit region which
+corresponds to a list of instructions terminated by a label, or
+any branch instruction.
 
-A TCG *temporary* is a variable only live in a basic block. Temporaries are 
allocated explicitly in each function.
+A TCG *extended basic block* is a single entry, multiple exit region
+which corresponds to a list of instructions terminated by a label or
+an unconditional branch.  Specifically, an extended basic block is
+a sequence of basic blocks connected by the fall-through paths of
+zero or more conditional branch instructions.
 
-A TCG *local temporary* is a variable only live in a function. Local 
temporaries are allocated explicitly in each function.
+There is one TCG *fixed global* (``TEMP_FIXED``) variable, ``cpu_env``
+which is live in all translation blocks, and holds a pointer to 
``CPUArchState``.
+This fixed global is held in a host cpu register at all times in all
+translation blocks.
 
-A TCG *global* is a variable which is live in all the functions
-(equivalent of a C global variable). They are defined before the
-functions defined. A TCG global can be a memory location (e.g. a QEMU
-CPU register), a fixed host register (e.g. the QEMU CPU state pointer)
-or a memory location which is stored in a register outside QEMU TBs
-(not implemented yet).
+A TCG *global* (``TEMP_GLOBAL``) is a variable which is live in all
+translation blocks, and correspond to memory locations that are within
+``CPUArchState``.  These may be specified as an offset from ``cpu_env``,
+in which case they are called *direct globals*, or may be specified as
+an offset from a direct global, in which case they are called
+*indirect globals*.  Even indirect globals should still reference memory
+within ``CPUArchState``.  All TCG globals are defined during
+``TCGCPUOps.initialize``, before any translation blocks are generated.
 
-A TCG *basic block* corresponds to a list of instructions terminated
-by a branch instruction.
+A TCG *constant* (``TEMP_CONST``) is a variable which is live throughout
+the entire translation block, and contains a constant value.
+These temporaries are allocated explicitly during translation and are
+hashed so that there is exactly one variable holding a given value.
+
+A TCG *translation block temporary* (``TEMP_TB``) is a variable which is
+live throughout the entire translation block, but dies on any exit.
+These temporaries are allocated explicitly during translation.
+
+A TCG *extended basic block temporary* (``TEMP_EBB``) is a variable which
+is live throughout an extended basic block, but dies on any exit.
+These temporaries are allocated explicitly during translation.
 
 An operation with *undefined behavior* may result in a crash.
 
@@ -57,11 +78,11 @@ Intermediate representation
 Introduction
 
 
-TCG instructions operate on variables which are temporaries, local
-temporaries or globals. TCG instructions and variables are strongly
-typed. Two types are supported: 32 bit integers and 64 bit
-integers. Pointers are defined as an alias to 32 bit or 64 bit
-integers depending on the TCG target word size.
+TCG instructions operate on variables which are temporaries.
+TCG instructions and variables are strongly typed.
+Two types are supported: 32 bit integers and 64 bit integers.
+Pointers are defined as an alias to 32 bit or 64 bit integers
+depending on the TCG target word size.
 
 Each instruction has a fixed number of output variable operands, input
 variable operands and always constant operands.
@@ -81,17 +102,19 @@ included in the instruction name. Constants are prefixed 
with a '$'.
 Assumptions
 ---
 
-Basic blocks
+Basic Blocks
 
 
-* Basic blocks end after branches (e.g. brcond_i32 instruction),
-  goto_tb and exit_tb instructions.
+* Basic blocks end after conditional branches (e.g. brcond_i32),
+  br, goto_tb, exit_tb, goto_ptr, set_label instructions,
+  and calls that are defined to not return (``TCG_CALL_NO_RETURN``).
 
-* Basic blocks start after the end of a previous basic block, or at a
-  set_label instruction.
+* Basic blocks start after the end of a previous basic block,
+  or at a set_label instruction.
 
-After the end of a basic block, the content of temporaries is
destroyed, but local temporaries and globals are preserved.

[PATCH v2 04/28] tcg: Remove branch-to-next regardless of reference count

2023-02-22 Thread Richard Henderson
Just because the label reference count is more than 1 does
not mean we cannot remove a branch-to-next.  By doing this
first, the label reference count may drop to 0, and then
the label itself gets removed as before.
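The rule being moved can be sketched as a small stand-alone pass over a toy op list (hypothetical types and names, for illustration only): when a label is reached, a directly preceding unconditional branch to that same label is a no-op and can be dropped, regardless of how many other references the label has.

```c
#include <assert.h>

/* Toy IR: just enough to show branch-to-next elimination. */
enum { OP_BR, OP_SET_LABEL, OP_OTHER };

typedef struct {
    int opc;
    int label;   /* branch target / label id; unused for OP_OTHER */
} Op;

/* Compact the op list in place, dropping any unconditional branch that
 * immediately precedes its own target label; returns the new length.
 * Assumes dead code between branch and label was already removed,
 * which is why the real pass does this while processing set_label. */
static int remove_branch_to_next(Op *ops, int n)
{
    int out = 0;

    for (int i = 0; i < n; i++) {
        if (ops[i].opc == OP_SET_LABEL && out > 0 &&
            ops[out - 1].opc == OP_BR &&
            ops[out - 1].label == ops[i].label) {
            out--;   /* the branch falls through to its label: drop it */
        }
        ops[out++] = ops[i];
    }
    return out;
}
```

Note the check never consults a reference count at all; if dropping the branch leaves the label with zero references, the label itself is then removed separately, as the commit message describes.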

Signed-off-by: Richard Henderson 
---
 tcg/tcg.c | 33 +++++++++++++++++----------------
 1 file changed, 17 insertions(+), 16 deletions(-)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index 06209e6160..0992fb4f31 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -2638,7 +2638,7 @@ TCGOp *tcg_op_insert_after(TCGContext *s, TCGOp *old_op,
 /* Reachable analysis : remove unreachable code.  */
 static void reachable_code_pass(TCGContext *s)
 {
-TCGOp *op, *op_next;
+TCGOp *op, *op_next, *op_prev;
 bool dead = false;
 
 QTAILQ_FOREACH_SAFE(op, &s->ops, link, op_next) {
@@ -2648,6 +2648,22 @@ static void reachable_code_pass(TCGContext *s)
 switch (op->opc) {
 case INDEX_op_set_label:
 label = arg_label(op->args[0]);
+
+/*
+ * Optimization can fold conditional branches to unconditional.
+ * If we find a label which is preceded by an unconditional
+ * branch to next, remove the branch.  We couldn't do this when
+ * processing the branch because any dead code between the branch
+ * and label had not yet been removed.
+ */
+op_prev = QTAILQ_PREV(op, link);
+if (op_prev->opc == INDEX_op_br &&
+label == arg_label(op_prev->args[0])) {
+tcg_op_remove(s, op_prev);
+/* Fall through means insns become live again.  */
+dead = false;
+}
+
 if (label->refs == 0) {
 /*
  * While there is an occasional backward branch, virtually
@@ -2661,21 +2677,6 @@ static void reachable_code_pass(TCGContext *s)
 /* Once we see a label, insns become live again.  */
 dead = false;
 remove = false;
-
-/*
- * Optimization can fold conditional branches to unconditional.
- * If we find a label with one reference which is preceded by
- * an unconditional branch to it, remove both.  This needed to
- * wait until the dead code in between them was removed.
- */
-if (label->refs == 1) {
-TCGOp *op_prev = QTAILQ_PREV(op, link);
-if (op_prev->opc == INDEX_op_br &&
-label == arg_label(op_prev->args[0])) {
-tcg_op_remove(s, op_prev);
-remove = true;
-}
-}
 }
 break;
 
-- 
2.34.1




[PATCH v2 25/28] exec/gen-icount: Don't use tcg_temp_local_new_i32

2023-02-22 Thread Richard Henderson
Since tcg_temp_new_i32 is now identical, use that.

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 include/exec/gen-icount.h | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)

diff --git a/include/exec/gen-icount.h b/include/exec/gen-icount.h
index c57204ddad..21a1bff8b7 100644
--- a/include/exec/gen-icount.h
+++ b/include/exec/gen-icount.h
@@ -19,13 +19,7 @@ static inline void gen_io_start(void)
 
 static inline void gen_tb_start(const TranslationBlock *tb)
 {
-TCGv_i32 count;
-
-if (tb_cflags(tb) & CF_USE_ICOUNT) {
-count = tcg_temp_local_new_i32();
-} else {
-count = tcg_temp_new_i32();
-}
+TCGv_i32 count = tcg_temp_new_i32();
 
 tcg_gen_ld_i32(count, cpu_env,
offsetof(ArchCPU, neg.icount_decr.u32) -
-- 
2.34.1




[PATCH v2 26/28] tcg: Remove tcg_temp_local_new_*, tcg_const_local_*

2023-02-22 Thread Richard Henderson
These symbols are now unused.

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 include/tcg/tcg-op.h |  2 --
 include/tcg/tcg.h    | 28 ----------------------------
 tcg/tcg.c            | 16 ----------------
 3 files changed, 46 deletions(-)

diff --git a/include/tcg/tcg-op.h b/include/tcg/tcg-op.h
index 66b1461caa..353d430a63 100644
--- a/include/tcg/tcg-op.h
+++ b/include/tcg/tcg-op.h
@@ -828,14 +828,12 @@ static inline void tcg_gen_plugin_cb_end(void)
 #if TARGET_LONG_BITS == 32
 #define tcg_temp_new() tcg_temp_new_i32()
 #define tcg_global_mem_new tcg_global_mem_new_i32
-#define tcg_temp_local_new() tcg_temp_local_new_i32()
 #define tcg_temp_free tcg_temp_free_i32
 #define tcg_gen_qemu_ld_tl tcg_gen_qemu_ld_i32
 #define tcg_gen_qemu_st_tl tcg_gen_qemu_st_i32
 #else
 #define tcg_temp_new() tcg_temp_new_i64()
 #define tcg_global_mem_new tcg_global_mem_new_i64
-#define tcg_temp_local_new() tcg_temp_local_new_i64()
 #define tcg_temp_free tcg_temp_free_i64
 #define tcg_gen_qemu_ld_tl tcg_gen_qemu_ld_i64
 #define tcg_gen_qemu_st_tl tcg_gen_qemu_st_i64
diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index 2e220d4040..7e2b954dbc 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -905,12 +905,6 @@ static inline TCGv_i32 tcg_temp_new_i32(void)
 return temp_tcgv_i32(t);
 }
 
-static inline TCGv_i32 tcg_temp_local_new_i32(void)
-{
-TCGTemp *t = tcg_temp_new_internal(TCG_TYPE_I32, TEMP_TB);
-return temp_tcgv_i32(t);
-}
-
 static inline TCGv_i64 tcg_global_mem_new_i64(TCGv_ptr reg, intptr_t offset,
   const char *name)
 {
@@ -931,12 +925,6 @@ static inline TCGv_i64 tcg_temp_new_i64(void)
 return temp_tcgv_i64(t);
 }
 
-static inline TCGv_i64 tcg_temp_local_new_i64(void)
-{
-TCGTemp *t = tcg_temp_new_internal(TCG_TYPE_I64, TEMP_TB);
-return temp_tcgv_i64(t);
-}
-
 /* Used only by tcg infrastructure: tcg-op.c or plugin-gen.c */
 static inline TCGv_i128 tcg_temp_ebb_new_i128(void)
 {
@@ -950,12 +938,6 @@ static inline TCGv_i128 tcg_temp_new_i128(void)
 return temp_tcgv_i128(t);
 }
 
-static inline TCGv_i128 tcg_temp_local_new_i128(void)
-{
-TCGTemp *t = tcg_temp_new_internal(TCG_TYPE_I128, TEMP_TB);
-return temp_tcgv_i128(t);
-}
-
 static inline TCGv_ptr tcg_global_mem_new_ptr(TCGv_ptr reg, intptr_t offset,
   const char *name)
 {
@@ -976,12 +958,6 @@ static inline TCGv_ptr tcg_temp_new_ptr(void)
 return temp_tcgv_ptr(t);
 }
 
-static inline TCGv_ptr tcg_temp_local_new_ptr(void)
-{
-TCGTemp *t = tcg_temp_new_internal(TCG_TYPE_PTR, TEMP_TB);
-return temp_tcgv_ptr(t);
-}
-
 #if defined(CONFIG_DEBUG_TCG)
 /* If you call tcg_clear_temp_count() at the start of a section of
  * code which is not supposed to leak any TCG temporaries, then
@@ -1084,8 +1060,6 @@ void tcg_optimize(TCGContext *s);
 /* Allocate a new temporary and initialize it with a constant. */
 TCGv_i32 tcg_const_i32(int32_t val);
 TCGv_i64 tcg_const_i64(int64_t val);
-TCGv_i32 tcg_const_local_i32(int32_t val);
-TCGv_i64 tcg_const_local_i64(int64_t val);
 TCGv_vec tcg_const_zeros_vec(TCGType);
 TCGv_vec tcg_const_ones_vec(TCGType);
 TCGv_vec tcg_const_zeros_vec_matching(TCGv_vec);
@@ -1113,11 +1087,9 @@ TCGv_vec tcg_constant_vec_matching(TCGv_vec match, 
unsigned vece, int64_t val);
 
 #if UINTPTR_MAX == UINT32_MAX
 # define tcg_const_ptr(x)((TCGv_ptr)tcg_const_i32((intptr_t)(x)))
-# define tcg_const_local_ptr(x)  ((TCGv_ptr)tcg_const_local_i32((intptr_t)(x)))
 # define tcg_constant_ptr(x) ((TCGv_ptr)tcg_constant_i32((intptr_t)(x)))
 #else
 # define tcg_const_ptr(x)((TCGv_ptr)tcg_const_i64((intptr_t)(x)))
-# define tcg_const_local_ptr(x)  ((TCGv_ptr)tcg_const_local_i64((intptr_t)(x)))
 # define tcg_constant_ptr(x) ((TCGv_ptr)tcg_constant_i64((intptr_t)(x)))
 #endif
 
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 9f1b042ecd..4b244eebc2 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -1476,22 +1476,6 @@ TCGv_i64 tcg_const_i64(int64_t val)
 return t0;
 }
 
-TCGv_i32 tcg_const_local_i32(int32_t val)
-{
-TCGv_i32 t0;
-t0 = tcg_temp_local_new_i32();
-tcg_gen_movi_i32(t0, val);
-return t0;
-}
-
-TCGv_i64 tcg_const_local_i64(int64_t val)
-{
-TCGv_i64 t0;
-t0 = tcg_temp_local_new_i64();
-tcg_gen_movi_i64(t0, val);
-return t0;
-}
-
 #if defined(CONFIG_DEBUG_TCG)
 void tcg_clear_temp_count(void)
 {
-- 
2.34.1




[PATCH v2 19/28] target/hexagon: Don't use tcg_temp_local_new_*

2023-02-22 Thread Richard Henderson
Since tcg_temp_new_* is now identical, use those.

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 target/hexagon/idef-parser/README.rst   |  4 ++--
 target/hexagon/gen_tcg.h|  4 ++--
 target/hexagon/genptr.c | 16 ++++++++--------
 target/hexagon/idef-parser/parser-helpers.c |  4 ++--
 target/hexagon/translate.c  |  2 +-
 target/hexagon/README   |  8 
 target/hexagon/gen_tcg_funcs.py | 18 +++---
 7 files changed, 26 insertions(+), 30 deletions(-)

diff --git a/target/hexagon/idef-parser/README.rst 
b/target/hexagon/idef-parser/README.rst
index ff6d14150a..c230fec124 100644
--- a/target/hexagon/idef-parser/README.rst
+++ b/target/hexagon/idef-parser/README.rst
@@ -294,9 +294,9 @@ generators the previous declarations are mapped to
 
 ::
 
-int var1;   ->  TCGv_i32 var1 = tcg_temp_local_new_i32();
+int var1;   ->  TCGv_i32 var1 = tcg_temp_new_i32();
 
-int var2 = 0;   ->  TCGv_i32 var1 = tcg_temp_local_new_i32();
+int var2 = 0;   ->  TCGv_i32 var1 = tcg_temp_new_i32();
 tcg_gen_movi_i32(j, ((int64_t) 0ULL));
 
 which are later automatically freed at the end of the function they're declared
diff --git a/target/hexagon/gen_tcg.h b/target/hexagon/gen_tcg.h
index 19697b42a5..a219a7f5dd 100644
--- a/target/hexagon/gen_tcg.h
+++ b/target/hexagon/gen_tcg.h
@@ -337,7 +337,7 @@
  */
 #define fGEN_TCG_PRED_LOAD(GET_EA, PRED, SIZE, SIGN) \
 do { \
-TCGv LSB = tcg_temp_local_new(); \
+TCGv LSB = tcg_temp_new(); \
 TCGLabel *label = gen_new_label(); \
 tcg_gen_movi_tl(EA, 0); \
 PRED;  \
@@ -397,7 +397,7 @@
 /* Predicated loads into a register pair */
 #define fGEN_TCG_PRED_LOAD_PAIR(GET_EA, PRED) \
 do { \
-TCGv LSB = tcg_temp_local_new(); \
+TCGv LSB = tcg_temp_new(); \
 TCGLabel *label = gen_new_label(); \
 tcg_gen_movi_tl(EA, 0); \
 PRED;  \
diff --git a/target/hexagon/genptr.c b/target/hexagon/genptr.c
index 90db99024f..591461b043 100644
--- a/target/hexagon/genptr.c
+++ b/target/hexagon/genptr.c
@@ -706,7 +706,7 @@ static void gen_cond_call(DisasContext *ctx, TCGv pred,
   TCGCond cond, int pc_off)
 {
 TCGv next_PC;
-TCGv lsb = tcg_temp_local_new();
+TCGv lsb = tcg_temp_new();
 TCGLabel *skip = gen_new_label();
 tcg_gen_andi_tl(lsb, pred, 1);
 gen_write_new_pc_pcrel(ctx, pc_off, cond, lsb);
@@ -720,7 +720,7 @@ static void gen_cond_call(DisasContext *ctx, TCGv pred,
 
 static void gen_endloop0(DisasContext *ctx)
 {
-TCGv lpcfg = tcg_temp_local_new();
+TCGv lpcfg = tcg_temp_new();
 
 GET_USR_FIELD(USR_LPCFG, lpcfg);
 
@@ -852,7 +852,7 @@ static void gen_sar(TCGv dst, TCGv src, TCGv shift_amt)
 /* Bidirectional shift right with saturation */
 static void gen_asr_r_r_sat(TCGv RdV, TCGv RsV, TCGv RtV)
 {
-TCGv shift_amt = tcg_temp_local_new();
+TCGv shift_amt = tcg_temp_new();
 TCGLabel *positive = gen_new_label();
 TCGLabel *done = gen_new_label();
 
@@ -876,7 +876,7 @@ static void gen_asr_r_r_sat(TCGv RdV, TCGv RsV, TCGv RtV)
 /* Bidirectional shift left with saturation */
 static void gen_asl_r_r_sat(TCGv RdV, TCGv RsV, TCGv RtV)
 {
-TCGv shift_amt = tcg_temp_local_new();
+TCGv shift_amt = tcg_temp_new();
 TCGLabel *positive = gen_new_label();
 TCGLabel *done = gen_new_label();
 
@@ -918,7 +918,7 @@ static void gen_log_vreg_write(DisasContext *ctx, intptr_t 
srcoff, int num,
 intptr_t dstoff;
 
 if (is_predicated) {
-TCGv cancelled = tcg_temp_local_new();
+TCGv cancelled = tcg_temp_new();
 label_end = gen_new_label();
 
 /* Don't do anything if the slot was cancelled */
@@ -959,7 +959,7 @@ static void gen_log_qreg_write(intptr_t srcoff, int num, 
int vnew,
 intptr_t dstoff;
 
 if (is_predicated) {
-TCGv cancelled = tcg_temp_local_new();
+TCGv cancelled = tcg_temp_new();
 label_end = gen_new_label();
 
 /* Don't do anything if the slot was cancelled */
@@ -1164,10 +1164,10 @@ void gen_satu_i64_ovfl(TCGv ovfl, TCGv_i64 dest, 
TCGv_i64 source, int width)
 /* Implements the fADDSAT64 macro in TCG */
 void gen_add_sat_i64(TCGv_i64 ret, TCGv_i64 a, TCGv_i64 b)
 {
-TCGv_i64 sum = tcg_temp_local_new_i64();
+TCGv_i64 sum = tcg_temp_new_i64();
 TCGv_i64 xor = tcg_temp_new_i64();
 TCGv_i64 cond1 = tcg_temp_new_i64();
-TCGv_i64 cond2 = tcg_temp_local_new_i64();
+TCGv_i64 cond2 = tcg_temp_new_i64();
 TCGv_i64 cond3 = tcg_temp_new_i64();
 TCGv_i64 mask = tcg_constant_i64(0x8000ULL);
 TCGv_i64 max_pos = tcg_constant_i64(0x7FFFLL);
diff --git a/target/hexagon/idef-parser/parser-helpers.c 
b/target/hexagon/idef-parser/parser-helpers.c
index 8110686c51..dfb9c65b52 100644
--- a/target/hexagon/idef-parser/parser-helpers.c

[PATCH v2 05/28] tcg: Rename TEMP_LOCAL to TEMP_TB

2023-02-22 Thread Richard Henderson
Use TEMP_TB as that is more explicit about the default
lifetime of the data.  While "global" and "local" used
to be contrasting, we have more lifetimes than that now.

Do not yet rename tcg_temp_local_new_*, just the enum.

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 include/tcg/tcg.h | 12 ++++++++----
 tcg/optimize.c    |  2 +-
 tcg/tcg.c         | 18 +++++++++---------
 3 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index 59854f95b1..2010e746ca 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -433,11 +433,15 @@ typedef enum TCGTempVal {
 typedef enum TCGTempKind {
 /* Temp is dead at the end of all basic blocks. */
 TEMP_NORMAL,
-/* Temp is live across conditional branch, but dead otherwise. */
+/*
+ * Temp is dead at the end of the extended basic block (EBB),
+ * the single-entry multiple-exit region that falls through
+ * conditional branches.
+ */
 TEMP_EBB,
-/* Temp is saved across basic blocks but dead at the end of TBs. */
-TEMP_LOCAL,
-/* Temp is saved across both basic blocks and translation blocks. */
+/* Temp is live across the entire translation block, but dead at end. */
+TEMP_TB,
+/* Temp is live across the entire translation block, and between them. */
 TEMP_GLOBAL,
 /* Temp is in a fixed register. */
 TEMP_FIXED,
diff --git a/tcg/optimize.c b/tcg/optimize.c
index 763bca9ea6..ce05989c39 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -190,7 +190,7 @@ static TCGTemp *find_better_copy(TCGContext *s, TCGTemp *ts)
 } else if (i->kind > ts->kind) {
 if (i->kind == TEMP_GLOBAL) {
 g = i;
-} else if (i->kind == TEMP_LOCAL) {
+} else if (i->kind == TEMP_TB) {
 l = i;
 }
 }
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 0992fb4f31..bf2af8b0fe 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -1258,7 +1258,7 @@ TCGTemp *tcg_global_mem_new_internal(TCGType type, 
TCGv_ptr base,
 TCGTemp *tcg_temp_new_internal(TCGType type, bool temp_local)
 {
 TCGContext *s = tcg_ctx;
-TCGTempKind kind = temp_local ? TEMP_LOCAL : TEMP_NORMAL;
+TCGTempKind kind = temp_local ? TEMP_TB : TEMP_NORMAL;
 TCGTemp *ts;
 int idx, k;
 
@@ -1369,7 +1369,7 @@ void tcg_temp_free_internal(TCGTemp *ts)
  */
 return;
 case TEMP_NORMAL:
-case TEMP_LOCAL:
+case TEMP_TB:
 break;
 default:
 g_assert_not_reached();
@@ -1915,7 +1915,7 @@ static void tcg_reg_alloc_start(TCGContext *s)
 case TEMP_EBB:
 val = TEMP_VAL_DEAD;
 /* fall through */
-case TEMP_LOCAL:
+case TEMP_TB:
 ts->mem_allocated = 0;
 break;
 default:
@@ -1937,7 +1937,7 @@ static char *tcg_get_arg_str_ptr(TCGContext *s, char 
*buf, int buf_size,
 case TEMP_GLOBAL:
 pstrcpy(buf, buf_size, ts->name);
 break;
-case TEMP_LOCAL:
+case TEMP_TB:
 snprintf(buf, buf_size, "loc%d", idx - s->nb_globals);
 break;
 case TEMP_EBB:
@@ -2759,7 +2759,7 @@ static void la_bb_end(TCGContext *s, int ng, int nt)
 switch (ts->kind) {
 case TEMP_FIXED:
 case TEMP_GLOBAL:
-case TEMP_LOCAL:
+case TEMP_TB:
 state = TS_DEAD | TS_MEM;
 break;
 case TEMP_NORMAL:
@@ -2804,7 +2804,7 @@ static void la_bb_sync(TCGContext *s, int ng, int nt)
 int state;
 
 switch (ts->kind) {
-case TEMP_LOCAL:
+case TEMP_TB:
 state = ts->state;
 ts->state = state | TS_MEM;
 if (state != TS_DEAD) {
@@ -3497,7 +3497,7 @@ static void temp_free_or_dead(TCGContext *s, TCGTemp *ts, 
int free_or_dead)
 case TEMP_FIXED:
 return;
 case TEMP_GLOBAL:
-case TEMP_LOCAL:
+case TEMP_TB:
 new_type = TEMP_VAL_MEM;
 break;
 case TEMP_NORMAL:
@@ -3785,7 +3785,7 @@ static void tcg_reg_alloc_bb_end(TCGContext *s, TCGRegSet 
allocated_regs)
 TCGTemp *ts = &s->temps[i];
 
 switch (ts->kind) {
-case TEMP_LOCAL:
+case TEMP_TB:
 temp_save(s, ts, allocated_regs);
 break;
 case TEMP_NORMAL:
@@ -3822,7 +3822,7 @@ static void tcg_reg_alloc_cbranch(TCGContext *s, 
TCGRegSet allocated_regs)
  * Keep tcg_debug_asserts for safety.
  */
 switch (ts->kind) {
-case TEMP_LOCAL:
+case TEMP_TB:
 tcg_debug_assert(ts->val_type != TEMP_VAL_REG || ts->mem_coherent);
 break;
 case TEMP_NORMAL:
-- 
2.34.1




[PATCH v2 23/28] target/ppc: Don't use tcg_temp_local_new

2023-02-22 Thread Richard Henderson
Since tcg_temp_new is now identical, use that.

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 target/ppc/translate.c  | 6 +++---
 target/ppc/translate/spe-impl.c.inc | 8 ++++----
 target/ppc/translate/vmx-impl.c.inc | 4 ++--
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index 5fe6aa641e..2956021e89 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -4415,7 +4415,7 @@ static void gen_bcond(DisasContext *ctx, int type)
 TCGv target;
 
 if (type == BCOND_LR || type == BCOND_CTR || type == BCOND_TAR) {
-target = tcg_temp_local_new();
+target = tcg_temp_new();
 if (type == BCOND_CTR) {
 tcg_gen_mov_tl(target, cpu_ctr);
 } else if (type == BCOND_TAR) {
@@ -5594,8 +5594,8 @@ static inline void gen_405_mulladd_insn(DisasContext 
*ctx, int opc2, int opc3,
 {
 TCGv t0, t1;
 
-t0 = tcg_temp_local_new();
-t1 = tcg_temp_local_new();
+t0 = tcg_temp_new();
+t1 = tcg_temp_new();
 
 switch (opc3 & 0x0D) {
 case 0x05:
diff --git a/target/ppc/translate/spe-impl.c.inc 
b/target/ppc/translate/spe-impl.c.inc
index 2e6e799a25..bd8963db2b 100644
--- a/target/ppc/translate/spe-impl.c.inc
+++ b/target/ppc/translate/spe-impl.c.inc
@@ -168,7 +168,7 @@ static inline void gen_op_evsrwu(TCGv_i32 ret, TCGv_i32 
arg1, TCGv_i32 arg2)
 {
 TCGLabel *l1 = gen_new_label();
 TCGLabel *l2 = gen_new_label();
-TCGv_i32 t0 = tcg_temp_local_new_i32();
+TCGv_i32 t0 = tcg_temp_new_i32();
 
 /* No error here: 6 bits are used */
 tcg_gen_andi_i32(t0, arg2, 0x3F);
@@ -185,7 +185,7 @@ static inline void gen_op_evsrws(TCGv_i32 ret, TCGv_i32 
arg1, TCGv_i32 arg2)
 {
 TCGLabel *l1 = gen_new_label();
 TCGLabel *l2 = gen_new_label();
-TCGv_i32 t0 = tcg_temp_local_new_i32();
+TCGv_i32 t0 = tcg_temp_new_i32();
 
 /* No error here: 6 bits are used */
 tcg_gen_andi_i32(t0, arg2, 0x3F);
@@ -202,7 +202,7 @@ static inline void gen_op_evslw(TCGv_i32 ret, TCGv_i32 
arg1, TCGv_i32 arg2)
 {
 TCGLabel *l1 = gen_new_label();
 TCGLabel *l2 = gen_new_label();
-TCGv_i32 t0 = tcg_temp_local_new_i32();
+TCGv_i32 t0 = tcg_temp_new_i32();
 
 /* No error here: 6 bits are used */
 tcg_gen_andi_i32(t0, arg2, 0x3F);
@@ -378,7 +378,7 @@ static inline void gen_evsel(DisasContext *ctx)
 TCGLabel *l2 = gen_new_label();
 TCGLabel *l3 = gen_new_label();
 TCGLabel *l4 = gen_new_label();
-TCGv_i32 t0 = tcg_temp_local_new_i32();
+TCGv_i32 t0 = tcg_temp_new_i32();
 
 tcg_gen_andi_i32(t0, cpu_crf[ctx->opcode & 0x07], 1 << 3);
 tcg_gen_brcondi_i32(TCG_COND_EQ, t0, 0, l1);
diff --git a/target/ppc/translate/vmx-impl.c.inc 
b/target/ppc/translate/vmx-impl.c.inc
index 7741f2eb49..2dd17ab106 100644
--- a/target/ppc/translate/vmx-impl.c.inc
+++ b/target/ppc/translate/vmx-impl.c.inc
@@ -1508,8 +1508,8 @@ static bool do_vcmpq(DisasContext *ctx, arg_VX_bf *a, 
bool sign)
 REQUIRE_INSNS_FLAGS2(ctx, ISA310);
 REQUIRE_VECTOR(ctx);
 
-vra = tcg_temp_local_new_i64();
-vrb = tcg_temp_local_new_i64();
+vra = tcg_temp_new_i64();
+vrb = tcg_temp_new_i64();
 gt = gen_new_label();
 lt = gen_new_label();
 done = gen_new_label();
-- 
2.34.1




[PATCH v2 03/28] accel/tcg: Use more accurate max_insns for tb_overflow

2023-02-22 Thread Richard Henderson
Write back the number of insns that we attempt to translate,
so that if we longjmp out we have a more accurate limit for
the next attempt.  This results in fewer restarts when part of
the limit is consumed by only a few instructions.

Signed-off-by: Richard Henderson 
---
 accel/tcg/translator.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index fac1e8c465..62e8f28025 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -78,7 +78,7 @@ void translator_loop(CPUState *cpu, TranslationBlock *tb, int *max_insns,
 plugin_enabled = plugin_gen_tb_start(cpu, db, cflags & CF_MEMI_ONLY);
 
 while (true) {
-db->num_insns++;
+*max_insns = ++db->num_insns;
 ops->insn_start(db, cpu);
 tcg_debug_assert(db->is_jmp == DISAS_NEXT);  /* no early exit */
 
-- 
2.34.1




[PATCH v2 22/28] target/mips: Don't use tcg_temp_local_new

2023-02-22 Thread Richard Henderson
Since tcg_temp_new is now identical to tcg_temp_local_new, use that.

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 target/mips/tcg/translate.c  | 57 ++--
 target/mips/tcg/nanomips_translate.c.inc |  4 +-
 2 files changed, 16 insertions(+), 45 deletions(-)

diff --git a/target/mips/tcg/translate.c b/target/mips/tcg/translate.c
index bd70fcad25..8cad3d15a0 100644
--- a/target/mips/tcg/translate.c
+++ b/target/mips/tcg/translate.c
@@ -2400,7 +2400,7 @@ static void gen_arith_imm(DisasContext *ctx, uint32_t opc,
 switch (opc) {
 case OPC_ADDI:
 {
-TCGv t0 = tcg_temp_local_new();
+TCGv t0 = tcg_temp_new();
 TCGv t1 = tcg_temp_new();
 TCGv t2 = tcg_temp_new();
 TCGLabel *l1 = gen_new_label();
@@ -2434,7 +2434,7 @@ static void gen_arith_imm(DisasContext *ctx, uint32_t opc,
 #if defined(TARGET_MIPS64)
 case OPC_DADDI:
 {
-TCGv t0 = tcg_temp_local_new();
+TCGv t0 = tcg_temp_new();
 TCGv t1 = tcg_temp_new();
 TCGv t2 = tcg_temp_new();
 TCGLabel *l1 = gen_new_label();
@@ -2630,7 +2630,7 @@ static void gen_arith(DisasContext *ctx, uint32_t opc,
 switch (opc) {
 case OPC_ADD:
 {
-TCGv t0 = tcg_temp_local_new();
+TCGv t0 = tcg_temp_new();
 TCGv t1 = tcg_temp_new();
 TCGv t2 = tcg_temp_new();
 TCGLabel *l1 = gen_new_label();
@@ -2666,7 +2666,7 @@ static void gen_arith(DisasContext *ctx, uint32_t opc,
 break;
 case OPC_SUB:
 {
-TCGv t0 = tcg_temp_local_new();
+TCGv t0 = tcg_temp_new();
 TCGv t1 = tcg_temp_new();
 TCGv t2 = tcg_temp_new();
 TCGLabel *l1 = gen_new_label();
@@ -2707,7 +2707,7 @@ static void gen_arith(DisasContext *ctx, uint32_t opc,
 #if defined(TARGET_MIPS64)
 case OPC_DADD:
 {
-TCGv t0 = tcg_temp_local_new();
+TCGv t0 = tcg_temp_new();
 TCGv t1 = tcg_temp_new();
 TCGv t2 = tcg_temp_new();
 TCGLabel *l1 = gen_new_label();
@@ -2741,7 +2741,7 @@ static void gen_arith(DisasContext *ctx, uint32_t opc,
 break;
 case OPC_DSUB:
 {
-TCGv t0 = tcg_temp_local_new();
+TCGv t0 = tcg_temp_new();
 TCGv t1 = tcg_temp_new();
 TCGv t2 = tcg_temp_new();
 TCGLabel *l1 = gen_new_label();
@@ -3759,26 +3759,8 @@ static void gen_loongson_integer(DisasContext *ctx, uint32_t opc,
 return;
 }
 
-switch (opc) {
-case OPC_MULT_G_2E:
-case OPC_MULT_G_2F:
-case OPC_MULTU_G_2E:
-case OPC_MULTU_G_2F:
-#if defined(TARGET_MIPS64)
-case OPC_DMULT_G_2E:
-case OPC_DMULT_G_2F:
-case OPC_DMULTU_G_2E:
-case OPC_DMULTU_G_2F:
-#endif
-t0 = tcg_temp_new();
-t1 = tcg_temp_new();
-break;
-default:
-t0 = tcg_temp_local_new();
-t1 = tcg_temp_local_new();
-break;
-}
-
+t0 = tcg_temp_new();
+t1 = tcg_temp_new();
 gen_load_gpr(t0, rs);
 gen_load_gpr(t1, rt);
 
@@ -3955,21 +3937,10 @@ static void gen_loongson_multimedia(DisasContext *ctx, int rd, int rs, int rt)
 TCGCond cond;
 
 opc = MASK_LMMI(ctx->opcode);
-switch (opc) {
-case OPC_ADD_CP2:
-case OPC_SUB_CP2:
-case OPC_DADD_CP2:
-case OPC_DSUB_CP2:
-t0 = tcg_temp_local_new_i64();
-t1 = tcg_temp_local_new_i64();
-break;
-default:
-t0 = tcg_temp_new_i64();
-t1 = tcg_temp_new_i64();
-break;
-}
-
 check_cp1_enabled(ctx);
+
+t0 = tcg_temp_new_i64();
+t1 = tcg_temp_new_i64();
 gen_load_fpr64(ctx, t0, rs);
 gen_load_fpr64(ctx, t1, rt);
 
@@ -8650,7 +8621,7 @@ static void gen_mftr(CPUMIPSState *env, DisasContext *ctx, int rt, int rd,
  int u, int sel, int h)
 {
 int other_tc = env->CP0_VPEControl & (0xff << CP0VPECo_TargTC);
-TCGv t0 = tcg_temp_local_new();
+TCGv t0 = tcg_temp_new();
 
 if ((env->CP0_VPEConf0 & (1 << CP0VPEC0_MVP)) == 0 &&
 ((env->tcs[other_tc].CP0_TCBind & (0xf << CP0TCBd_CurVPE)) !=
@@ -8878,7 +8849,7 @@ static void gen_mttr(CPUMIPSState *env, DisasContext *ctx, int rd, int rt,
  int u, int sel, int h)
 {
 int other_tc = env->CP0_VPEControl & (0xff << CP0VPECo_TargTC);
-TCGv t0 = tcg_temp_local_new();
+TCGv t0 = tcg_temp_new();
 
 gen_load_gpr(t0, rt);
 if ((env->CP0_VPEConf0 & (1 << CP0VPEC0_MVP)) == 0 &&
@@ -11409,7 +11380,7 @@ static void gen_flt3_arith(DisasContext *ctx, uint32_t opc,
 case OPC_ALNV_PS:
 check_ps(ctx);
 {
-TCGv t0 = tcg_temp_local_new();
+TCGv t0 = tcg_temp_new();
 TCGv_i32 fp = tcg_temp_new_i32();
 TCGv_i32 fph = tcg_temp_new_i32();
 TCGLabel *l1 = gen_new_label();
diff --git a/target/mips/tcg/nanomi

[PATCH v2 28/28] tcg: Use noinline for major tcg_gen_code_subroutines

2023-02-22 Thread Richard Henderson
This makes it easier to assign blame with perf.

Signed-off-by: Richard Henderson 
---
 tcg/tcg.c | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index 4b244eebc2..b65f2ffdbe 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -2619,7 +2619,8 @@ TCGOp *tcg_op_insert_after(TCGContext *s, TCGOp *old_op,
 }
 
 /* Reachable analysis : remove unreachable code.  */
-static void reachable_code_pass(TCGContext *s)
+static void __attribute__((noinline))
+reachable_code_pass(TCGContext *s)
 {
 TCGOp *op, *op_next, *op_prev;
 bool dead = false;
@@ -2840,7 +2841,8 @@ static void la_cross_call(TCGContext *s, int nt)
  * Liveness analysis: Verify the lifetime of TEMP_TB, and reduce
  * to TEMP_EBB, if possible.
  */
-static void liveness_pass_0(TCGContext *s)
+static void __attribute__((noinline))
+liveness_pass_0(TCGContext *s)
 {
 void * const multiple_ebb = (void *)(uintptr_t)-1;
 int nb_temps = s->nb_temps;
@@ -2907,7 +2909,8 @@ static void liveness_pass_0(TCGContext *s)
 /* Liveness analysis : update the opc_arg_life array to tell if a
given input arguments is dead. Instructions updating dead
temporaries are removed. */
-static void liveness_pass_1(TCGContext *s)
+static void __attribute__((noinline))
+liveness_pass_1(TCGContext *s)
 {
 int nb_globals = s->nb_globals;
 int nb_temps = s->nb_temps;
@@ -3247,7 +3250,8 @@ static void liveness_pass_1(TCGContext *s)
 }
 
 /* Liveness analysis: Convert indirect regs to direct temporaries.  */
-static bool liveness_pass_2(TCGContext *s)
+static bool __attribute__((noinline))
+liveness_pass_2(TCGContext *s)
 {
 int nb_globals = s->nb_globals;
 int nb_temps, i;
-- 
2.34.1




[PATCH v2 16/28] target/arm: Drop copies in gen_sve_{ldr,str}

2023-02-22 Thread Richard Henderson
Since we now get TEMP_TB temporaries by default, we no longer
need to make copies across these loops.  These were the only
uses of new_tmp_a64_local(), so remove that as well.

Signed-off-by: Richard Henderson 
---
 target/arm/translate-a64.h |  1 -
 target/arm/translate-a64.c |  6 --
 target/arm/translate-sve.c | 32 
 3 files changed, 39 deletions(-)

diff --git a/target/arm/translate-a64.h b/target/arm/translate-a64.h
index ad3762d1ac..ca24c39dbe 100644
--- a/target/arm/translate-a64.h
+++ b/target/arm/translate-a64.h
@@ -19,7 +19,6 @@
 #define TARGET_ARM_TRANSLATE_A64_H
 
 TCGv_i64 new_tmp_a64(DisasContext *s);
-TCGv_i64 new_tmp_a64_local(DisasContext *s);
 TCGv_i64 new_tmp_a64_zero(DisasContext *s);
 TCGv_i64 cpu_reg(DisasContext *s, int reg);
 TCGv_i64 cpu_reg_sp(DisasContext *s, int reg);
diff --git a/target/arm/translate-a64.c b/target/arm/translate-a64.c
index da9f877476..300248a0ad 100644
--- a/target/arm/translate-a64.c
+++ b/target/arm/translate-a64.c
@@ -436,12 +436,6 @@ TCGv_i64 new_tmp_a64(DisasContext *s)
 return s->tmp_a64[s->tmp_a64_count++] = tcg_temp_new_i64();
 }
 
-TCGv_i64 new_tmp_a64_local(DisasContext *s)
-{
-assert(s->tmp_a64_count < TMP_A64_MAX);
-return s->tmp_a64[s->tmp_a64_count++] = tcg_temp_local_new_i64();
-}
-
 TCGv_i64 new_tmp_a64_zero(DisasContext *s)
 {
 TCGv_i64 t = new_tmp_a64(s);
diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 621a2abb22..02150d93e8 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -4344,17 +4344,6 @@ void gen_sve_ldr(DisasContext *s, TCGv_ptr base, int vofs,
 TCGLabel *loop = gen_new_label();
 TCGv_ptr tp, i = tcg_const_local_ptr(0);
 
-/* Copy the clean address into a local temp, live across the loop. */
-t0 = clean_addr;
-clean_addr = new_tmp_a64_local(s);
-tcg_gen_mov_i64(clean_addr, t0);
-
-if (base != cpu_env) {
-TCGv_ptr b = tcg_temp_local_new_ptr();
-tcg_gen_mov_ptr(b, base);
-base = b;
-}
-
 gen_set_label(loop);
 
 t0 = tcg_temp_new_i64();
@@ -4370,11 +4359,6 @@ void gen_sve_ldr(DisasContext *s, TCGv_ptr base, int vofs,
 
 tcg_gen_brcondi_ptr(TCG_COND_LTU, i, len_align, loop);
 tcg_temp_free_ptr(i);
-
-if (base != cpu_env) {
-tcg_temp_free_ptr(base);
-assert(len_remain == 0);
-}
 }
 
 /*
@@ -4445,17 +4429,6 @@ void gen_sve_str(DisasContext *s, TCGv_ptr base, int vofs,
 TCGLabel *loop = gen_new_label();
 TCGv_ptr tp, i = tcg_const_local_ptr(0);
 
-/* Copy the clean address into a local temp, live across the loop. */
-t0 = clean_addr;
-clean_addr = new_tmp_a64_local(s);
-tcg_gen_mov_i64(clean_addr, t0);
-
-if (base != cpu_env) {
-TCGv_ptr b = tcg_temp_local_new_ptr();
-tcg_gen_mov_ptr(b, base);
-base = b;
-}
-
 gen_set_label(loop);
 
 t0 = tcg_temp_new_i64();
@@ -4471,11 +,6 @@ void gen_sve_str(DisasContext *s, TCGv_ptr base, int vofs,
 
 tcg_gen_brcondi_ptr(TCG_COND_LTU, i, len_align, loop);
 tcg_temp_free_ptr(i);
-
-if (base != cpu_env) {
-tcg_temp_free_ptr(base);
-assert(len_remain == 0);
-}
 }
 
 /* Predicate register stores can be any multiple of 2.  */
-- 
2.34.1




[PATCH v2 24/28] target/xtensa: Don't use tcg_temp_local_new_*

2023-02-22 Thread Richard Henderson
Since tcg_temp_new_* is now identical to tcg_temp_local_new_*, use those.

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 target/xtensa/translate.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/target/xtensa/translate.c b/target/xtensa/translate.c
index 8d7bf566de..4af0650deb 100644
--- a/target/xtensa/translate.c
+++ b/target/xtensa/translate.c
@@ -307,7 +307,7 @@ static void gen_right_shift_sar(DisasContext *dc, TCGv_i32 sa)
 static void gen_left_shift_sar(DisasContext *dc, TCGv_i32 sa)
 {
 if (!dc->sar_m32_allocated) {
-dc->sar_m32 = tcg_temp_local_new_i32();
+dc->sar_m32 = tcg_temp_new_i32();
 dc->sar_m32_allocated = true;
 }
 tcg_gen_andi_i32(dc->sar_m32, sa, 0x1f);
@@ -1074,10 +1074,10 @@ static void disas_xtensa_insn(CPUXtensaState *env, DisasContext *dc)
 if (i == 0 || arg_copy[i].resource != resource) {
 resource = arg_copy[i].resource;
 if (arg_copy[i].arg->num_bits <= 32) {
-temp = tcg_temp_local_new_i32();
+temp = tcg_temp_new_i32();
 tcg_gen_mov_i32(temp, arg_copy[i].arg->in);
 } else if (arg_copy[i].arg->num_bits <= 64) {
-temp = tcg_temp_local_new_i64();
+temp = tcg_temp_new_i64();
 tcg_gen_mov_i64(temp, arg_copy[i].arg->in);
 } else {
 g_assert_not_reached();
@@ -1187,7 +1187,7 @@ static void xtensa_tr_tb_start(DisasContextBase *dcbase, CPUState *cpu)
 DisasContext *dc = container_of(dcbase, DisasContext, base);
 
 if (dc->icount) {
-dc->next_icount = tcg_temp_local_new_i32();
+dc->next_icount = tcg_temp_new_i32();
 }
 }
 
@@ -2273,8 +2273,8 @@ static void gen_check_atomctl(DisasContext *dc, TCGv_i32 addr)
 static void translate_s32c1i(DisasContext *dc, const OpcodeArg arg[],
  const uint32_t par[])
 {
-TCGv_i32 tmp = tcg_temp_local_new_i32();
-TCGv_i32 addr = tcg_temp_local_new_i32();
+TCGv_i32 tmp = tcg_temp_new_i32();
+TCGv_i32 addr = tcg_temp_new_i32();
 MemOp mop;
 
 tcg_gen_mov_i32(tmp, arg[0].in);
@@ -2303,8 +2303,8 @@ static void translate_s32ex(DisasContext *dc, const OpcodeArg arg[],
 const uint32_t par[])
 {
 TCGv_i32 prev = tcg_temp_new_i32();
-TCGv_i32 addr = tcg_temp_local_new_i32();
-TCGv_i32 res = tcg_temp_local_new_i32();
+TCGv_i32 addr = tcg_temp_new_i32();
+TCGv_i32 res = tcg_temp_new_i32();
 TCGLabel *label = gen_new_label();
 MemOp mop;
 
-- 
2.34.1




[PATCH v2 15/28] tcg: Change default temp lifetime to TEMP_TB

2023-02-22 Thread Richard Henderson
Guest front-ends now get temps that span the lifetime of
the translation block by default, which avoids accidentally
using the temp across branches and invalidating the data.
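The hazard that the new default removes can be modelled with a toy value tracker. This is an illustrative sketch, not TCG's actual data structures (the `Temp`, `movi`, `set_label`, and `load` names are invented):

```c
#include <assert.h>

enum { TEMP_EBB, TEMP_TB };

typedef struct {
    int kind;
    int val;
    int valid;
} Temp;

/* Writing a value marks the temp valid. */
static void movi(Temp *t, int v)
{
    t->val = v;
    t->valid = 1;
}

/*
 * Crossing a branch target ends the extended basic block: an
 * EBB-lifetime temp loses its value there, while a TB-lifetime
 * temp keeps it for the rest of the translation block.
 */
static void set_label(Temp *t)
{
    if (t->kind == TEMP_EBB) {
        t->valid = 0;
    }
}

/* Reading an invalidated temp yields -1, standing in for garbage. */
static int load(const Temp *t)
{
    return t->valid ? t->val : -1;
}
```

With TEMP_TB as the front-end default, code that sets a temp before a branch and reads it after the label keeps working without an explicit "local" temp.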

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 include/tcg/tcg.h | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index 6cc6758cd6..2e220d4040 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -901,7 +901,7 @@ static inline TCGv_i32 tcg_temp_ebb_new_i32(void)
 
 static inline TCGv_i32 tcg_temp_new_i32(void)
 {
-TCGTemp *t = tcg_temp_new_internal(TCG_TYPE_I32, TEMP_EBB);
+TCGTemp *t = tcg_temp_new_internal(TCG_TYPE_I32, TEMP_TB);
 return temp_tcgv_i32(t);
 }
 
@@ -927,7 +927,7 @@ static inline TCGv_i64 tcg_temp_ebb_new_i64(void)
 
 static inline TCGv_i64 tcg_temp_new_i64(void)
 {
-TCGTemp *t = tcg_temp_new_internal(TCG_TYPE_I64, TEMP_EBB);
+TCGTemp *t = tcg_temp_new_internal(TCG_TYPE_I64, TEMP_TB);
 return temp_tcgv_i64(t);
 }
 
@@ -946,7 +946,7 @@ static inline TCGv_i128 tcg_temp_ebb_new_i128(void)
 
 static inline TCGv_i128 tcg_temp_new_i128(void)
 {
-TCGTemp *t = tcg_temp_new_internal(TCG_TYPE_I128, TEMP_EBB);
+TCGTemp *t = tcg_temp_new_internal(TCG_TYPE_I128, TEMP_TB);
 return temp_tcgv_i128(t);
 }
 
@@ -972,7 +972,7 @@ static inline TCGv_ptr tcg_temp_ebb_new_ptr(void)
 
 static inline TCGv_ptr tcg_temp_new_ptr(void)
 {
-TCGTemp *t = tcg_temp_new_internal(TCG_TYPE_PTR, TEMP_EBB);
+TCGTemp *t = tcg_temp_new_internal(TCG_TYPE_PTR, TEMP_TB);
 return temp_tcgv_ptr(t);
 }
 
-- 
2.34.1




[PATCH v2 10/28] tcg: Add tcg_gen_movi_ptr

2023-02-22 Thread Richard Henderson
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 include/tcg/tcg-op.h | 5 +
 1 file changed, 5 insertions(+)

diff --git a/include/tcg/tcg-op.h b/include/tcg/tcg-op.h
index 839d91c0c7..66b1461caa 100644
--- a/include/tcg/tcg-op.h
+++ b/include/tcg/tcg-op.h
@@ -1285,6 +1285,11 @@ static inline void tcg_gen_mov_ptr(TCGv_ptr d, TCGv_ptr s)
 glue(tcg_gen_mov_,PTR)((NAT)d, (NAT)s);
 }
 
+static inline void tcg_gen_movi_ptr(TCGv_ptr d, intptr_t s)
+{
+glue(tcg_gen_movi_,PTR)((NAT)d, s);
+}
+
 static inline void tcg_gen_brcondi_ptr(TCGCond cond, TCGv_ptr a,
intptr_t b, TCGLabel *label)
 {
-- 
2.34.1




[PATCH v2 17/28] target/arm: Don't use tcg_temp_local_new_*

2023-02-22 Thread Richard Henderson
Since tcg_temp_new_* is now identical to tcg_temp_local_new_*, use those.

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 target/arm/translate-sve.c | 6 +++---
 target/arm/translate.c | 6 +++---
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
index 02150d93e8..718a5bce1b 100644
--- a/target/arm/translate-sve.c
+++ b/target/arm/translate-sve.c
@@ -2694,7 +2694,7 @@ static bool do_clast_vector(DisasContext *s, arg_rprr_esz *a, bool before)
 return true;
 }
 
-last = tcg_temp_local_new_i32();
+last = tcg_temp_new_i32();
 over = gen_new_label();
 
 find_last_active(s, last, esz, a->pg);
@@ -4342,7 +4342,7 @@ void gen_sve_ldr(DisasContext *s, TCGv_ptr base, int vofs,
 tcg_temp_free_i64(t0);
 } else {
 TCGLabel *loop = gen_new_label();
-TCGv_ptr tp, i = tcg_const_local_ptr(0);
+TCGv_ptr tp, i = tcg_const_ptr(0);
 
 gen_set_label(loop);
 
@@ -4427,7 +4427,7 @@ void gen_sve_str(DisasContext *s, TCGv_ptr base, int vofs,
 tcg_temp_free_i64(t0);
 } else {
 TCGLabel *loop = gen_new_label();
-TCGv_ptr tp, i = tcg_const_local_ptr(0);
+TCGv_ptr tp, i = tcg_const_ptr(0);
 
 gen_set_label(loop);
 
diff --git a/target/arm/translate.c b/target/arm/translate.c
index 92955d505c..9c8e1ac04c 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -7136,7 +7136,7 @@ static bool op_strex(DisasContext *s, arg_STREX *a, MemOp mop, bool rel)
 tcg_gen_mb(TCG_MO_ALL | TCG_BAR_STRL);
 }
 
-addr = tcg_temp_local_new_i32();
+addr = tcg_temp_new_i32();
 load_reg_var(s, addr, a->rn);
 tcg_gen_addi_i32(addr, addr, a->imm);
 
@@ -7289,7 +7289,7 @@ static bool op_ldrex(DisasContext *s, arg_LDREX *a, MemOp mop, bool acq)
 return true;
 }
 
-addr = tcg_temp_local_new_i32();
+addr = tcg_temp_new_i32();
 load_reg_var(s, addr, a->rn);
 tcg_gen_addi_i32(addr, addr, a->imm);
 
@@ -8696,7 +8696,7 @@ static bool trans_LE(DisasContext *s, arg_LE *a)
  * Decrement by 1 << (4 - LTPSIZE). We need to use a TCG local
  * so that decr stays live after the brcondi.
  */
-TCGv_i32 decr = tcg_temp_local_new_i32();
+TCGv_i32 decr = tcg_temp_new_i32();
 TCGv_i32 ltpsize = load_cpu_field(v7m.ltpsize);
 tcg_gen_sub_i32(decr, tcg_constant_i32(4), ltpsize);
 tcg_gen_shl_i32(decr, tcg_constant_i32(1), decr);
-- 
2.34.1




[PATCH v2 11/28] tcg: Use tcg_temp_ebb_new_* in tcg/

2023-02-22 Thread Richard Henderson
All of these have obvious and quite local scope.

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 tcg/tcg-op-gvec.c | 270 +++---
 tcg/tcg-op.c  | 258 ++--
 tcg/tcg.c |   2 +-
 3 files changed, 265 insertions(+), 265 deletions(-)

diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c
index 079a761b04..d895011d6b 100644
--- a/tcg/tcg-op-gvec.c
+++ b/tcg/tcg-op-gvec.c
@@ -117,8 +117,8 @@ void tcg_gen_gvec_2_ool(uint32_t dofs, uint32_t aofs,
 TCGv_ptr a0, a1;
 TCGv_i32 desc = tcg_constant_i32(simd_desc(oprsz, maxsz, data));
 
-a0 = tcg_temp_new_ptr();
-a1 = tcg_temp_new_ptr();
+a0 = tcg_temp_ebb_new_ptr();
+a1 = tcg_temp_ebb_new_ptr();
 
 tcg_gen_addi_ptr(a0, cpu_env, dofs);
 tcg_gen_addi_ptr(a1, cpu_env, aofs);
@@ -138,8 +138,8 @@ void tcg_gen_gvec_2i_ool(uint32_t dofs, uint32_t aofs, TCGv_i64 c,
 TCGv_ptr a0, a1;
 TCGv_i32 desc = tcg_constant_i32(simd_desc(oprsz, maxsz, data));
 
-a0 = tcg_temp_new_ptr();
-a1 = tcg_temp_new_ptr();
+a0 = tcg_temp_ebb_new_ptr();
+a1 = tcg_temp_ebb_new_ptr();
 
 tcg_gen_addi_ptr(a0, cpu_env, dofs);
 tcg_gen_addi_ptr(a1, cpu_env, aofs);
@@ -158,9 +158,9 @@ void tcg_gen_gvec_3_ool(uint32_t dofs, uint32_t aofs, uint32_t bofs,
 TCGv_ptr a0, a1, a2;
 TCGv_i32 desc = tcg_constant_i32(simd_desc(oprsz, maxsz, data));
 
-a0 = tcg_temp_new_ptr();
-a1 = tcg_temp_new_ptr();
-a2 = tcg_temp_new_ptr();
+a0 = tcg_temp_ebb_new_ptr();
+a1 = tcg_temp_ebb_new_ptr();
+a2 = tcg_temp_ebb_new_ptr();
 
 tcg_gen_addi_ptr(a0, cpu_env, dofs);
 tcg_gen_addi_ptr(a1, cpu_env, aofs);
@@ -181,10 +181,10 @@ void tcg_gen_gvec_4_ool(uint32_t dofs, uint32_t aofs, uint32_t bofs,
 TCGv_ptr a0, a1, a2, a3;
 TCGv_i32 desc = tcg_constant_i32(simd_desc(oprsz, maxsz, data));
 
-a0 = tcg_temp_new_ptr();
-a1 = tcg_temp_new_ptr();
-a2 = tcg_temp_new_ptr();
-a3 = tcg_temp_new_ptr();
+a0 = tcg_temp_ebb_new_ptr();
+a1 = tcg_temp_ebb_new_ptr();
+a2 = tcg_temp_ebb_new_ptr();
+a3 = tcg_temp_ebb_new_ptr();
 
 tcg_gen_addi_ptr(a0, cpu_env, dofs);
 tcg_gen_addi_ptr(a1, cpu_env, aofs);
@@ -207,11 +207,11 @@ void tcg_gen_gvec_5_ool(uint32_t dofs, uint32_t aofs, uint32_t bofs,
 TCGv_ptr a0, a1, a2, a3, a4;
 TCGv_i32 desc = tcg_constant_i32(simd_desc(oprsz, maxsz, data));
 
-a0 = tcg_temp_new_ptr();
-a1 = tcg_temp_new_ptr();
-a2 = tcg_temp_new_ptr();
-a3 = tcg_temp_new_ptr();
-a4 = tcg_temp_new_ptr();
+a0 = tcg_temp_ebb_new_ptr();
+a1 = tcg_temp_ebb_new_ptr();
+a2 = tcg_temp_ebb_new_ptr();
+a3 = tcg_temp_ebb_new_ptr();
+a4 = tcg_temp_ebb_new_ptr();
 
 tcg_gen_addi_ptr(a0, cpu_env, dofs);
 tcg_gen_addi_ptr(a1, cpu_env, aofs);
@@ -237,8 +237,8 @@ void tcg_gen_gvec_2_ptr(uint32_t dofs, uint32_t aofs,
 TCGv_ptr a0, a1;
 TCGv_i32 desc = tcg_constant_i32(simd_desc(oprsz, maxsz, data));
 
-a0 = tcg_temp_new_ptr();
-a1 = tcg_temp_new_ptr();
+a0 = tcg_temp_ebb_new_ptr();
+a1 = tcg_temp_ebb_new_ptr();
 
 tcg_gen_addi_ptr(a0, cpu_env, dofs);
 tcg_gen_addi_ptr(a1, cpu_env, aofs);
@@ -258,9 +258,9 @@ void tcg_gen_gvec_3_ptr(uint32_t dofs, uint32_t aofs, uint32_t bofs,
 TCGv_ptr a0, a1, a2;
 TCGv_i32 desc = tcg_constant_i32(simd_desc(oprsz, maxsz, data));
 
-a0 = tcg_temp_new_ptr();
-a1 = tcg_temp_new_ptr();
-a2 = tcg_temp_new_ptr();
+a0 = tcg_temp_ebb_new_ptr();
+a1 = tcg_temp_ebb_new_ptr();
+a2 = tcg_temp_ebb_new_ptr();
 
 tcg_gen_addi_ptr(a0, cpu_env, dofs);
 tcg_gen_addi_ptr(a1, cpu_env, aofs);
@@ -283,10 +283,10 @@ void tcg_gen_gvec_4_ptr(uint32_t dofs, uint32_t aofs, uint32_t bofs,
 TCGv_ptr a0, a1, a2, a3;
 TCGv_i32 desc = tcg_constant_i32(simd_desc(oprsz, maxsz, data));
 
-a0 = tcg_temp_new_ptr();
-a1 = tcg_temp_new_ptr();
-a2 = tcg_temp_new_ptr();
-a3 = tcg_temp_new_ptr();
+a0 = tcg_temp_ebb_new_ptr();
+a1 = tcg_temp_ebb_new_ptr();
+a2 = tcg_temp_ebb_new_ptr();
+a3 = tcg_temp_ebb_new_ptr();
 
 tcg_gen_addi_ptr(a0, cpu_env, dofs);
 tcg_gen_addi_ptr(a1, cpu_env, aofs);
@@ -311,11 +311,11 @@ void tcg_gen_gvec_5_ptr(uint32_t dofs, uint32_t aofs, uint32_t bofs,
 TCGv_ptr a0, a1, a2, a3, a4;
 TCGv_i32 desc = tcg_constant_i32(simd_desc(oprsz, maxsz, data));
 
-a0 = tcg_temp_new_ptr();
-a1 = tcg_temp_new_ptr();
-a2 = tcg_temp_new_ptr();
-a3 = tcg_temp_new_ptr();
-a4 = tcg_temp_new_ptr();
+a0 = tcg_temp_ebb_new_ptr();
+a1 = tcg_temp_ebb_new_ptr();
+a2 = tcg_temp_ebb_new_ptr();
+a3 = tcg_temp_ebb_new_ptr();
+a4 = tcg_temp_ebb_new_ptr();
 
 tcg_gen_addi_ptr(a0, cpu_env, dofs);
 tcg_gen_addi_ptr(a1, cpu_env, aofs);
@@ -576,16 +576,16 @@ static void do_dup(unsigned vece, uint32_t dofs, uint32_t oprsz,
be simple enough.  */
 if 

[PATCH v2 06/28] tcg: Add liveness_pass_0

2023-02-22 Thread Richard Henderson
Attempt to reduce the lifetime of TEMP_TB.
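The idea of the pass can be modelled compactly: record, for every use of a TEMP_TB temp, the EBB in which the use occurs; temps whose uses all fall in a single EBB are demoted. This is a hypothetical stand-alone sketch whose data layout (a flat `uses` array of temp-index/EBB-index pairs) is invented for illustration and differs from the real op-list walk:

```c
#include <assert.h>

enum { TEMP_EBB, TEMP_TB };

enum { EBB_UNUSED = -1, EBB_MULTIPLE = -2 };

typedef struct {
    int kind;
    int seen_ebb;   /* EBB_UNUSED, an EBB index, or EBB_MULTIPLE */
} Temp;

/*
 * Model of liveness_pass_0: uses[u] pairs a temp index with the EBB
 * in which that use occurs.  TEMP_TB temps whose uses all fall in a
 * single EBB are reduced to TEMP_EBB; temps seen in more than one
 * EBB keep TEMP_TB.
 */
static void liveness_pass_0(Temp *temps, int nb_temps,
                            int (*uses)[2], int nb_uses)
{
    for (int i = 0; i < nb_temps; i++) {
        temps[i].seen_ebb = EBB_UNUSED;
    }
    for (int u = 0; u < nb_uses; u++) {
        Temp *ts = &temps[uses[u][0]];
        int ebb = uses[u][1];

        if (ts->kind != TEMP_TB) {
            continue;
        }
        if (ts->seen_ebb == EBB_UNUSED) {
            ts->seen_ebb = ebb;
        } else if (ts->seen_ebb != ebb) {
            ts->seen_ebb = EBB_MULTIPLE;
        }
    }
    for (int i = 0; i < nb_temps; i++) {
        if (temps[i].kind == TEMP_TB && temps[i].seen_ebb != EBB_MULTIPLE) {
            temps[i].kind = TEMP_EBB;
        }
    }
}
```

Unused TEMP_TB temps are also demoted, which is harmless since they have no uses to invalidate.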

Signed-off-by: Richard Henderson 
---
 tcg/tcg.c | 69 +++
 1 file changed, 69 insertions(+)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index bf2af8b0fe..8d4ce7bd1e 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -2857,6 +2857,74 @@ static void la_cross_call(TCGContext *s, int nt)
 }
 }
 
+/*
+ * Liveness analysis: Verify the lifetime of TEMP_TB, and reduce
+ * to TEMP_EBB, if possible.
+ */
+static void liveness_pass_0(TCGContext *s)
+{
+void * const multiple_ebb = (void *)(uintptr_t)-1;
+int nb_temps = s->nb_temps;
+TCGOp *op, *ebb;
+
+for (int i = s->nb_globals; i < nb_temps; ++i) {
+s->temps[i].state_ptr = NULL;
+}
+
+/*
+ * Represent each EBB by the op at which it begins.  In the case of
+ * the first EBB, this is the first op, otherwise it is a label.
+ * Collect the uses of each TEMP_TB: NULL for unused, EBB for use
+ * within a single EBB, else MULTIPLE_EBB.
+ */
+ebb = QTAILQ_FIRST(&s->ops);
+QTAILQ_FOREACH(op, &s->ops, link) {
+const TCGOpDef *def;
+int nb_oargs, nb_iargs;
+
+switch (op->opc) {
+case INDEX_op_set_label:
+ebb = op;
+continue;
+case INDEX_op_discard:
+continue;
+case INDEX_op_call:
+nb_oargs = TCGOP_CALLO(op);
+nb_iargs = TCGOP_CALLI(op);
+break;
+default:
+def = &tcg_op_defs[op->opc];
+nb_oargs = def->nb_oargs;
+nb_iargs = def->nb_iargs;
+break;
+}
+
+for (int i = 0; i < nb_oargs + nb_iargs; ++i) {
+TCGTemp *ts = arg_temp(op->args[i]);
+
+if (ts->kind != TEMP_TB) {
+continue;
+}
+if (ts->state_ptr == NULL) {
+ts->state_ptr = ebb;
+} else if (ts->state_ptr != ebb) {
+ts->state_ptr = multiple_ebb;
+}
+}
+}
+
+/*
+ * For TEMP_TB that turned out not to be used beyond one EBB,
+ * reduce the liveness to TEMP_EBB.
+ */
+for (int i = s->nb_globals; i < nb_temps; ++i) {
+TCGTemp *ts = &s->temps[i];
+if (ts->kind == TEMP_TB && ts->state_ptr != multiple_ebb) {
+ts->kind = TEMP_EBB;
+}
+}
+}
+
 /* Liveness analysis : update the opc_arg_life array to tell if a
given input arguments is dead. Instructions updating dead
temporaries are removed. */
@@ -4870,6 +4938,7 @@ int tcg_gen_code(TCGContext *s, TranslationBlock *tb, target_ulong pc_start)
 #endif
 
 reachable_code_pass(s);
+liveness_pass_0(s);
 liveness_pass_1(s);
 
 if (s->nb_indirects > 0) {
-- 
2.34.1




[PATCH v2 07/28] tcg: Remove TEMP_NORMAL

2023-02-22 Thread Richard Henderson
TEMP_NORMAL is a subset of TEMP_EBB.  Promote single basic
block temps to single extended basic block temps.

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 include/tcg/tcg.h |  2 --
 tcg/tcg.c | 19 +++
 2 files changed, 3 insertions(+), 18 deletions(-)

diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index 2010e746ca..02d5cfc049 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -431,8 +431,6 @@ typedef enum TCGTempVal {
 } TCGTempVal;
 
 typedef enum TCGTempKind {
-/* Temp is dead at the end of all basic blocks. */
-TEMP_NORMAL,
 /*
  * Temp is dead at the end of the extended basic block (EBB),
  * the single-entry multiple-exit region that falls through
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 8d4ce7bd1e..f52e9baf83 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -1258,7 +1258,7 @@ TCGTemp *tcg_global_mem_new_internal(TCGType type, TCGv_ptr base,
 TCGTemp *tcg_temp_new_internal(TCGType type, bool temp_local)
 {
 TCGContext *s = tcg_ctx;
-TCGTempKind kind = temp_local ? TEMP_TB : TEMP_NORMAL;
+TCGTempKind kind = temp_local ? TEMP_TB : TEMP_EBB;
 TCGTemp *ts;
 int idx, k;
 
@@ -1368,7 +1368,7 @@ void tcg_temp_free_internal(TCGTemp *ts)
  * silently ignore free.
  */
 return;
-case TEMP_NORMAL:
+case TEMP_EBB:
 case TEMP_TB:
 break;
 default:
@@ -1384,7 +1384,7 @@ void tcg_temp_free_internal(TCGTemp *ts)
 #endif
 
 idx = temp_idx(ts);
-k = ts->base_type + (ts->kind == TEMP_NORMAL ? 0 : TCG_TYPE_COUNT);
+k = ts->base_type + (ts->kind == TEMP_EBB ? 0 : TCG_TYPE_COUNT);
 set_bit(idx, s->free_temps[k].l);
 }
 
@@ -1911,7 +1911,6 @@ static void tcg_reg_alloc_start(TCGContext *s)
 break;
 case TEMP_GLOBAL:
 break;
-case TEMP_NORMAL:
 case TEMP_EBB:
 val = TEMP_VAL_DEAD;
 /* fall through */
@@ -1941,9 +1940,6 @@ static char *tcg_get_arg_str_ptr(TCGContext *s, char *buf, int buf_size,
 snprintf(buf, buf_size, "loc%d", idx - s->nb_globals);
 break;
 case TEMP_EBB:
-snprintf(buf, buf_size, "ebb%d", idx - s->nb_globals);
-break;
-case TEMP_NORMAL:
 snprintf(buf, buf_size, "tmp%d", idx - s->nb_globals);
 break;
 case TEMP_CONST:
@@ -2762,7 +2758,6 @@ static void la_bb_end(TCGContext *s, int ng, int nt)
 case TEMP_TB:
 state = TS_DEAD | TS_MEM;
 break;
-case TEMP_NORMAL:
 case TEMP_EBB:
 case TEMP_CONST:
 state = TS_DEAD;
@@ -2811,9 +2806,6 @@ static void la_bb_sync(TCGContext *s, int ng, int nt)
 continue;
 }
 break;
-case TEMP_NORMAL:
-s->temps[i].state = TS_DEAD;
-break;
 case TEMP_EBB:
 case TEMP_CONST:
 continue;
@@ -3568,7 +3560,6 @@ static void temp_free_or_dead(TCGContext *s, TCGTemp *ts, int free_or_dead)
 case TEMP_TB:
 new_type = TEMP_VAL_MEM;
 break;
-case TEMP_NORMAL:
 case TEMP_EBB:
 new_type = free_or_dead < 0 ? TEMP_VAL_MEM : TEMP_VAL_DEAD;
 break;
@@ -3856,7 +3847,6 @@ static void tcg_reg_alloc_bb_end(TCGContext *s, TCGRegSet allocated_regs)
 case TEMP_TB:
 temp_save(s, ts, allocated_regs);
 break;
-case TEMP_NORMAL:
 case TEMP_EBB:
 /* The liveness analysis already ensures that temps are dead.
Keep an tcg_debug_assert for safety. */
@@ -3893,9 +3883,6 @@ static void tcg_reg_alloc_cbranch(TCGContext *s, TCGRegSet allocated_regs)
 case TEMP_TB:
 tcg_debug_assert(ts->val_type != TEMP_VAL_REG || ts->mem_coherent);
 break;
-case TEMP_NORMAL:
-tcg_debug_assert(ts->val_type == TEMP_VAL_DEAD);
-break;
 case TEMP_EBB:
 case TEMP_CONST:
 break;
-- 
2.34.1




[PATCH v2 02/28] accel/tcg: Pass max_insn to gen_intermediate_code by pointer

2023-02-22 Thread Richard Henderson
In preparation for returning the number of insns generated
via the same pointer.  Adjust only the prototypes so far.

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 include/exec/translator.h | 4 ++--
 accel/tcg/translate-all.c | 2 +-
 accel/tcg/translator.c| 4 ++--
 target/alpha/translate.c  | 2 +-
 target/arm/translate.c| 2 +-
 target/avr/translate.c| 2 +-
 target/cris/translate.c   | 2 +-
 target/hexagon/translate.c| 2 +-
 target/hppa/translate.c   | 2 +-
 target/i386/tcg/translate.c   | 2 +-
 target/loongarch/translate.c  | 2 +-
 target/m68k/translate.c   | 2 +-
 target/microblaze/translate.c | 2 +-
 target/mips/tcg/translate.c   | 2 +-
 target/nios2/translate.c  | 2 +-
 target/openrisc/translate.c   | 2 +-
 target/ppc/translate.c| 2 +-
 target/riscv/translate.c  | 2 +-
 target/rx/translate.c | 2 +-
 target/s390x/tcg/translate.c  | 2 +-
 target/sh4/translate.c| 2 +-
 target/sparc/translate.c  | 2 +-
 target/tricore/translate.c| 2 +-
 target/xtensa/translate.c | 2 +-
 24 files changed, 26 insertions(+), 26 deletions(-)

diff --git a/include/exec/translator.h b/include/exec/translator.h
index af2ff95cd5..8b36690e80 100644
--- a/include/exec/translator.h
+++ b/include/exec/translator.h
@@ -37,7 +37,7 @@
  * This function must be provided by the target, which should create
  * the target-specific DisasContext, and then invoke translator_loop.
  */
-void gen_intermediate_code(CPUState *cpu, TranslationBlock *tb, int max_insns,
+void gen_intermediate_code(CPUState *cpu, TranslationBlock *tb, int *max_insns,
target_ulong pc, void *host_pc);
 
 /**
@@ -146,7 +146,7 @@ typedef struct TranslatorOps {
  * - When single-stepping is enabled (system-wide or on the current vCPU).
  * - When too many instructions have been translated.
  */
-void translator_loop(CPUState *cpu, TranslationBlock *tb, int max_insns,
+void translator_loop(CPUState *cpu, TranslationBlock *tb, int *max_insns,
  target_ulong pc, void *host_pc,
  const TranslatorOps *ops, DisasContextBase *db);
 
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 9e925c10f3..b7b361959e 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -281,7 +281,7 @@ static int setjmp_gen_code(CPUArchState *env, TranslationBlock *tb,
 tcg_func_start(tcg_ctx);
 
 tcg_ctx->cpu = env_cpu(env);
-gen_intermediate_code(env_cpu(env), tb, *max_insns, pc, host_pc);
+gen_intermediate_code(env_cpu(env), tb, max_insns, pc, host_pc);
 assert(tb->size != 0);
 tcg_ctx->cpu = NULL;
 *max_insns = tb->icount;
diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index 1cf404ced0..fac1e8c465 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -42,7 +42,7 @@ bool translator_use_goto_tb(DisasContextBase *db, target_ulong dest)
 return ((db->pc_first ^ dest) & TARGET_PAGE_MASK) == 0;
 }
 
-void translator_loop(CPUState *cpu, TranslationBlock *tb, int max_insns,
+void translator_loop(CPUState *cpu, TranslationBlock *tb, int *max_insns,
  target_ulong pc, void *host_pc,
  const TranslatorOps *ops, DisasContextBase *db)
 {
@@ -55,7 +55,7 @@ void translator_loop(CPUState *cpu, TranslationBlock *tb, int max_insns,
 db->pc_next = pc;
 db->is_jmp = DISAS_NEXT;
 db->num_insns = 0;
-db->max_insns = max_insns;
+db->max_insns = *max_insns;
 db->singlestep_enabled = cflags & CF_SINGLE_STEP;
 db->host_addr[0] = host_pc;
 db->host_addr[1] = NULL;
diff --git a/target/alpha/translate.c b/target/alpha/translate.c
index f9bcdeb717..716b083f39 100644
--- a/target/alpha/translate.c
+++ b/target/alpha/translate.c
@@ -3043,7 +3043,7 @@ static const TranslatorOps alpha_tr_ops = {
 .disas_log  = alpha_tr_disas_log,
 };
 
-void gen_intermediate_code(CPUState *cpu, TranslationBlock *tb, int max_insns,
+void gen_intermediate_code(CPUState *cpu, TranslationBlock *tb, int *max_insns,
target_ulong pc, void *host_pc)
 {
 DisasContext dc;
diff --git a/target/arm/translate.c b/target/arm/translate.c
index c23a3462bf..92955d505c 100644
--- a/target/arm/translate.c
+++ b/target/arm/translate.c
@@ -9970,7 +9970,7 @@ static const TranslatorOps thumb_translator_ops = {
 };
 
 /* generate intermediate code for basic block 'tb'.  */
-void gen_intermediate_code(CPUState *cpu, TranslationBlock *tb, int max_insns,
+void gen_intermediate_code(CPUState *cpu, TranslationBlock *tb, int *max_insns,
target_ulong pc, void *host_pc)
 {
 DisasContext dc = { };
diff --git a/target/avr/translate.c b/target/avr/translate.c
index 2bed56f135..e40d8e9681 100644
--- a/target/avr/translate.c
+++ b/target/avr/translate.c
@@ -3049,7 +3049,7 @@ static const TranslatorOps avr_tr_ops = {
 .disas_log  = avr_tr_disa

[PATCH v2 14/28] tcg: Don't re-use TEMP_TB temporaries

2023-02-22 Thread Richard Henderson
Reusing TEMP_TB interferes with detecting whether the
temp can be adjusted to TEMP_EBB.

Signed-off-by: Richard Henderson 
---
 include/tcg/tcg.h |   2 +-
 tcg/tcg.c | 101 --
 2 files changed, 53 insertions(+), 50 deletions(-)

diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index 0c2041bcf7..6cc6758cd6 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -612,7 +612,7 @@ struct TCGContext {
 #endif
 
 GHashTable *const_table[TCG_TYPE_COUNT];
-TCGTempSet free_temps[TCG_TYPE_COUNT * 2];
+TCGTempSet free_temps[TCG_TYPE_COUNT];
 TCGTemp temps[TCG_MAX_TEMPS]; /* globals first, temps after */
 
 QTAILQ_HEAD(, TCGOp) ops, free_ops;
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 06ac9d5ab8..9f1b042ecd 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -1258,63 +1258,66 @@ TCGTemp *tcg_global_mem_new_internal(TCGType type, 
TCGv_ptr base,
 TCGTemp *tcg_temp_new_internal(TCGType type, TCGTempKind kind)
 {
 TCGContext *s = tcg_ctx;
-bool temp_local = kind == TEMP_TB;
 TCGTemp *ts;
-int idx, k;
+int n;
 
-k = type + (temp_local ? TCG_TYPE_COUNT : 0);
-idx = find_first_bit(s->free_temps[k].l, TCG_MAX_TEMPS);
-if (idx < TCG_MAX_TEMPS) {
-/* There is already an available temp with the right type.  */
-clear_bit(idx, s->free_temps[k].l);
+if (kind == TEMP_EBB) {
+int idx = find_first_bit(s->free_temps[type].l, TCG_MAX_TEMPS);
 
-ts = &s->temps[idx];
-ts->temp_allocated = 1;
-tcg_debug_assert(ts->base_type == type);
-tcg_debug_assert(ts->kind == kind);
-} else {
-int i, n;
+if (idx < TCG_MAX_TEMPS) {
+/* There is already an available temp with the right type.  */
+clear_bit(idx, s->free_temps[type].l);
 
-switch (type) {
-case TCG_TYPE_I32:
-case TCG_TYPE_V64:
-case TCG_TYPE_V128:
-case TCG_TYPE_V256:
-n = 1;
-break;
-case TCG_TYPE_I64:
-n = 64 / TCG_TARGET_REG_BITS;
-break;
-case TCG_TYPE_I128:
-n = 128 / TCG_TARGET_REG_BITS;
-break;
-default:
-g_assert_not_reached();
+ts = &s->temps[idx];
+ts->temp_allocated = 1;
+tcg_debug_assert(ts->base_type == type);
+tcg_debug_assert(ts->kind == kind);
+goto done;
 }
+} else {
+tcg_debug_assert(kind == TEMP_TB);
+}
 
-ts = tcg_temp_alloc(s);
-ts->base_type = type;
-ts->temp_allocated = 1;
-ts->kind = kind;
+switch (type) {
+case TCG_TYPE_I32:
+case TCG_TYPE_V64:
+case TCG_TYPE_V128:
+case TCG_TYPE_V256:
+n = 1;
+break;
+case TCG_TYPE_I64:
+n = 64 / TCG_TARGET_REG_BITS;
+break;
+case TCG_TYPE_I128:
+n = 128 / TCG_TARGET_REG_BITS;
+break;
+default:
+g_assert_not_reached();
+}
 
-if (n == 1) {
-ts->type = type;
-} else {
-ts->type = TCG_TYPE_REG;
+ts = tcg_temp_alloc(s);
+ts->base_type = type;
+ts->temp_allocated = 1;
+ts->kind = kind;
 
-for (i = 1; i < n; ++i) {
-TCGTemp *ts2 = tcg_temp_alloc(s);
+if (n == 1) {
+ts->type = type;
+} else {
+ts->type = TCG_TYPE_REG;
 
-tcg_debug_assert(ts2 == ts + i);
-ts2->base_type = type;
-ts2->type = TCG_TYPE_REG;
-ts2->temp_allocated = 1;
-ts2->temp_subindex = i;
-ts2->kind = kind;
-}
+for (int i = 1; i < n; ++i) {
+TCGTemp *ts2 = tcg_temp_alloc(s);
+
+tcg_debug_assert(ts2 == ts + i);
+ts2->base_type = type;
+ts2->type = TCG_TYPE_REG;
+ts2->temp_allocated = 1;
+ts2->temp_subindex = i;
+ts2->kind = kind;
 }
 }
 
+ done:
 #if defined(CONFIG_DEBUG_TCG)
 s->temps_in_use++;
 #endif
@@ -1359,7 +1362,6 @@ TCGv_vec tcg_temp_new_vec_matching(TCGv_vec match)
 void tcg_temp_free_internal(TCGTemp *ts)
 {
 TCGContext *s = tcg_ctx;
-int k, idx;
 
 switch (ts->kind) {
 case TEMP_CONST:
@@ -1383,9 +1385,10 @@ void tcg_temp_free_internal(TCGTemp *ts)
 s->temps_in_use--;
 #endif
 
-idx = temp_idx(ts);
-k = ts->base_type + (ts->kind == TEMP_EBB ? 0 : TCG_TYPE_COUNT);
-set_bit(idx, s->free_temps[k].l);
+if (ts->kind == TEMP_EBB) {
+int idx = temp_idx(ts);
+set_bit(idx, s->free_temps[ts->base_type].l);
+}
 }
 
 TCGTemp *tcg_constant_internal(TCGType type, int64_t val)
-- 
2.34.1




[PATCH v2 09/28] tcg: Add tcg_temp_ebb_new_{i32,i64,ptr}

2023-02-22 Thread Richard Henderson
TCG internals will want to be able to allocate and reuse
explicitly life-limited temporaries.

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 include/tcg/tcg.h | 28 
 1 file changed, 28 insertions(+)

diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index 8d896bcbf4..0c2041bcf7 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -892,6 +892,13 @@ static inline TCGv_i32 tcg_global_mem_new_i32(TCGv_ptr 
reg, intptr_t offset,
 return temp_tcgv_i32(t);
 }
 
+/* Used only by tcg infrastructure: tcg-op.c or plugin-gen.c */
+static inline TCGv_i32 tcg_temp_ebb_new_i32(void)
+{
+TCGTemp *t = tcg_temp_new_internal(TCG_TYPE_I32, TEMP_EBB);
+return temp_tcgv_i32(t);
+}
+
 static inline TCGv_i32 tcg_temp_new_i32(void)
 {
 TCGTemp *t = tcg_temp_new_internal(TCG_TYPE_I32, TEMP_EBB);
@@ -911,6 +918,13 @@ static inline TCGv_i64 tcg_global_mem_new_i64(TCGv_ptr 
reg, intptr_t offset,
 return temp_tcgv_i64(t);
 }
 
+/* Used only by tcg infrastructure: tcg-op.c or plugin-gen.c */
+static inline TCGv_i64 tcg_temp_ebb_new_i64(void)
+{
+TCGTemp *t = tcg_temp_new_internal(TCG_TYPE_I64, TEMP_EBB);
+return temp_tcgv_i64(t);
+}
+
 static inline TCGv_i64 tcg_temp_new_i64(void)
 {
 TCGTemp *t = tcg_temp_new_internal(TCG_TYPE_I64, TEMP_EBB);
@@ -923,6 +937,13 @@ static inline TCGv_i64 tcg_temp_local_new_i64(void)
 return temp_tcgv_i64(t);
 }
 
+/* Used only by tcg infrastructure: tcg-op.c or plugin-gen.c */
+static inline TCGv_i128 tcg_temp_ebb_new_i128(void)
+{
+TCGTemp *t = tcg_temp_new_internal(TCG_TYPE_I128, TEMP_EBB);
+return temp_tcgv_i128(t);
+}
+
 static inline TCGv_i128 tcg_temp_new_i128(void)
 {
 TCGTemp *t = tcg_temp_new_internal(TCG_TYPE_I128, TEMP_EBB);
@@ -942,6 +963,13 @@ static inline TCGv_ptr tcg_global_mem_new_ptr(TCGv_ptr 
reg, intptr_t offset,
 return temp_tcgv_ptr(t);
 }
 
+/* Used only by tcg infrastructure: tcg-op.c or plugin-gen.c */
+static inline TCGv_ptr tcg_temp_ebb_new_ptr(void)
+{
+TCGTemp *t = tcg_temp_new_internal(TCG_TYPE_PTR, TEMP_EBB);
+return temp_tcgv_ptr(t);
+}
+
 static inline TCGv_ptr tcg_temp_new_ptr(void)
 {
 TCGTemp *t = tcg_temp_new_internal(TCG_TYPE_PTR, TEMP_EBB);
-- 
2.34.1




[PATCH v2 01/28] tcg: Adjust TCGContext.temps_in_use check

2023-02-22 Thread Richard Henderson
Change the temps_in_use check to use assert not fprintf.
Move the assert for double-free before the check for count,
since that is the more immediate problem.

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 tcg/tcg.c | 12 +---
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index a4a3da6804..06209e6160 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -1375,16 +1375,14 @@ void tcg_temp_free_internal(TCGTemp *ts)
 g_assert_not_reached();
 }
 
-#if defined(CONFIG_DEBUG_TCG)
-s->temps_in_use--;
-if (s->temps_in_use < 0) {
-fprintf(stderr, "More temporaries freed than allocated!\n");
-}
-#endif
-
 tcg_debug_assert(ts->temp_allocated != 0);
 ts->temp_allocated = 0;
 
+#if defined(CONFIG_DEBUG_TCG)
+assert(s->temps_in_use > 0);
+s->temps_in_use--;
+#endif
+
 idx = temp_idx(ts);
 k = ts->base_type + (ts->kind == TEMP_NORMAL ? 0 : TCG_TYPE_COUNT);
 set_bit(idx, s->free_temps[k].l);
-- 
2.34.1




[PATCH v2 00/28] tcg: Simplify temporary usage

2023-02-22 Thread Richard Henderson
The biggest pitfall for new users of TCG is the fact that "normal"
temporaries die at branches, and we must therefore use a different
"local" temporary in that case.

The following patch set changes that, so that the "normal" temporary
is the one that lives across branches, and there is a special temporary
that dies at the end of the extended basic block, and this special
case is reserved for tcg internals.

Patches lacking review:
  03-accel-tcg-Use-more-accurate-max_insns-for-tb_over.patch
  04-tcg-Remove-branch-to-next-regardless-of-reference.patch
  06-tcg-Add-liveness_pass_0.patch
  14-tcg-Don-t-re-use-TEMP_TB-temporaries.patch
  16-target-arm-Drop-copies-in-gen_sve_-ldr-str.patch
  27-tcg-Update-docs-devel-tcg-ops.rst-for-temporary-c.patch
  28-tcg-Use-noinline-for-major-tcg_gen_code_subroutin.patch


r~


Richard Henderson (28):
  tcg: Adjust TCGContext.temps_in_use check
  accel/tcg: Pass max_insn to gen_intermediate_code by pointer
  accel/tcg: Use more accurate max_insns for tb_overflow
  tcg: Remove branch-to-next regardless of reference count
  tcg: Rename TEMP_LOCAL to TEMP_TB
  tcg: Add liveness_pass_0
  tcg: Remove TEMP_NORMAL
  tcg: Pass TCGTempKind to tcg_temp_new_internal
  tcg: Add tcg_temp_ebb_new_{i32,i64,ptr}
  tcg: Add tcg_gen_movi_ptr
  tcg: Use tcg_temp_ebb_new_* in tcg/
  accel/tcg/plugin: Use tcg_temp_ebb_*
  accel/tcg/plugin: Tidy plugin_gen_disable_mem_helpers
  tcg: Don't re-use TEMP_TB temporaries
  tcg: Change default temp lifetime to TEMP_TB
  target/arm: Drop copies in gen_sve_{ldr,str}
  target/arm: Don't use tcg_temp_local_new_*
  target/cris: Don't use tcg_temp_local_new
  target/hexagon: Don't use tcg_temp_local_new_*
  target/hppa: Don't use tcg_temp_local_new
  target/i386: Don't use tcg_temp_local_new
  target/mips: Don't use tcg_temp_local_new
  target/ppc: Don't use tcg_temp_local_new
  target/xtensa: Don't use tcg_temp_local_new_*
  exec/gen-icount: Don't use tcg_temp_local_new_i32
  tcg: Remove tcg_temp_local_new_*, tcg_const_local_*
  tcg: Update docs/devel/tcg-ops.rst for temporary changes
  tcg: Use noinline for major tcg_gen_code_subroutines

 docs/devel/tcg-ops.rst  | 103 +++
 target/hexagon/idef-parser/README.rst   |   4 +-
 include/exec/gen-icount.h   |   8 +-
 include/exec/translator.h   |   4 +-
 include/tcg/tcg-op.h|   7 +-
 include/tcg/tcg.h   |  64 ++---
 target/arm/translate-a64.h  |   1 -
 target/hexagon/gen_tcg.h|   4 +-
 accel/tcg/plugin-gen.c  |  32 +--
 accel/tcg/translate-all.c   |   2 +-
 accel/tcg/translator.c  |   6 +-
 target/alpha/translate.c|   2 +-
 target/arm/translate-a64.c  |   6 -
 target/arm/translate-sve.c  |  38 +--
 target/arm/translate.c  |   8 +-
 target/avr/translate.c  |   2 +-
 target/cris/translate.c |   8 +-
 target/hexagon/genptr.c |  16 +-
 target/hexagon/idef-parser/parser-helpers.c |   4 +-
 target/hexagon/translate.c  |   4 +-
 target/hppa/translate.c |   5 +-
 target/i386/tcg/translate.c |  29 +-
 target/loongarch/translate.c|   2 +-
 target/m68k/translate.c |   2 +-
 target/microblaze/translate.c   |   2 +-
 target/mips/tcg/translate.c |  59 ++---
 target/nios2/translate.c|   2 +-
 target/openrisc/translate.c |   2 +-
 target/ppc/translate.c  |   8 +-
 target/riscv/translate.c|   2 +-
 target/rx/translate.c   |   2 +-
 target/s390x/tcg/translate.c|   2 +-
 target/sh4/translate.c  |   2 +-
 target/sparc/translate.c|   2 +-
 target/tricore/translate.c  |   2 +-
 target/xtensa/translate.c   |  18 +-
 tcg/optimize.c  |   2 +-
 tcg/tcg-op-gvec.c   | 270 +--
 tcg/tcg-op.c| 258 +-
 tcg/tcg.c   | 280 
 target/cris/translate_v10.c.inc |  10 +-
 target/mips/tcg/nanomips_translate.c.inc|   4 +-
 target/ppc/translate/spe-impl.c.inc |   8 +-
 target/ppc/translate/vmx-impl.c.inc |   4 +-
 target/hexagon/README   |   8 +-
 target/hexagon/gen_tcg_funcs.py |  18 +-
 46 files changed, 646 insertions(+), 680 deletions(-)

-- 
2.34.1




[PATCH v2 13/28] accel/tcg/plugin: Tidy plugin_gen_disable_mem_helpers

2023-02-22 Thread Richard Henderson
Here we are creating a temp whose value needs to be replaced,
but always storing NULL into CPUState.plugin_mem_cbs.
Use tcg_constant_ptr(0) explicitly.

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 accel/tcg/plugin-gen.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/accel/tcg/plugin-gen.c b/accel/tcg/plugin-gen.c
index 9b793ac62c..c42a436c0c 100644
--- a/accel/tcg/plugin-gen.c
+++ b/accel/tcg/plugin-gen.c
@@ -630,8 +630,6 @@ static void inject_mem_disable_helper(struct 
qemu_plugin_insn *plugin_insn,
 /* called before finishing a TB with exit_tb, goto_tb or goto_ptr */
 void plugin_gen_disable_mem_helpers(void)
 {
-TCGv_ptr ptr;
-
 /*
  * We could emit the clearing unconditionally and be done. However, this 
can
  * be wasteful if for instance plugins don't track memory accesses, or if
@@ -644,10 +642,8 @@ void plugin_gen_disable_mem_helpers(void)
 if (!tcg_ctx->plugin_tb->mem_helper) {
 return;
 }
-ptr = tcg_const_ptr(NULL);
-tcg_gen_st_ptr(ptr, cpu_env, offsetof(CPUState, plugin_mem_cbs) -
- offsetof(ArchCPU, env));
-tcg_temp_free_ptr(ptr);
+tcg_gen_st_ptr(tcg_constant_ptr(NULL), cpu_env,
+   offsetof(CPUState, plugin_mem_cbs) - offsetof(ArchCPU, 
env));
 }
 
 static void plugin_gen_tb_udata(const struct qemu_plugin_tb *ptb,
-- 
2.34.1




[PATCH v2 08/28] tcg: Pass TCGTempKind to tcg_temp_new_internal

2023-02-22 Thread Richard Henderson
While the argument can only be TEMP_EBB or TEMP_TB,
it's more obvious this way.

Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 include/tcg/tcg.h | 18 +-
 tcg/tcg.c |  8 
 2 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index 02d5cfc049..8d896bcbf4 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -855,7 +855,7 @@ void tcg_set_frame(TCGContext *s, TCGReg reg, intptr_t 
start, intptr_t size);
 
 TCGTemp *tcg_global_mem_new_internal(TCGType, TCGv_ptr,
  intptr_t, const char *);
-TCGTemp *tcg_temp_new_internal(TCGType, bool);
+TCGTemp *tcg_temp_new_internal(TCGType, TCGTempKind);
 void tcg_temp_free_internal(TCGTemp *);
 TCGv_vec tcg_temp_new_vec(TCGType type);
 TCGv_vec tcg_temp_new_vec_matching(TCGv_vec match);
@@ -894,13 +894,13 @@ static inline TCGv_i32 tcg_global_mem_new_i32(TCGv_ptr 
reg, intptr_t offset,
 
 static inline TCGv_i32 tcg_temp_new_i32(void)
 {
-TCGTemp *t = tcg_temp_new_internal(TCG_TYPE_I32, false);
+TCGTemp *t = tcg_temp_new_internal(TCG_TYPE_I32, TEMP_EBB);
 return temp_tcgv_i32(t);
 }
 
 static inline TCGv_i32 tcg_temp_local_new_i32(void)
 {
-TCGTemp *t = tcg_temp_new_internal(TCG_TYPE_I32, true);
+TCGTemp *t = tcg_temp_new_internal(TCG_TYPE_I32, TEMP_TB);
 return temp_tcgv_i32(t);
 }
 
@@ -913,25 +913,25 @@ static inline TCGv_i64 tcg_global_mem_new_i64(TCGv_ptr 
reg, intptr_t offset,
 
 static inline TCGv_i64 tcg_temp_new_i64(void)
 {
-TCGTemp *t = tcg_temp_new_internal(TCG_TYPE_I64, false);
+TCGTemp *t = tcg_temp_new_internal(TCG_TYPE_I64, TEMP_EBB);
 return temp_tcgv_i64(t);
 }
 
 static inline TCGv_i64 tcg_temp_local_new_i64(void)
 {
-TCGTemp *t = tcg_temp_new_internal(TCG_TYPE_I64, true);
+TCGTemp *t = tcg_temp_new_internal(TCG_TYPE_I64, TEMP_TB);
 return temp_tcgv_i64(t);
 }
 
 static inline TCGv_i128 tcg_temp_new_i128(void)
 {
-TCGTemp *t = tcg_temp_new_internal(TCG_TYPE_I128, false);
+TCGTemp *t = tcg_temp_new_internal(TCG_TYPE_I128, TEMP_EBB);
 return temp_tcgv_i128(t);
 }
 
 static inline TCGv_i128 tcg_temp_local_new_i128(void)
 {
-TCGTemp *t = tcg_temp_new_internal(TCG_TYPE_I128, true);
+TCGTemp *t = tcg_temp_new_internal(TCG_TYPE_I128, TEMP_TB);
 return temp_tcgv_i128(t);
 }
 
@@ -944,13 +944,13 @@ static inline TCGv_ptr tcg_global_mem_new_ptr(TCGv_ptr 
reg, intptr_t offset,
 
 static inline TCGv_ptr tcg_temp_new_ptr(void)
 {
-TCGTemp *t = tcg_temp_new_internal(TCG_TYPE_PTR, false);
+TCGTemp *t = tcg_temp_new_internal(TCG_TYPE_PTR, TEMP_EBB);
 return temp_tcgv_ptr(t);
 }
 
 static inline TCGv_ptr tcg_temp_local_new_ptr(void)
 {
-TCGTemp *t = tcg_temp_new_internal(TCG_TYPE_PTR, true);
+TCGTemp *t = tcg_temp_new_internal(TCG_TYPE_PTR, TEMP_TB);
 return temp_tcgv_ptr(t);
 }
 
diff --git a/tcg/tcg.c b/tcg/tcg.c
index f52e9baf83..bbae9d493b 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -1255,10 +1255,10 @@ TCGTemp *tcg_global_mem_new_internal(TCGType type, 
TCGv_ptr base,
 return ts;
 }
 
-TCGTemp *tcg_temp_new_internal(TCGType type, bool temp_local)
+TCGTemp *tcg_temp_new_internal(TCGType type, TCGTempKind kind)
 {
 TCGContext *s = tcg_ctx;
-TCGTempKind kind = temp_local ? TEMP_TB : TEMP_EBB;
+bool temp_local = kind == TEMP_TB;
 TCGTemp *ts;
 int idx, k;
 
@@ -1341,7 +1341,7 @@ TCGv_vec tcg_temp_new_vec(TCGType type)
 }
 #endif
 
-t = tcg_temp_new_internal(type, 0);
+t = tcg_temp_new_internal(type, TEMP_EBB);
 return temp_tcgv_vec(t);
 }
 
@@ -1352,7 +1352,7 @@ TCGv_vec tcg_temp_new_vec_matching(TCGv_vec match)
 
 tcg_debug_assert(t->temp_allocated != 0);
 
-t = tcg_temp_new_internal(t->base_type, 0);
+t = tcg_temp_new_internal(t->base_type, TEMP_EBB);
 return temp_tcgv_vec(t);
 }
 
-- 
2.34.1




Re: [PATCH 0/5] Pegasos2 fixes and audio output support

2023-02-22 Thread BALATON Zoltan

On Wed, 22 Feb 2023, Bernhard Beschow wrote:

Am 22. Februar 2023 21:12:01 UTC schrieb BALATON Zoltan :

On Wed, 22 Feb 2023, Bernhard Beschow wrote:

Am 22. Februar 2023 19:25:16 UTC schrieb BALATON Zoltan :

On Wed, 22 Feb 2023, Bernhard Beschow wrote:

On Wed, Feb 22, 2023 at 4:38 PM Bernhard Beschow  wrote:

I've had a closer look at your series and I think it can be simplified:
Patch 2 can be implemented quite straight-forward like I proposed in a
private mail: https://github.com/shentok/qemu/commit/via-priq-routing.
Then, in order to make patch 3 "hw/ppc/pegasos2: Fix PCI interrupt routing"
working, one can expose the PCI interrupts with a single line like you do
in patch 2. With this, patch 1 "hw/isa/vt82c686: Implement interrupt
routing in via_isa_set_irq" isn't needed any longer and can be omitted.

In via-ac97, rather than using via_isa_set_irq(), pci_set_irq() can be
used instead. pci_set_irq() internally takes care of all the ISA interrupt
level tracking patch 1 attempted to address.



Here is a proof of concept branch to demonstrate that the simplification
actually works: https://github.com/shentok/qemu/commits/pegasos2 (Tested
with MorphOS with and without pegasos2.rom).


Does this only work because both the via-ac97 and the PCI interrupts are mapped 
to the same ISA IRQ and you've only tested sound? The guest could configure 
each device to use a different IRQ, also mapping them so they share one ISA 
interrupt. What happens if multiple devices are mapped to IRQ 9 (which is the 
case on pegasos2 where PCI cards, ac97 and USB all share this IRQ) and more 
than one such device wants to raise an interrupt at the same time? If you ack 
the ac97 interrupt but a PCI network card or the USB part still wants to get 
the CPUs attention the ISA IRQ should remain raised until all devices are 
serviced.


pci_bus_get_irq_level(), used in via_isa_set_pci_irq(), should handle
exactly that case very well.


I don't see a way to track the status of all devices in a single qemu_irq which 
can only be up or down so we need something to store the state of each source.


pci_set_irq() causes pci_bus_change_irq_level() to be called.
pci_bus_change_irq_level() tracks the sum of all irq levels of all
devices attached to a particular pin in irq_count. Have a look at
pci_bus_change_irq_level() and you will understand better.


I'm aware of that; we're using that in sam460ex, which connects all PCI 
interrupt lines to a single IRQ, and Peter explored and explained it in 
a comment there when that was discovered. First we had a patch with 
or-irq, but due to this behavior that's not needed for PCI interrupts. 
But the VT8231 could change what ISA IRQ you route the sub functions 
to.


Whether you can do that depends on the sub function. And if so, it also 
depends on whether the function is still in PCI mode (see below).



It happens that on pegasos2 by default all of those are routed to IRQ9 except 
IDE


All *PCI* interrupts are routed to IRQ9 while IDE legacy interrupts are 
routed to the compatible ISA IRQs. Note that the IDE function must only 
trigger the ISA IRQs if it is in legacy mode while it must only trigger 
the PCI IRQ in non-legacy mode. See https://www.bswd.com/pciide.pdf for 
more details on this particular topic.


The docs say so, but based on what guests that work on real hardware do, it 
does not work that way. Look up previous discussion on this on the list 
from around the time Mark changed via-ide about 4-5 years ago. That series 
was a result of his review of my proposed changes and resulted in an 
alternative approach. On pegasos2 (and probably also on fuloong2e, based 
on the same later findings; see the patches to that, I can try to find them 
later if you can't) via-ide *always* uses IRQ 14/15, and native 
mode only switches register addresses from legacy io ports to PCI io space 
so you can set them with BAR regs, but the IRQs don't change despite what 
the docs say. There are some hacks in the Linux kernel and other guests to 
account for this, but the comments giving the reason are wrong in Linux: they 
say IDE is always in legacy mode, but in fact it has a "half-native" mode, 
as I called it, where io addresses are set with BARs but the IRQs 
are still the legacy ISA ones. You can find some references in previous 
discussion; searching for "via-ide half-native mode" should find it.


but what if a guest changes ac97 to use a different interrupt? Then 
it's not a PCI interrupt any more, so you can't use pci_set_irq in 
via-ac97.


How would it do that? AFAICS there is no dedicated register to configure 
which IRQ to use. This means that it can only trigger an interrupt via 
its PCI intx pin which is subject to the PCI -> ISA IRQ router.


The VIA functions can use their PCI_INTERRUPT_LINE (0x3c) registers to set 
their ISA IRQ according to the docs (and unlike IDE in other functions 
like USB and sound this probably also works) and the PIRQA-D pins can be 
mapped to ISA IRQs by 

Re: [PATCH v2 11/20] vfio/common: Add device dirty page tracking start/stop

2023-02-22 Thread Alex Williamson
On Wed, 22 Feb 2023 19:49:06 +0200
Avihai Horon  wrote:

> From: Joao Martins 
> 
> Add device dirty page tracking start/stop functionality. This uses the
> device DMA logging uAPI to start and stop dirty page tracking by device.
> 
> Device dirty page tracking is used only if all devices within a
> container support device dirty page tracking.
> 
> Signed-off-by: Joao Martins 
> Signed-off-by: Avihai Horon 
> ---
>  include/hw/vfio/vfio-common.h |   2 +
>  hw/vfio/common.c  | 211 +-
>  2 files changed, 211 insertions(+), 2 deletions(-)
> 
> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
> index 6f36876ce0..1f21e1fa43 100644
> --- a/include/hw/vfio/vfio-common.h
> +++ b/include/hw/vfio/vfio-common.h
> @@ -149,6 +149,8 @@ typedef struct VFIODevice {
>  VFIOMigration *migration;
>  Error *migration_blocker;
>  OnOffAuto pre_copy_dirty_page_tracking;
> +bool dirty_pages_supported;
> +bool dirty_tracking;
>  } VFIODevice;
>  
>  struct VFIODeviceOps {
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index 6041da6c7e..740153e7d7 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -473,6 +473,22 @@ static bool 
> vfio_devices_all_dirty_tracking(VFIOContainer *container)
>  return true;
>  }
>  
> +static bool vfio_devices_all_device_dirty_tracking(VFIOContainer *container)
> +{
> +VFIOGroup *group;
> +VFIODevice *vbasedev;
> +
> +QLIST_FOREACH(group, &container->group_list, container_next) {
> +QLIST_FOREACH(vbasedev, &group->device_list, next) {
> +if (!vbasedev->dirty_pages_supported) {
> +return false;
> +}
> +}
> +}
> +
> +return true;
> +}
> +
>  /*
>   * Check if all VFIO devices are running and migration is active, which is
>   * essentially equivalent to the migration being in pre-copy phase.
> @@ -1404,13 +1420,192 @@ static int 
> vfio_set_dirty_page_tracking(VFIOContainer *container, bool start)
>  return ret;
>  }
>  
> +static int vfio_devices_dma_logging_set(VFIOContainer *container,
> +struct vfio_device_feature *feature)
> +{
> +bool status = (feature->flags & VFIO_DEVICE_FEATURE_MASK) ==
> +  VFIO_DEVICE_FEATURE_DMA_LOGGING_START;
> +VFIODevice *vbasedev;
> +VFIOGroup *group;
> +int ret = 0;
> +
> +QLIST_FOREACH(group, &container->group_list, container_next) {
> +QLIST_FOREACH(vbasedev, &group->device_list, next) {
> +if (vbasedev->dirty_tracking == status) {
> +continue;
> +}
> +
> +ret = ioctl(vbasedev->fd, VFIO_DEVICE_FEATURE, feature);
> +if (ret) {
> +ret = -errno;
> +error_report("%s: Failed to set DMA logging %s, err %d (%s)",
> + vbasedev->name, status ? "start" : "stop", ret,
> + strerror(errno));
> +goto out;
> +}
> +vbasedev->dirty_tracking = status;
> +}
> +}
> +
> +out:
> +return ret;
> +}
> +
> +static int vfio_devices_dma_logging_stop(VFIOContainer *container)
> +{
> +uint64_t buf[DIV_ROUND_UP(sizeof(struct vfio_device_feature),
> +  sizeof(uint64_t))] = {};
> +struct vfio_device_feature *feature = (struct vfio_device_feature *)buf;
> +
> +feature->argsz = sizeof(buf);
> +feature->flags = VFIO_DEVICE_FEATURE_SET;
> +feature->flags |= VFIO_DEVICE_FEATURE_DMA_LOGGING_STOP;
> +
> +return vfio_devices_dma_logging_set(container, feature);
> +}
> +
> +static gboolean vfio_device_dma_logging_range_add(DMAMap *map, gpointer data)
> +{
> +struct vfio_device_feature_dma_logging_range **out = data;
> +struct vfio_device_feature_dma_logging_range *range = *out;
> +
> +range->iova = map->iova;
> +/* IOVATree is inclusive, DMA logging uAPI isn't, so add 1 to length */
> +range->length = map->size + 1;
> +
> +*out = ++range;
> +
> +return false;
> +}
> +
> +static gboolean vfio_iova_tree_get_first(DMAMap *map, gpointer data)
> +{
> +DMAMap *first = data;
> +
> +first->iova = map->iova;
> +first->size = map->size;
> +
> +return true;
> +}
> +
> +static gboolean vfio_iova_tree_get_last(DMAMap *map, gpointer data)
> +{
> +DMAMap *last = data;
> +
> +last->iova = map->iova;
> +last->size = map->size;
> +
> +return false;
> +}
> +
> +static struct vfio_device_feature *
> +vfio_device_feature_dma_logging_start_create(VFIOContainer *container)
> +{
> +struct vfio_device_feature *feature;
> +size_t feature_size;
> +struct vfio_device_feature_dma_logging_control *control;
> +struct vfio_device_feature_dma_logging_range *ranges;
> +unsigned int max_ranges;
> +unsigned int cur_ranges;
> +
> +feature_size = sizeof(struct vfio_device_feature) +
> +   sizeof(struct vfio_device_featur

Re: [PATCH] tcg: Allow displaying TCG_TYPE_I128 arguments

2023-02-22 Thread Richard Henderson

On 2/22/23 11:28, Philippe Mathieu-Daudé wrote:

Signed-off-by: Philippe Mathieu-Daudé 
---
  tcg/tcg.c | 1 +
  1 file changed, 1 insertion(+)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index a4a3da6804..3df2c6a6af 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -1955,6 +1955,7 @@ static char *tcg_get_arg_str_ptr(TCGContext *s, char 
*buf, int buf_size,
  break;
  #if TCG_TARGET_REG_BITS > 32
  case TCG_TYPE_I64:
+case TCG_TYPE_I128:
  snprintf(buf, buf_size, "$0x%" PRIx64, ts->val);


This would be for a 128-bit constant, which we don't have.
Is this a guess, or hitting the assert, or what?


r~



Re: [PATCH 0/5] Pegasos2 fixes and audio output support

2023-02-22 Thread BALATON Zoltan

On Wed, 22 Feb 2023, Bernhard Beschow wrote:

Am 22. Februar 2023 19:25:16 UTC schrieb BALATON Zoltan :

On Wed, 22 Feb 2023, Bernhard Beschow wrote:

On Wed, Feb 22, 2023 at 4:38 PM Bernhard Beschow  wrote:

On Tue, Feb 21, 2023 at 7:44 PM BALATON Zoltan  wrote:

This series fixes PCI interrupts on the ppc/pegasos2 machine and adds
partial implementation of the via-ac97 sound part enough to get audio
output. I'd like this to be merged for QEMU 8.0.

Regards,
BALATON Zoltan

BALATON Zoltan (5):
  hw/isa/vt82c686: Implement interrupt routing in via_isa_set_irq
  hw/isa/vt82c686: Implement PIRQ pins
  hw/ppc/pegasos2: Fix PCI interrupt routing
  hw/audio/ac97: Split off some definitions to a header
  hw/audio/via-ac97: Basic implementation of audio playback

 hw/audio/ac97.c|  43 +---
 hw/audio/ac97.h|  65 ++
 hw/audio/trace-events  |   6 +
 hw/audio/via-ac97.c| 436 -
 hw/ide/via.c   |   2 +-
 hw/isa/vt82c686.c  |  61 +-
 hw/pci-host/mv64361.c  |   4 -
 hw/ppc/pegasos2.c  |  26 ++-
 hw/usb/vt82c686-uhci-pci.c |   5 +-
 include/hw/isa/vt82c686.h  |  39 +++-
 10 files changed, 626 insertions(+), 61 deletions(-)
 create mode 100644 hw/audio/ac97.h

--
2.30.7



Wow, the MorphOS people paid attention to sound design. Thanks for
presenting it to us, Zoltan!

I've had a closer look at your series and I think it can be simplified:
Patch 2 can be implemented quite straight-forward like I proposed in a
private mail: https://github.com/shentok/qemu/commit/via-priq-routing.
Then, in order to make patch 3 "hw/ppc/pegasos2: Fix PCI interrupt routing"
working, one can expose the PCI interrupts with a single line like you do
in patch 2. With this, patch 1 "hw/isa/vt82c686: Implement interrupt
routing in via_isa_set_irq" isn't needed any longer and can be omitted.

In via-ac97, rather than using via_isa_set_irq(), pci_set_irq() can be
used instead. pci_set_irq() internally takes care of all the ISA interrupt
level tracking patch 1 attempted to address.



Here is a proof of concept branch to demonstrate that the simplification
actually works: https://github.com/shentok/qemu/commits/pegasos2 (Tested
with MorphOS with and without pegasos2.rom).


Does this only work because both the via-ac97 and the PCI interrupts are mapped 
to the same ISA IRQ and you've only tested sound? The guest could configure 
each device to use a different IRQ, also mapping them so they share one ISA 
interrupt. What happens if multiple devices are mapped to IRQ 9 (which is the 
case on pegasos2 where PCI cards, ac97 and USB all share this IRQ) and more 
than one such device wants to raise an interrupt at the same time? If you ack 
the ac97 interrupt but a PCI network card or the USB part still wants to get 
the CPUs attention the ISA IRQ should remain raised until all devices are 
serviced.


pci_bus_get_irq_level(), used in via_isa_set_pci_irq(), should handle
exactly that case very well.


I don't see a way to track the status of all devices in a single qemu_irq which 
can only be up or down so we need something to store the state of each source.


pci_set_irq() causes pci_bus_change_irq_level() to be called.
pci_bus_change_irq_level() tracks the sum of all irq levels of all
devices attached to a particular pin in irq_count. Have a look at
pci_bus_change_irq_level() and you will understand better.


My patch adds a state register to each ISA IRQ line for all possible sources, 
which could probably be stored once, but then on each change of ISA IRQ status 
all the mapped devices would have to be checked and combined, so it's easier to 
store them per IRQ. Does your approach still work if you play sound and copy 
something from network to a USB device at the same time? (I'm not sure mine 
has no remaining bugs, but I don't think this can be simplified that way; 
if you can prove it would work, I don't mind taking an alternative version, 
but I'm not convinced yet.)


Well, I can't prove that my approach works but unfortunately I can
prove that both our approaches cause a freeze :/ Try:
1. Start `qemu-system-ppc -M pegasos2 -bios pegasos2.rom -rtc
base=localtime -device ati-vga,guest_hwcursor=true,romfile="" -cdrom
morphos-3.17.iso -device usb-mouse -device usb-kbd`
2. Move the mouse while sound is playing
-> Observe the VM to freeze


Not quite sure why but it seems to happen when both the ac97 and USB raise 
the interrupt and the guest driver seems to get confused. Adding some 
debug logging:


diff --git a/hw/isa/vt82c686.c b/hw/isa/vt82c686.c
index b16620daf8..f840e5a8d0 100644
--- a/hw/isa/vt82c686.c
+++ b/hw/isa/vt82c686.c
@@ -636,12 +636,13 @@ void via_isa_set_irq(PCIDevice *d, 
ViaISAIRQSourceBit n, int level)

 if (!isa_irq) {
 return;
 }
-
+if (n > 1) fprintf(stderr, "%s: %d %d %d %x -> ", __func__, n, level, isa_irq, 
s->isa_irq_state[isa_irq]);
 if (level) {
 s->isa_irq_state[isa_irq] |= BIT(n);

Re: [PATCH 0/5] Pegasos2 fixes and audio output support

2023-02-22 Thread Bernhard Beschow



Am 22. Februar 2023 21:12:01 UTC schrieb BALATON Zoltan :
>On Wed, 22 Feb 2023, Bernhard Beschow wrote:
>> Am 22. Februar 2023 19:25:16 UTC schrieb BALATON Zoltan :
>>> On Wed, 22 Feb 2023, Bernhard Beschow wrote:
 On Wed, Feb 22, 2023 at 4:38 PM Bernhard Beschow  wrote:
> On Tue, Feb 21, 2023 at 7:44 PM BALATON Zoltan  wrote:
>> This series fixes PCI interrupts on the ppc/pegasos2 machine and adds
>> partial implementation of the via-ac97 sound part enough to get audio
>> output. I'd like this to be merged for QEMU 8.0.
>> 
>> Regards,
>> BALATON Zoltan
>> 
>> BALATON Zoltan (5):
>>   hw/isa/vt82c686: Implement interrupt routing in via_isa_set_irq
>>   hw/isa/vt82c686: Implement PIRQ pins
>>   hw/ppc/pegasos2: Fix PCI interrupt routing
>>   hw/audio/ac97: Split off some definitions to a header
>>   hw/audio/via-ac97: Basic implementation of audio playback
>> 
>>  hw/audio/ac97.c|  43 +---
>>  hw/audio/ac97.h|  65 ++
>>  hw/audio/trace-events  |   6 +
>>  hw/audio/via-ac97.c| 436 -
>>  hw/ide/via.c   |   2 +-
>>  hw/isa/vt82c686.c  |  61 +-
>>  hw/pci-host/mv64361.c  |   4 -
>>  hw/ppc/pegasos2.c  |  26 ++-
>>  hw/usb/vt82c686-uhci-pci.c |   5 +-
>>  include/hw/isa/vt82c686.h  |  39 +++-
>>  10 files changed, 626 insertions(+), 61 deletions(-)
>>  create mode 100644 hw/audio/ac97.h
>> 
>> --
>> 2.30.7
>> 
>> 
> Wow, the MorphOS people paid attention to sound design. Thanks for
> presenting it to us, Zoltan!
> 
> I've had a closer look at your series and I think it can be simplified:
> Patch 2 can be implemented quite straight-forward like I proposed in a
> private mail: https://github.com/shentok/qemu/commit/via-priq-routing.
> Then, in order to make patch 3 "hw/ppc/pegasos2: Fix PCI interrupt 
> routing"
> working, one can expose the PCI interrupts with a single line like you do
> in patch 2. With this, patch 1 "hw/isa/vt82c686: Implement interrupt
> routing in via_isa_set_irq" isn't needed any longer and can be omitted.
> 
> In via-ac97, rather than using via_isa_set_irq(), pci_set_irq() can be
> used instead. pci_set_irq() internally takes care of all the ISA interrupt
> level tracking patch 1 attempted to address.
> 
 
 Here is a proof of concept branch to demonstrate that the simplification
 actually works: https://github.com/shentok/qemu/commits/pegasos2 (Tested
 with MorphOS with and without pegasos2.rom).
>>> 
>>> Does this only work because both the via-ac97 and the PCI interrupts are 
>>> mapped to the same ISA IRQ and you've only tested sound? The guest could 
>>> configure each device to use a different IRQ, also mapping them so they 
>>> share one ISA interrupt. What happens if multiple devices are mapped to IRQ 
>>> 9 (which is the case on pegasos2 where PCI cards, ac97 and USB all share 
>>> this IRQ) and more than one such device wants to raise an interrupt at the 
>>> same time? If you ack the ac97 interrupt but a PCI network card or the USB 
>>> part still wants to get the CPUs attention the ISA IRQ should remain raised 
>>> until all devices are serviced.
>> 
>> pci_bus_get_irq_level(), used in via_isa_set_pci_irq(), should handle
>> exactly that case very well.
>> 
>>> I don't see a way to track the status of all devices in a single qemu_irq 
>>> which can only be up or down so we need something to store the state of 
>>> each source.
>> 
>> pci_set_irq() causes pci_bus_change_irq_level() to be called.
>> pci_bus_change_irq_level() tracks the sum of all irq levels of all
>> devices attached to a particular pin in irq_count. Have a look at
>> pci_bus_change_irq_level() and you will understand better.
>
>I'm aware of that; we're using that in sam460ex, which connects all PCI 
>interrupt lines to a single IRQ, and Peter explored and explained it in a 
>comment there when that was discovered. First we had a patch with or-irq, but 
>due to this behavior that's not needed for PCI interrupts. But the VT8231 
>could change what ISA IRQ you route the sub functions to.

Whether you can do that depends on the sub function. And if so, it also depends 
on whether the function is still in PCI mode (see below).

>It happens that on pegasos2 by default all of those are routed to IRQ9 except 
>IDE

All *PCI* interrupts are routed to IRQ9 while IDE legacy interrupts are routed 
to the compatible ISA IRQs. Note that the IDE function must only trigger the 
ISA IRQs if it is in legacy mode while it must only trigger the PCI IRQ in 
non-legacy mode. See https://www.bswd.com/pciide.pdf for more details on this 
particular topic.

>but what if a guest changes ac97 to use a different interrupt? Then it's not a 
>PCI interrupt any more, so you can't use pci_set_irq in via-ac97.

How would it do 

Re: [PATCH v2 7/7] target/arm: Add CPU properties for most v8.3 PAC features

2023-02-22 Thread Richard Henderson

On 2/22/23 09:35, Aaron Lindsay wrote:

Signed-off-by: Aaron Lindsay 
---
  target/arm/cpu.h   |  5 +++
  target/arm/cpu64.c | 81 ++
  2 files changed, 72 insertions(+), 14 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 9c3cbc9a29..40b4631f11 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -1039,6 +1039,11 @@ struct ArchCPU {
   */
  bool prop_pauth;
  bool prop_pauth_impdef;
+bool prop_pauth_qarma3;
+bool prop_pauth_epac;
+bool prop_pauth2; // also known as EnhancedPAC2/EPAC2


No c++ comments.


+if (cpu->prop_pauth_epac &&
+(cpu->prop_pauth2 ||
+ cpu->prop_pauth_fpac ||
+ cpu->prop_pauth_fpac_combine)) {


Indentation.


+if (address_auth == 0)
+address_auth = 0b0001;


Missing braces.


+static Property arm_cpu_pauth2_property =
+DEFINE_PROP_BOOL("pauth2", ARMCPU, prop_pauth2, false);
+static Property arm_cpu_pauth_fpac_property =
+DEFINE_PROP_BOOL("pauth-fpac", ARMCPU, prop_pauth_fpac, false);
+static Property arm_cpu_pauth_fpac_combine_property =
+DEFINE_PROP_BOOL("pauth-fpac-combine", ARMCPU, prop_pauth_fpac_combine, 
false);


For -cpu max, I would expect these to default on.
Or perhaps not expose these or epac as properties at all.


@@ -646,6 +694,11 @@ static void aarch64_add_pauth_properties(Object *obj)
  cpu->prop_pauth = cpu_isar_feature(aa64_pauth, cpu);
  } else {
  qdev_property_add_static(DEVICE(obj), &arm_cpu_pauth_impdef_property);
+qdev_property_add_static(DEVICE(obj), &arm_cpu_pauth_qarma3_property);
+qdev_property_add_static(DEVICE(obj), &arm_cpu_pauth_epac_property);
+qdev_property_add_static(DEVICE(obj), &arm_cpu_pauth2_property);
+qdev_property_add_static(DEVICE(obj), &arm_cpu_pauth_fpac_property);
+qdev_property_add_static(DEVICE(obj), 
&arm_cpu_pauth_fpac_combine_property);


I think the *only* property that makes sense for KVM is pauth=on/off, which controls whether 
KVM exposes the key registers at all (and if off, APA/GPA/etc all get zeroed). There is 
certainly no way to adjust the algorithm exposed by the hardware.


The primary reason we have a property for pauth at all is speed of emulation.  When we 
first enabled qarma5, we saw a major slowdown, with pauth_computepac consuming nearly 50% 
of the entire runtime.  Later we added impdef, as a way of doing *some* testing of pauth 
without the extreme overhead of qarma5.


I see that qarma3 does about half the work of qarma5, so it would be interesting to 
measure the relative speed of the 3 implementations on a boot of kernel + selftests.


You may want to look at the code generated and play with flatten and noinline attributes 
around pauth_computepac and subroutines.  E.g.


static uint64_t __attribute__((flatten, noinline))
pauth_computepac_qarma5(uint64_t data, uint64_t modifier, ARMPACKey key)
{
return pauth_computepac_architected(data, modifier, key, false);
}

static uint64_t __attribute__((flatten, noinline))
pauth_computepac_qarma3(uint64_t data, uint64_t modifier, ARMPACKey key)
{
return pauth_computepac_architected(data, modifier, key, true);
}

static uint64_t __attribute__((flatten, noinline))
pauth_computepac_impdef(uint64_t data, uint64_t modifier, ARMPACKey key)
{
return qemu_xxhash64_4(data, modifier, key.lo, key.hi);
}

static uint64_t pauth_computepac(CPUARMState *env, uint64_t data,
 uint64_t modifier, ARMPACKey key)
{
if (cpu_isar_feature(aa64_pauth_arch_qarma5, env_archcpu(env))) {
return pauth_computepac_qarma5(data, modifier, key);
} else if (cpu_isar_feature(aa64_pauth_arch_qarma3, env_archcpu(env))) {
return pauth_computepac_qarma3(data, modifier, key);
} else {
return pauth_computepac_impdef(data, modifier, key);
}
}


r~



Re: [PATCH v2 10/20] vfio/common: Record DMA mapped IOVA ranges

2023-02-22 Thread Alex Williamson
On Wed, 22 Feb 2023 19:49:05 +0200
Avihai Horon  wrote:

> From: Joao Martins 
> 
> According to the device DMA logging uAPI, IOVA ranges to be logged by
> the device must be provided all at once upon DMA logging start.
> 
> As preparation for the following patches which will add device dirty
> page tracking, keep a record of all DMA mapped IOVA ranges so later they
> can be used for DMA logging start.
> 
> Note that when vIOMMU is enabled DMA mapped IOVA ranges are not tracked.
> This is due to the dynamic nature of vIOMMU DMA mapping/unmapping.
> Following patches will address the vIOMMU case specifically.
> 
> Signed-off-by: Joao Martins 
> Signed-off-by: Avihai Horon 
> ---
>  include/hw/vfio/vfio-common.h |  3 ++
>  hw/vfio/common.c  | 86 +--
>  2 files changed, 86 insertions(+), 3 deletions(-)
> 
> diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
> index ee55d442b4..6f36876ce0 100644
> --- a/include/hw/vfio/vfio-common.h
> +++ b/include/hw/vfio/vfio-common.h
> @@ -23,6 +23,7 @@
>  
>  #include "exec/memory.h"
>  #include "qemu/queue.h"
> +#include "qemu/iova-tree.h"
>  #include "qemu/notify.h"
>  #include "ui/console.h"
>  #include "hw/display/ramfb.h"
> @@ -92,6 +93,8 @@ typedef struct VFIOContainer {
>  uint64_t max_dirty_bitmap_size;
>  unsigned long pgsizes;
>  unsigned int dma_max_mappings;
> +IOVATree *mappings;
> +QemuMutex mappings_mutex;
>  QLIST_HEAD(, VFIOGuestIOMMU) giommu_list;
>  QLIST_HEAD(, VFIOHostDMAWindow) hostwin_list;
>  QLIST_HEAD(, VFIOGroup) group_list;
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index 84f08bdbbb..6041da6c7e 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -44,6 +44,7 @@
>  #include "migration/blocker.h"
>  #include "migration/qemu-file.h"
>  #include "sysemu/tpm.h"
> +#include "qemu/iova-tree.h"
>  
>  VFIOGroupList vfio_group_list =
>  QLIST_HEAD_INITIALIZER(vfio_group_list);
> @@ -426,6 +427,11 @@ void vfio_unblock_multiple_devices_migration(void)
>  multiple_devices_migration_blocker = NULL;
>  }
>  
> +static bool vfio_have_giommu(VFIOContainer *container)
> +{
> +return !QLIST_EMPTY(&container->giommu_list);
> +}
> +
>  static void vfio_set_migration_error(int err)
>  {
>  MigrationState *ms = migrate_get_current();
> @@ -499,6 +505,51 @@ static bool 
> vfio_devices_all_running_and_mig_active(VFIOContainer *container)
>  return true;
>  }
>  
> +static int vfio_record_mapping(VFIOContainer *container, hwaddr iova,
> +   hwaddr size, bool readonly)
> +{
> +DMAMap map = {
> +.iova = iova,
> +.size = size - 1, /* IOVATree is inclusive, so subtract 1 from size 
> */
> +.perm = readonly ? IOMMU_RO : IOMMU_RW,
> +};
> +int ret;
> +
> +if (vfio_have_giommu(container)) {
> +return 0;
> +}
> +
> +WITH_QEMU_LOCK_GUARD(&container->mappings_mutex) {
> +ret = iova_tree_insert(container->mappings, &map);
> +if (ret) {
> +if (ret == IOVA_ERR_INVALID) {
> +ret = -EINVAL;
> +} else if (ret == IOVA_ERR_OVERLAP) {
> +ret = -EEXIST;
> +}
> +}
> +}
> +
> +return ret;
> +}
> +
> +static void vfio_erase_mapping(VFIOContainer *container, hwaddr iova,
> +hwaddr size)
> +{
> +DMAMap map = {
> +.iova = iova,
> +.size = size - 1, /* IOVATree is inclusive, so subtract 1 from size 
> */
> +};
> +
> +if (vfio_have_giommu(container)) {
> +return;
> +}
> +
> +WITH_QEMU_LOCK_GUARD(&container->mappings_mutex) {
> +iova_tree_remove(container->mappings, map);
> +}
> +}

Nit, 'insert' and 'remove' to match the IOVATree semantics?

>  static int vfio_dma_unmap_bitmap(VFIOContainer *container,
>   hwaddr iova, ram_addr_t size,
>   IOMMUTLBEntry *iotlb)
> @@ -599,6 +650,8 @@ static int vfio_dma_unmap(VFIOContainer *container,
>  DIRTY_CLIENTS_NOCODE);
>  }
>  
> +vfio_erase_mapping(container, iova, size);
> +
>  return 0;
>  }
>  
> @@ -612,6 +665,16 @@ static int vfio_dma_map(VFIOContainer *container, hwaddr 
> iova,
>  .iova = iova,
>  .size = size,
>  };
> +int ret;
> +
> +ret = vfio_record_mapping(container, iova, size, readonly);
> +if (ret) {
> +error_report("vfio: Failed to record mapping, iova: 0x%" HWADDR_PRIx
> + ", size: 0x" RAM_ADDR_FMT ", ret: %d (%s)",
> + iova, size, ret, strerror(-ret));
> +
> +return ret;
> +}

Is there no way to replay the mappings when a migration is started?
This seems like a horrible latency and bloat trade-off for the
possibility that the VM might migrate and the device might support
these features.  Our performance with vIOMMU is already t

Re: [PATCH v5 21/29] hw/net/net_tx_pkt: Automatically determine if virtio-net header is used

2023-02-22 Thread Akihiko Odaki

On 2023/02/21 12:38, Jason Wang wrote:


On 2023/2/1 11:35, Akihiko Odaki wrote:

The new function qemu_get_using_vnet_hdr() makes it possible to
automatically determine whether the virtio-net header is used.

Signed-off-by: Akihiko Odaki 
---
  hw/net/e1000e_core.c |  3 +--
  hw/net/net_tx_pkt.c  | 19 ++-
  hw/net/net_tx_pkt.h  |  3 +--
  hw/net/vmxnet3.c |  6 ++
  4 files changed, 14 insertions(+), 17 deletions(-)

diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index 38d374fba3..954a007151 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -3376,8 +3376,7 @@ e1000e_core_pci_realize(E1000ECore *core,
  qemu_add_vm_change_state_handler(e1000e_vm_state_change, core);
  for (i = 0; i < E1000E_NUM_QUEUES; i++) {
-    net_tx_pkt_init(&core->tx[i].tx_pkt, core->owner,
-    E1000E_MAX_TX_FRAGS, core->has_vnet);
+    net_tx_pkt_init(&core->tx[i].tx_pkt, core->owner, 
E1000E_MAX_TX_FRAGS);

  }
  net_rx_pkt_init(&core->rx_pkt, core->has_vnet);
diff --git a/hw/net/net_tx_pkt.c b/hw/net/net_tx_pkt.c
index 8a23899a4d..cf46c8457f 100644
--- a/hw/net/net_tx_pkt.c
+++ b/hw/net/net_tx_pkt.c
@@ -35,7 +35,6 @@ struct NetTxPkt {
  PCIDevice *pci_dev;
  struct virtio_net_hdr virt_hdr;
-    bool has_virt_hdr;



So this requires implicit coupling of NetTxPkt and a NetClientState (not 
self contained). This may work now but probably not in the future, e.g. 
when two packets are queued in a list and one packet has a vnet header 
but another doesn't?


Thanks


This patch is actually intended to remove the coupling of NetTxPkt and 
NetClientState. e1000e and igb have a loopback mode, and in this mode 
NetTxPkt needs to perform segmentation by itself even if the peer 
accepts a vnet header. However, before this patch, the has_virt_hdr flag 
was fixed in net_tx_pkt_init(), so it couldn't handle a case where one 
packet needs a vnet header and another doesn't.


This patch fixes such a case by deferring the decision whether to have a 
vnet header (and whether to offload segmentation) to the point when the 
packet is actually sent. This allows NetTxPkt to add a vnet header or 
not, depending on the situation.


Patch "e1000e: Perform software segmentation for loopback" further 
decouples NetTxPkt and NetClientState by introducing a new function, 
net_tx_pkt_send_custom(). Unlike net_tx_pkt_send(), 
net_tx_pkt_send_custom() does not need a NetClientState, and it is 
entirely up to the caller whether to have a vnet header or to offload 
segmentation.


Regards,
Akihiko Odaki





  struct iovec *raw;
  uint32_t raw_frags;
@@ -59,7 +58,7 @@ struct NetTxPkt {
  };
  void net_tx_pkt_init(struct NetTxPkt **pkt, PCIDevice *pci_dev,
-    uint32_t max_frags, bool has_virt_hdr)
+    uint32_t max_frags)
  {
  struct NetTxPkt *p = g_malloc0(sizeof *p);
@@ -71,10 +70,8 @@ void net_tx_pkt_init(struct NetTxPkt **pkt, 
PCIDevice *pci_dev,

  p->max_payload_frags = max_frags;
  p->max_raw_frags = max_frags;
-    p->has_virt_hdr = has_virt_hdr;
  p->vec[NET_TX_PKT_VHDR_FRAG].iov_base = &p->virt_hdr;
-    p->vec[NET_TX_PKT_VHDR_FRAG].iov_len =
-    p->has_virt_hdr ? sizeof p->virt_hdr : 0;
+    p->vec[NET_TX_PKT_VHDR_FRAG].iov_len = sizeof p->virt_hdr;
  p->vec[NET_TX_PKT_L2HDR_FRAG].iov_base = &p->l2_hdr;
  p->vec[NET_TX_PKT_L3HDR_FRAG].iov_base = &p->l3_hdr;
@@ -617,9 +614,11 @@ static bool net_tx_pkt_do_sw_fragmentation(struct 
NetTxPkt *pkt,

  bool net_tx_pkt_send(struct NetTxPkt *pkt, NetClientState *nc)
  {
+    bool using_vnet_hdr = qemu_get_using_vnet_hdr(nc->peer);
+
  assert(pkt);
-    if (!pkt->has_virt_hdr &&
+    if (!using_vnet_hdr &&
  pkt->virt_hdr.flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) {
  net_tx_pkt_do_sw_csum(pkt);
  }
@@ -636,11 +635,13 @@ bool net_tx_pkt_send(struct NetTxPkt *pkt, 
NetClientState *nc)

  }
  }
-    if (pkt->has_virt_hdr ||
+    if (using_vnet_hdr ||
  pkt->virt_hdr.gso_type == VIRTIO_NET_HDR_GSO_NONE) {
+    int index = using_vnet_hdr ?
+    NET_TX_PKT_VHDR_FRAG : NET_TX_PKT_L2HDR_FRAG;
  net_tx_pkt_fix_ip6_payload_len(pkt);
-    net_tx_pkt_sendv(pkt, nc, pkt->vec,
-    pkt->payload_frags + NET_TX_PKT_PL_START_FRAG);
+    net_tx_pkt_sendv(pkt, nc, pkt->vec + index,
+    pkt->payload_frags + NET_TX_PKT_PL_START_FRAG - index);
  return true;
  }
diff --git a/hw/net/net_tx_pkt.h b/hw/net/net_tx_pkt.h
index 2e38a5fa69..8d3faa42fb 100644
--- a/hw/net/net_tx_pkt.h
+++ b/hw/net/net_tx_pkt.h
@@ -32,10 +32,9 @@ struct NetTxPkt;
   * @pkt:    packet pointer
   * @pci_dev:    PCI device processing this packet
   * @max_frags:  max tx ip fragments
- * @has_virt_hdr:   device uses virtio header.
   */
  void net_tx_pkt_init(struct NetTxPkt **pkt, PCIDevice *pci_dev,
-    uint32_t max_frags, bool has_virt_hdr);
+    uint32_t max_frags);
  /**
   * Clean all tx packet resources.
diff --git a/hw/net/vmxnet3.c b/hw/ne

Re: [PATCH v10 0/9] KVM: mm: fd-based approach for supporting KVM

2023-02-22 Thread Sean Christopherson
On Thu, Feb 16, 2023, David Hildenbrand wrote:
> On 16.02.23 06:13, Mike Rapoport wrote:
> > Hi,
> > 
> > On Fri, Dec 02, 2022 at 02:13:38PM +0800, Chao Peng wrote:
> > > This patch series implements KVM guest private memory for confidential
> > > computing scenarios like Intel TDX[1]. If a TDX host accesses
> > > TDX-protected guest memory, machine check can happen which can further
> > > crash the running host system, this is terrible for multi-tenant
> > > configurations. The host accesses include those from KVM userspace like
> > > QEMU. This series addresses KVM userspace induced crash by introducing
> > > new mm and KVM interfaces so KVM userspace can still manage guest memory
> > > via a fd-based approach, but it can never access the guest memory
> > > content.
> > 
> > Sorry for jumping late.
> > 
> > Unless I'm missing something, hibernation will also cause a machine check
> > when there is TDX-protected memory in the system. When the hibernation
> > creates memory snapshot it essentially walks all physical pages and saves
> > their contents, so for TDX memory this will trigger machine check, right?

For hibernation specifically, I think that should be handled elsewhere as 
hibernation
is simply incompatible with TDX, SNP, pKVM, etc. without paravirtualizing the
guest, as none of those technologies support auto-export a la s390.  I suspect
the right approach is to disallow hibernation if KVM is running any protected 
guests.

> I recall bringing that up in the past (also memory access due to kdump,
> /prov/kcore) and was told that the main focus for now is preventing
> unprivileged users from crashing the system, that is, not mapping such
> memory into user space (e.g., QEMU). In the long run, we'll want to handle
> such pages also properly in the other events where the kernel might access
> them.

Ya, unless someone strongly objects, the plan is to essentially treat "attacks"
from privileged users as out of scope for initial support, and then iterate
as needed to fix/enable more features.

FWIW, read accesses, e.g. kdump, should be ok for TDX and SNP as they both play
nice with "bad" reads.  pKVM is a different beast though as I believe any access
to guest private memory will fault.  But my understanding is that this series
would be a big step forward for pKVM, which currently doesn't have any 
safeguards.



Re: [PATCH v2 07/20] vfio/common: Add VFIOBitmap and (de)alloc functions

2023-02-22 Thread Alex Williamson
On Wed, 22 Feb 2023 19:49:02 +0200
Avihai Horon  wrote:

> There are already two places where dirty page bitmap allocation and
> calculations are done in open code. With device dirty page tracking
> being added in next patches, there are going to be even more places.
> 
> To avoid code duplication, introduce VFIOBitmap struct and corresponding
> alloc and dealloc functions and use them where applicable.
> 
> Signed-off-by: Avihai Horon 
> ---
>  hw/vfio/common.c | 89 
>  1 file changed, 60 insertions(+), 29 deletions(-)
> 
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index ac93b85632..84f08bdbbb 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -320,6 +320,41 @@ const MemoryRegionOps vfio_region_ops = {
>   * Device state interfaces
>   */
>  
> +typedef struct {
> +unsigned long *bitmap;
> +hwaddr size;
> +hwaddr pages;
> +} VFIOBitmap;
> +
> +static VFIOBitmap *vfio_bitmap_alloc(hwaddr size)
> +{
> +VFIOBitmap *vbmap = g_try_new0(VFIOBitmap, 1);
> +if (!vbmap) {
> +errno = ENOMEM;
> +
> +return NULL;
> +}
> +
> +vbmap->pages = REAL_HOST_PAGE_ALIGN(size) / qemu_real_host_page_size();
> +vbmap->size = ROUND_UP(vbmap->pages, sizeof(__u64) * BITS_PER_BYTE) /
> + BITS_PER_BYTE;
> +vbmap->bitmap = g_try_malloc0(vbmap->size);
> +if (!vbmap->bitmap) {
> +g_free(vbmap);
> +errno = ENOMEM;
> +
> +return NULL;
> +}
> +
> +return vbmap;
> +}
> +
> +static void vfio_bitmap_dealloc(VFIOBitmap *vbmap)
> +{
> +g_free(vbmap->bitmap);
> +g_free(vbmap);
> +}

Nit, '_alloc' and '_free' seems like a more standard convention.
Thanks,

Alex




Re: [PATCH v2 6/7] target/arm: Implement v8.3 FPAC and FPACCOMBINE

2023-02-22 Thread Richard Henderson

On 2/22/23 09:35, Aaron Lindsay wrote:

+static G_NORETURN
+void pauth_fail_exception(CPUARMState *env, bool data, int keynumber, 
uintptr_t ra)
+{
+int target_el = arm_current_el(env);
+if (target_el == 0) {
+uint64_t hcr = arm_hcr_el2_eff(env);
+if (arm_is_el2_enabled(env) && (hcr & HCR_TGE))
+target_el = 2;
+else
+target_el = 1;
+}
+
+raise_exception_ra(env, EXCP_UDEF, syn_pacfail(data, keynumber), 
target_el, ra);


Use exception_target_el(), no need to check TGE here.


@@ -406,6 +421,16 @@ static uint64_t pauth_auth(CPUARMState *env, uint64_t ptr, 
uint64_t modifier,
  uint64_t xor_mask = MAKE_64BIT_MASK(bot_bit, top_bit - bot_bit + 1) &
  ~MAKE_64BIT_MASK(55, 1);
  result = ((ptr ^ pac) & xor_mask) | (ptr & ~xor_mask);
+if (cpu_isar_feature(aa64_fpac_combine, env_archcpu(env)) ||
+(cpu_isar_feature(aa64_fpac, env_archcpu(env)) &&
+ !is_combined)) {


Indentation is off.


+int error_code = ((data ? 1 : 0) << 1) | (keynumber);


'? 1 : 0' is not required.


r~



Re: [PATCH 0/3] hw/acpi/cpu_hotplug: Convert 'Object *device' -> 'DeviceState *parent'

2023-02-22 Thread Philippe Mathieu-Daudé

On 3/2/23 17:30, Philippe Mathieu-Daudé wrote:

To ease code review, rename ACPI CPU hotplug variables
to more meaningful names.

Since hotplug parent can't be any QOM object, and must be
a QDev, convert AcpiCpuHotplug::device from Object* to
DeviceState*.

Philippe Mathieu-Daudé (3):
   hw/acpi/cpu_hotplug: Rename gpe_cpu -> gpe
   hw/acpi/cpu_hotplug: Rename 'parent' MemoryRegion as 'container'
   hw/acpi/cpu_hotplug: Convert 'Object *device' -> 'DeviceState *parent'


ping



[PATCH] tcg: Allow displaying TCG_TYPE_I128 arguments

2023-02-22 Thread Philippe Mathieu-Daudé
Signed-off-by: Philippe Mathieu-Daudé 
---
 tcg/tcg.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index a4a3da6804..3df2c6a6af 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -1955,6 +1955,7 @@ static char *tcg_get_arg_str_ptr(TCGContext *s, char 
*buf, int buf_size,
 break;
 #if TCG_TARGET_REG_BITS > 32
 case TCG_TYPE_I64:
+case TCG_TYPE_I128:
 snprintf(buf, buf_size, "$0x%" PRIx64, ts->val);
 break;
 #endif
-- 
2.38.1




[PATCH v2 2/3] contrib/elf2dmp: move PE dir search to pe_get_data_dir_entry

2023-02-22 Thread Viktor Prutyanov
Move the PE directory search functionality out into pe_get_data_dir_entry()
so it can be reused not only for Debug Directory processing but for an
arbitrary PE directory.
Signed-off-by: Viktor Prutyanov 
---
 contrib/elf2dmp/main.c | 71 +-
 1 file changed, 42 insertions(+), 29 deletions(-)

diff --git a/contrib/elf2dmp/main.c b/contrib/elf2dmp/main.c
index 9224764239..2f6028d8eb 100644
--- a/contrib/elf2dmp/main.c
+++ b/contrib/elf2dmp/main.c
@@ -333,6 +333,45 @@ static int fill_context(KDDEBUGGER_DATA64 *kdbg,
 return 0;
 }
 
+static int pe_get_data_dir_entry(uint64_t base, void *start_addr, int idx,
+void *entry, size_t size, struct va_space *vs)
+{
+const char e_magic[2] = "MZ";
+const char Signature[4] = "PE\0\0";
+IMAGE_DOS_HEADER *dos_hdr = start_addr;
+IMAGE_NT_HEADERS64 nt_hdrs;
+IMAGE_FILE_HEADER *file_hdr = &nt_hdrs.FileHeader;
+IMAGE_OPTIONAL_HEADER64 *opt_hdr = &nt_hdrs.OptionalHeader;
+IMAGE_DATA_DIRECTORY *data_dir = nt_hdrs.OptionalHeader.DataDirectory;
+
+QEMU_BUILD_BUG_ON(sizeof(*dos_hdr) >= ELF2DMP_PAGE_SIZE);
+
+if (memcmp(&dos_hdr->e_magic, e_magic, sizeof(e_magic))) {
+return 1;
+}
+
+if (va_space_rw(vs, base + dos_hdr->e_lfanew,
+&nt_hdrs, sizeof(nt_hdrs), 0)) {
+return 1;
+}
+
+if (memcmp(&nt_hdrs.Signature, Signature, sizeof(Signature)) ||
+file_hdr->Machine != 0x8664 || opt_hdr->Magic != 0x020b) {
+return 1;
+}
+
+if (va_space_rw(vs,
+base + data_dir[idx].VirtualAddress,
+entry, size, 0)) {
+return 1;
+}
+
+printf("Data directory entry #%d: RVA = 0x%08"PRIx32"\n", idx,
+(uint32_t)data_dir[idx].VirtualAddress);
+
+return 0;
+}
+
 static int write_dump(struct pa_space *ps,
 WinDumpHeader64 *hdr, const char *name)
 {
@@ -369,42 +408,16 @@ static int write_dump(struct pa_space *ps,
 static int pe_get_pdb_symstore_hash(uint64_t base, void *start_addr,
 char *hash, struct va_space *vs)
 {
-const char e_magic[2] = "MZ";
-const char Signature[4] = "PE\0\0";
 const char sign_rsds[4] = "RSDS";
-IMAGE_DOS_HEADER *dos_hdr = start_addr;
-IMAGE_NT_HEADERS64 nt_hdrs;
-IMAGE_FILE_HEADER *file_hdr = &nt_hdrs.FileHeader;
-IMAGE_OPTIONAL_HEADER64 *opt_hdr = &nt_hdrs.OptionalHeader;
-IMAGE_DATA_DIRECTORY *data_dir = nt_hdrs.OptionalHeader.DataDirectory;
 IMAGE_DEBUG_DIRECTORY debug_dir;
 OMFSignatureRSDS rsds;
 char *pdb_name;
 size_t pdb_name_sz;
 size_t i;
 
-QEMU_BUILD_BUG_ON(sizeof(*dos_hdr) >= ELF2DMP_PAGE_SIZE);
-
-if (memcmp(&dos_hdr->e_magic, e_magic, sizeof(e_magic))) {
-return 1;
-}
-
-if (va_space_rw(vs, base + dos_hdr->e_lfanew,
-&nt_hdrs, sizeof(nt_hdrs), 0)) {
-return 1;
-}
-
-if (memcmp(&nt_hdrs.Signature, Signature, sizeof(Signature)) ||
-file_hdr->Machine != 0x8664 || opt_hdr->Magic != 0x020b) {
-return 1;
-}
-
-printf("Debug Directory RVA = 0x%08"PRIx32"\n",
-(uint32_t)data_dir[IMAGE_FILE_DEBUG_DIRECTORY].VirtualAddress);
-
-if (va_space_rw(vs,
-base + data_dir[IMAGE_FILE_DEBUG_DIRECTORY].VirtualAddress,
-&debug_dir, sizeof(debug_dir), 0)) {
+if (pe_get_data_dir_entry(base, start_addr, IMAGE_FILE_DEBUG_DIRECTORY,
+&debug_dir, sizeof(debug_dir), vs)) {
+eprintf("Failed to get Debug Directory\n");
 return 1;
 }
 
-- 
2.35.1




[PATCH v2 3/3] contrib/elf2dmp: add PE name check and Windows Server 2022 support

2023-02-22 Thread Viktor Prutyanov
Since its inception, elf2dmp has checked for MZ signatures within the
address space above the IDT[0] interrupt vector and taken the first PE
image found as the Windows kernel.
But in a Windows Server 2022 memory dump this address space range is
full of invalid PE fragments, so the tool must check that the PE image
actually is 'ntoskrnl.exe'.
So, introduce additional validation by checking the image name from the
Export Directory against 'ntoskrnl.exe'.

Signed-off-by: Viktor Prutyanov 
Tested-by: Yuri Benditovich 
---
 contrib/elf2dmp/main.c | 28 ++--
 contrib/elf2dmp/pe.h   | 15 +++
 2 files changed, 41 insertions(+), 2 deletions(-)

diff --git a/contrib/elf2dmp/main.c b/contrib/elf2dmp/main.c
index 2f6028d8eb..89f0c69ab0 100644
--- a/contrib/elf2dmp/main.c
+++ b/contrib/elf2dmp/main.c
@@ -17,6 +17,7 @@
 
 #define SYM_URL_BASE"https://msdl.microsoft.com/download/symbols/"
 #define PDB_NAME"ntkrnlmp.pdb"
+#define PE_NAME "ntoskrnl.exe"
 
 #define INITIAL_MXCSR   0x1f80
 
@@ -405,6 +406,25 @@ static int write_dump(struct pa_space *ps,
 return fclose(dmp_file);
 }
 
+static bool pe_check_export_name(uint64_t base, void *start_addr,
+struct va_space *vs)
+{
+IMAGE_EXPORT_DIRECTORY export_dir;
+const char *pe_name;
+
+if (pe_get_data_dir_entry(base, start_addr, IMAGE_FILE_EXPORT_DIRECTORY,
+&export_dir, sizeof(export_dir), vs)) {
+return false;
+}
+
+pe_name = va_space_resolve(vs, base + export_dir.Name);
+if (!pe_name) {
+return false;
+}
+
+return !strcmp(pe_name, PE_NAME);
+}
+
 static int pe_get_pdb_symstore_hash(uint64_t base, void *start_addr,
 char *hash, struct va_space *vs)
 {
@@ -489,6 +509,7 @@ int main(int argc, char *argv[])
 uint64_t KdDebuggerDataBlock;
 KDDEBUGGER_DATA64 *kdbg;
 uint64_t KdVersionBlock;
+bool kernel_found = false;
 
 if (argc != 3) {
 eprintf("usage:\n\t%s elf_file dmp_file\n", argv[0]);
@@ -536,11 +557,14 @@ int main(int argc, char *argv[])
 }
 
 if (*(uint16_t *)nt_start_addr == 0x5a4d) { /* MZ */
-break;
+if (pe_check_export_name(KernBase, nt_start_addr, &vs)) {
+kernel_found = true;
+break;
+}
 }
 }
 
-if (!nt_start_addr) {
+if (!kernel_found) {
 eprintf("Failed to find NT kernel image\n");
 err = 1;
 goto out_ps;
diff --git a/contrib/elf2dmp/pe.h b/contrib/elf2dmp/pe.h
index 807d006364..71126af1ac 100644
--- a/contrib/elf2dmp/pe.h
+++ b/contrib/elf2dmp/pe.h
@@ -88,6 +88,20 @@ typedef struct IMAGE_NT_HEADERS64 {
 IMAGE_OPTIONAL_HEADER64 OptionalHeader;
 } __attribute__ ((packed)) IMAGE_NT_HEADERS64;
 
+typedef struct IMAGE_EXPORT_DIRECTORY {
+uint32_tCharacteristics;
+uint32_tTimeDateStamp;
+uint16_tMajorVersion;
+uint16_tMinorVersion;
+uint32_tName;
+uint32_tBase;
+uint32_tNumberOfFunctions;
+uint32_tNumberOfNames;
+uint32_tAddressOfFunctions;
+uint32_tAddressOfNames;
+uint32_tAddressOfNameOrdinals;
+} __attribute__ ((packed)) IMAGE_EXPORT_DIRECTORY;
+
 typedef struct IMAGE_DEBUG_DIRECTORY {
 uint32_t Characteristics;
 uint32_t TimeDateStamp;
@@ -102,6 +116,7 @@ typedef struct IMAGE_DEBUG_DIRECTORY {
 #define IMAGE_DEBUG_TYPE_CODEVIEW   2
 #endif
 
+#define IMAGE_FILE_EXPORT_DIRECTORY 0
 #define IMAGE_FILE_DEBUG_DIRECTORY  6
 
 typedef struct guid_t {
-- 
2.35.1




[PATCH v2 1/3] contrib/elf2dmp: fix code style

2023-02-22 Thread Viktor Prutyanov
Originally elf2dmp was added with some code style issues,
especially in the pe.h header, and some more were introduced by
2d0fc797faaa73fbc1d30f5f9e90407bf3dd93f0. Fix them now.

Signed-off-by: Viktor Prutyanov 
---
 contrib/elf2dmp/addrspace.c |   1 +
 contrib/elf2dmp/main.c  |   9 ++--
 contrib/elf2dmp/pe.h| 100 ++--
 3 files changed, 57 insertions(+), 53 deletions(-)

diff --git a/contrib/elf2dmp/addrspace.c b/contrib/elf2dmp/addrspace.c
index 53ded17061..0b04cba00e 100644
--- a/contrib/elf2dmp/addrspace.c
+++ b/contrib/elf2dmp/addrspace.c
@@ -11,6 +11,7 @@
 static struct pa_block *pa_space_find_block(struct pa_space *ps, uint64_t pa)
 {
 size_t i;
+
 for (i = 0; i < ps->block_nr; i++) {
 if (ps->block[i].paddr <= pa &&
 pa <= ps->block[i].paddr + ps->block[i].size) {
diff --git a/contrib/elf2dmp/main.c b/contrib/elf2dmp/main.c
index d77b8f98f7..9224764239 100644
--- a/contrib/elf2dmp/main.c
+++ b/contrib/elf2dmp/main.c
@@ -282,14 +282,16 @@ static int fill_header(WinDumpHeader64 *hdr, struct 
pa_space *ps,
 };
 
 for (i = 0; i < ps->block_nr; i++) {
-h.PhysicalMemoryBlock.NumberOfPages += ps->block[i].size / 
ELF2DMP_PAGE_SIZE;
+h.PhysicalMemoryBlock.NumberOfPages +=
+ps->block[i].size / ELF2DMP_PAGE_SIZE;
 h.PhysicalMemoryBlock.Run[i] = (WinDumpPhyMemRun64) {
 .BasePage = ps->block[i].paddr / ELF2DMP_PAGE_SIZE,
 .PageCount = ps->block[i].size / ELF2DMP_PAGE_SIZE,
 };
 }
 
-h.RequiredDumpSpace += h.PhysicalMemoryBlock.NumberOfPages << 
ELF2DMP_PAGE_BITS;
+h.RequiredDumpSpace +=
+h.PhysicalMemoryBlock.NumberOfPages << ELF2DMP_PAGE_BITS;
 
 *hdr = h;
 
@@ -299,7 +301,8 @@ static int fill_header(WinDumpHeader64 *hdr, struct 
pa_space *ps,
 static int fill_context(KDDEBUGGER_DATA64 *kdbg,
 struct va_space *vs, QEMU_Elf *qe)
 {
-int i;
+int i;
+
 for (i = 0; i < qe->state_nr; i++) {
 uint64_t Prcb;
 uint64_t Context;
diff --git a/contrib/elf2dmp/pe.h b/contrib/elf2dmp/pe.h
index c2a4a6ba7c..807d006364 100644
--- a/contrib/elf2dmp/pe.h
+++ b/contrib/elf2dmp/pe.h
@@ -33,70 +33,70 @@ typedef struct IMAGE_DOS_HEADER {
 } __attribute__ ((packed)) IMAGE_DOS_HEADER;
 
 typedef struct IMAGE_FILE_HEADER {
-  uint16_t  Machine;
-  uint16_t  NumberOfSections;
-  uint32_t  TimeDateStamp;
-  uint32_t  PointerToSymbolTable;
-  uint32_t  NumberOfSymbols;
-  uint16_t  SizeOfOptionalHeader;
-  uint16_t  Characteristics;
+uint16_t  Machine;
+uint16_t  NumberOfSections;
+uint32_t  TimeDateStamp;
+uint32_t  PointerToSymbolTable;
+uint32_t  NumberOfSymbols;
+uint16_t  SizeOfOptionalHeader;
+uint16_t  Characteristics;
 } __attribute__ ((packed)) IMAGE_FILE_HEADER;
 
 typedef struct IMAGE_DATA_DIRECTORY {
-  uint32_t VirtualAddress;
-  uint32_t Size;
+uint32_t VirtualAddress;
+uint32_t Size;
 } __attribute__ ((packed)) IMAGE_DATA_DIRECTORY;
 
 #define IMAGE_NUMBEROF_DIRECTORY_ENTRIES 16
 
 typedef struct IMAGE_OPTIONAL_HEADER64 {
-  uint16_t  Magic; /* 0x20b */
-  uint8_t   MajorLinkerVersion;
-  uint8_t   MinorLinkerVersion;
-  uint32_t  SizeOfCode;
-  uint32_t  SizeOfInitializedData;
-  uint32_t  SizeOfUninitializedData;
-  uint32_t  AddressOfEntryPoint;
-  uint32_t  BaseOfCode;
-  uint64_t  ImageBase;
-  uint32_t  SectionAlignment;
-  uint32_t  FileAlignment;
-  uint16_t  MajorOperatingSystemVersion;
-  uint16_t  MinorOperatingSystemVersion;
-  uint16_t  MajorImageVersion;
-  uint16_t  MinorImageVersion;
-  uint16_t  MajorSubsystemVersion;
-  uint16_t  MinorSubsystemVersion;
-  uint32_t  Win32VersionValue;
-  uint32_t  SizeOfImage;
-  uint32_t  SizeOfHeaders;
-  uint32_t  CheckSum;
-  uint16_t  Subsystem;
-  uint16_t  DllCharacteristics;
-  uint64_t  SizeOfStackReserve;
-  uint64_t  SizeOfStackCommit;
-  uint64_t  SizeOfHeapReserve;
-  uint64_t  SizeOfHeapCommit;
-  uint32_t  LoaderFlags;
-  uint32_t  NumberOfRvaAndSizes;
-  IMAGE_DATA_DIRECTORY DataDirectory[IMAGE_NUMBEROF_DIRECTORY_ENTRIES];
+uint16_t  Magic; /* 0x20b */
+uint8_t   MajorLinkerVersion;
+uint8_t   MinorLinkerVersion;
+uint32_t  SizeOfCode;
+uint32_t  SizeOfInitializedData;
+uint32_t  SizeOfUninitializedData;
+uint32_t  AddressOfEntryPoint;
+uint32_t  BaseOfCode;
+uint64_t  ImageBase;
+uint32_t  SectionAlignment;
+uint32_t  FileAlignment;
+uint16_t  MajorOperatingSystemVersion;
+uint16_t  MinorOperatingSystemVersion;
+uint16_t  MajorImageVersion;
+uint16_t  MinorImageVersion;
+uint16_t  MajorSubsystemVersion;
+uint16_t  MinorSubsystemVersion;
+uint32_t  Win32VersionValue;
+uint32_t  SizeOfImage;
+uint32_t  SizeOfHeaders;
+uint32_t  CheckSum;
+uint16_t  Subsystem;
+uint16_t  DllCharacteristics;
+uint64_t  SizeOfStackReserve;
+uint64_t  SizeOfStackCommit;
+uint64_t  SizeOfHeapReserve;
+uint64_t  SizeOfHeapComm

[PATCH v2 0/3] contrib/elf2dmp: Windows Server 2022 support

2023-02-22 Thread Viktor Prutyanov
Hi,

For now, elf2dmp is unable to convert an ELF dump of a Windows Server 2022
guest to a DMP dump. This patch series fixes that.

v1: improve code-style fix
v2: don't remove data directory entry RVA print and DOS header size check

Viktor Prutyanov (3):
  contrib/elf2dmp: fix code style
  contrib/elf2dmp: move PE dir search to pe_get_data_dir_entry
  contrib/elf2dmp: add PE name check and Windows Server 2022 support

 contrib/elf2dmp/addrspace.c |   1 +
 contrib/elf2dmp/main.c  | 108 ++---
 contrib/elf2dmp/pe.h| 115 
 3 files changed, 140 insertions(+), 84 deletions(-)

-- 
2.35.1




Re: [PATCH 0/5] Pegasos2 fixes and audio output support

2023-02-22 Thread BALATON Zoltan

On Wed, 22 Feb 2023, Bernhard Beschow wrote:

Am 22. Februar 2023 19:25:16 UTC schrieb BALATON Zoltan :

On Wed, 22 Feb 2023, Bernhard Beschow wrote:

On Wed, Feb 22, 2023 at 4:38 PM Bernhard Beschow  wrote:

On Tue, Feb 21, 2023 at 7:44 PM BALATON Zoltan  wrote:

This series fixes PCI interrupts on the ppc/pegasos2 machine and adds
partial implementation of the via-ac97 sound part enough to get audio
output. I'd like this to be merged for QEMU 8.0.

Regards,
BALATON Zoltan

BALATON Zoltan (5):
  hw/isa/vt82c686: Implement interrupt routing in via_isa_set_irq
  hw/isa/vt82c686: Implement PIRQ pins
  hw/ppc/pegasos2: Fix PCI interrupt routing
  hw/audio/ac97: Split off some definitions to a header
  hw/audio/via-ac97: Basic implementation of audio playback

 hw/audio/ac97.c|  43 +---
 hw/audio/ac97.h|  65 ++
 hw/audio/trace-events  |   6 +
 hw/audio/via-ac97.c| 436 -
 hw/ide/via.c   |   2 +-
 hw/isa/vt82c686.c  |  61 +-
 hw/pci-host/mv64361.c  |   4 -
 hw/ppc/pegasos2.c  |  26 ++-
 hw/usb/vt82c686-uhci-pci.c |   5 +-
 include/hw/isa/vt82c686.h  |  39 +++-
 10 files changed, 626 insertions(+), 61 deletions(-)
 create mode 100644 hw/audio/ac97.h

--
2.30.7



Wow, the MorphOS people paid attention to sound design. Thanks for
presenting it to us, Zoltan!

I've had a closer look at your series and I think it can be simplified:
Patch 2 can be implemented quite straight-forward like I proposed in a
private mail: https://github.com/shentok/qemu/commit/via-priq-routing.
Then, in order to make patch 3 "hw/ppc/pegasos2: Fix PCI interrupt routing"
working, one can expose the PCI interrupts with a single line like you do
in patch 2. With this, patch 1 "hw/isa/vt82c686: Implement interrupt
routing in via_isa_set_irq" isn't needed any longer and can be omitted.

In via-ac97, rather than using via_isa_set_irq(), pci_set_irq() can be
used instead. pci_set_irq() internally takes care of all the ISA interrupt
level tracking patch 1 attempted to address.



Here is a proof of concept branch to demonstrate that the simplification
actually works: https://github.com/shentok/qemu/commits/pegasos2 (Tested
with MorphOS with and without pegasos2.rom).


Does this only work because both the via-ac97 and the PCI interrupts are mapped 
to the same ISA IRQ and you've only tested sound? The guest could configure 
each device to use a different IRQ, also mapping them so they share one ISA 
interrupt. What happens if multiple devices are mapped to IRQ 9 (which is the 
case on pegasos2 where PCI cards, ac97 and USB all share this IRQ) and more 
than one such device wants to raise an interrupt at the same time? If you ack 
the ac97 interrupt but a PCI network card or the USB part still wants to get 
the CPU's attention, the ISA IRQ should remain raised until all devices are 
serviced.


pci_bus_get_irq_level(), used in via_isa_set_pci_irq(), should handle
exactly that case very well.


I don't see a way to track the status of all devices in a single qemu_irq which 
can only be up or down so we need something to store the state of each source.


pci_set_irq() causes pci_bus_change_irq_level() to be called.
pci_bus_change_irq_level() tracks the sum of all irq levels of all
devices attached to a particular pin in irq_count. Have a look at
pci_bus_change_irq_level() and you will understand better.


I'm aware of that, we're using that in sam460ex which connects all PCI 
interrupt lines to a single IRQ and Peter explored and explained it in a 
comment there when that was discovered. First we had a patch with or-irq 
but due to this behavior that's not needed for PCI interrupts. But the 
VT8231 could change what ISA IRQ you route the sub functions to. It 
happens that on pegasos2 by default all of those are routed to IRQ9 except 
IDE but what if a guest changes ac97 to use a different interrupt? Then 
it's not a PCI interrupt any more so you can't use pci_set_irq in 
via-ac97. There are only 4 PCI INT lines but the VIA components can be 
routed to 13 or 14 ISA IRQs. How do you keep track of that with only the 
PCI bus interrupts? I don't get your approach.



My patch adds a state register to each ISA IRQ line for all possible sources 
which could probably be stored once but then for each change of ISA IRQ status 
all the mapped devices should be checked and combined so it's easier to store 
them for each IRQ. Does your approach still work if you play sound, and copy 
something from network to a USB device at the same time? (I'm not sure mine 
does not have remaining bugs but I don't think this can be simplified that way 
but if you can prove it would work I don't mind taking an alternative version 
but I'm not convinced yet.)


Well, I can't prove that my approach works but unfortunately I can
prove that both our approaches cause a freeze :/ Try:
1. Start `qemu-system-ppc -M pegasos2 -bios pegasos2.rom -rtc
base=localtime -device ati-vga

Re: [PATCH 0/5] Pegasos2 fixes and audio output support

2023-02-22 Thread Bernhard Beschow
Am 22. Februar 2023 19:25:16 UTC schrieb BALATON Zoltan :
>On Wed, 22 Feb 2023, Bernhard Beschow wrote:
>> On Wed, Feb 22, 2023 at 4:38 PM Bernhard Beschow  wrote:
>>> On Tue, Feb 21, 2023 at 7:44 PM BALATON Zoltan  wrote:
 This series fixes PCI interrupts on the ppc/pegasos2 machine and adds
 partial implementation of the via-ac97 sound part enough to get audio
 output. I'd like this to be merged for QEMU 8.0.

 Regards,
 BALATON Zoltan

 BALATON Zoltan (5):
   hw/isa/vt82c686: Implement interrupt routing in via_isa_set_irq
   hw/isa/vt82c686: Implement PIRQ pins
   hw/ppc/pegasos2: Fix PCI interrupt routing
   hw/audio/ac97: Split off some definitions to a header
   hw/audio/via-ac97: Basic implementation of audio playback

  hw/audio/ac97.c|  43 +---
  hw/audio/ac97.h|  65 ++
  hw/audio/trace-events  |   6 +
  hw/audio/via-ac97.c| 436 -
  hw/ide/via.c   |   2 +-
  hw/isa/vt82c686.c  |  61 +-
  hw/pci-host/mv64361.c  |   4 -
  hw/ppc/pegasos2.c  |  26 ++-
  hw/usb/vt82c686-uhci-pci.c |   5 +-
  include/hw/isa/vt82c686.h  |  39 +++-
  10 files changed, 626 insertions(+), 61 deletions(-)
  create mode 100644 hw/audio/ac97.h

 --
 2.30.7


>>> Wow, the MorphOS people paid attention to sound design. Thanks for
>>> presenting it to us, Zoltan!
>>>
>>> I've had a closer look at your series and I think it can be simplified:
>>> Patch 2 can be implemented quite straight-forward like I proposed in a
>>> private mail: https://github.com/shentok/qemu/commit/via-priq-routing.
>>> Then, in order to make patch 3 "hw/ppc/pegasos2: Fix PCI interrupt routing"
>>> working, one can expose the PCI interrupts with a single line like you do
>>> in patch 2. With this, patch 1 "hw/isa/vt82c686: Implement interrupt
>>> routing in via_isa_set_irq" isn't needed any longer and can be omitted.
>>>
>>> In via-ac97, rather than using via_isa_set_irq(), pci_set_irq() can be
>>> used instead. pci_set_irq() internally takes care of all the ISA interrupt
>>> level tracking patch 1 attempted to address.
>>>
>>
>> Here is a proof of concept branch to demonstrate that the simplification
>> actually works: https://github.com/shentok/qemu/commits/pegasos2 (Tested
>> with MorphOS with and without pegasos2.rom).
>
>Does this only work because both the via-ac97 and the PCI interrupts are 
>mapped to the same ISA IRQ and you've only tested sound? The guest could 
>configure each device to use a different IRQ, also mapping them so they share 
>one ISA interrupt. What happens if multiple devices are mapped to IRQ 9 (which 
>is the case on pegasos2 where PCI cards, ac97 and USB all share this IRQ) and 
>more than one such device wants to raise an interrupt at the same time? If you 
>ack the ac97 interrupt but a PCI network card or the USB part still wants to 
>get the CPU's attention, the ISA IRQ should remain raised until all devices are 
>serviced.

pci_bus_get_irq_level(), used in via_isa_set_pci_irq(), should handle
exactly that case very well.

>I don't see a way to track the status of all devices in a single qemu_irq 
>which can only be up or down so we need something to store the state of each 
>source.

pci_set_irq() causes pci_bus_change_irq_level() to be called.
pci_bus_change_irq_level() tracks the sum of all irq levels of all
devices attached to a particular pin in irq_count. Have a look at
pci_bus_change_irq_level() and you will understand better.
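The counting scheme described above can be sketched in a few lines; the names below are illustrative stand-ins, not QEMU's actual API:

```c
#include <assert.h>

/* Sketch of the irq_count idea: every device contributes +1 when it
 * raises its line and -1 when it lowers it, and the shared pin is
 * asserted while the counter is non-zero. Illustrative names only. */
typedef struct {
    int irq_count; /* sum of all source levels mapped to this pin */
} SharedIrqPin;

static void pin_change_level(SharedIrqPin *pin, int change)
{
    pin->irq_count += change;
}

static int pin_level(const SharedIrqPin *pin)
{
    /* The line stays raised until every source has been serviced. */
    return pin->irq_count != 0;
}
```

This is why acking one device does not drop a shared line while another source is still pending.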

>My patch adds a state register to each ISA IRQ line for all possible sources 
>which could probably be stored once but then for each change of ISA IRQ status 
>all the mapped devices should be checked and combined so it's easier to store 
>them for each IRQ. Does your approach still work if you play sound, and copy 
>something from network to a USB device at the same time? (I'm not sure mine 
>does not have remaining bugs but I don't think this can be simplified that way 
>but if you can prove it would work I don't mind taking an alternative version 
>but I'm not convinced yet.)

Well, I can't prove that my approach works but unfortunately I can
prove that both our approaches cause a freeze :/ Try:
1. Start `qemu-system-ppc -M pegasos2 -bios pegasos2.rom -rtc
base=localtime -device ati-vga,guest_hwcursor=true,romfile="" -cdrom
morphos-3.17.iso -device usb-mouse -device usb-kbd`
2. Move the mouse while sound is playing
-> Observe the VM to freeze

So there must be an issue somewhere else...

Best regards,
Bernhard
>
>Regards,
>BALATON Zoltan
>
>>> I might have further comments but I think it's enough for now.
>>>
>>> Thanks again for making via-ac97 work!
>>>
>>> Best regards,
>>> Bernhard
>>>
>>



Re: [PATCH v1 2/3] contrib/elf2dmp: move PE dir search to pe_get_data_dir_entry

2023-02-22 Thread Viktor Prutyanov
Hello,

On Wed, Feb 22, 2023 at 10:07 PM Annie.li  wrote:
>
> Hello Viktor,
>
> See my following comments inline,
>
> On 11/29/2022 7:03 PM, Viktor Prutyanov wrote:
> > Move out PE directory search functionality to be reused not only
> > for Debug Directory processing but for arbitrary PE directory.
> >
> > Signed-off-by: Viktor Prutyanov 
> > ---
> >   contrib/elf2dmp/main.c | 66 +++---
> >   1 file changed, 37 insertions(+), 29 deletions(-)
> >
> > diff --git a/contrib/elf2dmp/main.c b/contrib/elf2dmp/main.c
> > index 9224764239..f3052b3c64 100644
> > --- a/contrib/elf2dmp/main.c
> > +++ b/contrib/elf2dmp/main.c
> > @@ -333,6 +333,40 @@ static int fill_context(KDDEBUGGER_DATA64 *kdbg,
> >   return 0;
> >   }
> >
> > +static int pe_get_data_dir_entry(uint64_t base, void *start_addr, int idx,
> > +void *entry, size_t size, struct va_space *vs)
> > +{
> > +const char e_magic[2] = "MZ";
> > +const char Signature[4] = "PE\0\0";
> > +IMAGE_DOS_HEADER *dos_hdr = start_addr;
> > +IMAGE_NT_HEADERS64 nt_hdrs;
> > +IMAGE_FILE_HEADER *file_hdr = &nt_hdrs.FileHeader;
> > +IMAGE_OPTIONAL_HEADER64 *opt_hdr = &nt_hdrs.OptionalHeader;
> > +IMAGE_DATA_DIRECTORY *data_dir = nt_hdrs.OptionalHeader.DataDirectory;
> > +
> > +if (memcmp(&dos_hdr->e_magic, e_magic, sizeof(e_magic))) {
> > +return 1;
> > +}
> > +
> > +if (va_space_rw(vs, base + dos_hdr->e_lfanew,
> > +&nt_hdrs, sizeof(nt_hdrs), 0)) {
> > +return 1;
> > +}
> > +
> > +if (memcmp(&nt_hdrs.Signature, Signature, sizeof(Signature)) ||
> > +file_hdr->Machine != 0x8664 || opt_hdr->Magic != 0x020b) {
> > +return 1;
> > +}
> > +
> > +if (va_space_rw(vs,
> > +base + data_dir[idx].VirtualAddress,
> > +entry, size, 0)) {
> > +return 1;
> > +}
> > +
> > +return 0;
> > +}
> > +
> >   static int write_dump(struct pa_space *ps,
> >   WinDumpHeader64 *hdr, const char *name)
> >   {
> > @@ -369,42 +403,16 @@ static int write_dump(struct pa_space *ps,
> >   static int pe_get_pdb_symstore_hash(uint64_t base, void *start_addr,
> >   char *hash, struct va_space *vs)
> >   {
> > -const char e_magic[2] = "MZ";
> > -const char Signature[4] = "PE\0\0";
> >   const char sign_rsds[4] = "RSDS";
> > -IMAGE_DOS_HEADER *dos_hdr = start_addr;
> > -IMAGE_NT_HEADERS64 nt_hdrs;
> > -IMAGE_FILE_HEADER *file_hdr = &nt_hdrs.FileHeader;
> > -IMAGE_OPTIONAL_HEADER64 *opt_hdr = &nt_hdrs.OptionalHeader;
> > -IMAGE_DATA_DIRECTORY *data_dir = nt_hdrs.OptionalHeader.DataDirectory;
> >   IMAGE_DEBUG_DIRECTORY debug_dir;
> >   OMFSignatureRSDS rsds;
> >   char *pdb_name;
> >   size_t pdb_name_sz;
> >   size_t i;
> >
> > -QEMU_BUILD_BUG_ON(sizeof(*dos_hdr) >= ELF2DMP_PAGE_SIZE);
>
> This BUG_ON gets removed due to encapsulating the code into function
> pe_get_data_dir_entry.
>
> Any reason of not keeping this check in pe_get_data_dir_entry?
> > -
> > -if (memcmp(&dos_hdr->e_magic, e_magic, sizeof(e_magic))) {
> > -return 1;
> > -}
> > -
> > -if (va_space_rw(vs, base + dos_hdr->e_lfanew,
> > -&nt_hdrs, sizeof(nt_hdrs), 0)) {
> > -return 1;
> > -}
> > -
> > -if (memcmp(&nt_hdrs.Signature, Signature, sizeof(Signature)) ||
> > -file_hdr->Machine != 0x8664 || opt_hdr->Magic != 0x020b) {
> > -return 1;
> > -}
> > -
> > -printf("Debug Directory RVA = 0x%08"PRIx32"\n",
> > -(uint32_t)data_dir[IMAGE_FILE_DEBUG_DIRECTORY].VirtualAddress);
>
> Or add common log for both Debug and PE directory instead of removing it?

Sounds reasonable, I will send a new version.

Best regards,
Viktor Prutyanov

>
> Thanks
>
> Annie
>
> > -
> > -if (va_space_rw(vs,
> > -base + data_dir[IMAGE_FILE_DEBUG_DIRECTORY].VirtualAddress,
> > -&debug_dir, sizeof(debug_dir), 0)) {
> > +if (pe_get_data_dir_entry(base, start_addr, IMAGE_FILE_DEBUG_DIRECTORY,
> > +&debug_dir, sizeof(debug_dir), vs)) {
> > +eprintf("Failed to get Debug Directory\n");
> >   return 1;
> >   }
> >
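The header checks being factored out in this patch boil down to three magics; a standalone sketch (hypothetical helper for illustration, not the patch's actual code):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Standalone sketch of the validation pe_get_data_dir_entry performs:
 * "MZ" at offset 0, "PE\0\0" at e_lfanew, then the x86-64 machine type
 * (0x8664) and the PE32+ optional-header magic (0x020b). Hypothetical
 * helper, not the patch's actual code. */
static int looks_like_pe64(const uint8_t *img, uint32_t e_lfanew,
                           uint16_t machine, uint16_t opt_magic)
{
    if (memcmp(img, "MZ", 2) != 0) {
        return 0;
    }
    if (memcmp(img + e_lfanew, "PE\0\0", 4) != 0) {
        return 0;
    }
    return machine == 0x8664 && opt_magic == 0x020b;
}
```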



[PATCH] hw/smbios: fix field corruption in type 4 table

2023-02-22 Thread Julia Suvorova
Since table type 4 of SMBIOS version 2.6 is shorter than 3.0, the
strings which follow immediately after the struct fields have been
overwritten by unconditional filling of later fields such as core_count2.
Make these fields dependent on the SMBIOS version.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2169904

Signed-off-by: Julia Suvorova 
---
 hw/smbios/smbios.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/hw/smbios/smbios.c b/hw/smbios/smbios.c
index b4243de735..903fd22350 100644
--- a/hw/smbios/smbios.c
+++ b/hw/smbios/smbios.c
@@ -749,14 +749,16 @@ static void smbios_build_type_4_table(MachineState *ms, 
unsigned instance)
 t->core_count = (ms->smp.cores > 255) ? 0xFF : ms->smp.cores;
 t->core_enabled = t->core_count;
 
-t->core_count2 = t->core_enabled2 = cpu_to_le16(ms->smp.cores);
-
 t->thread_count = (ms->smp.threads > 255) ? 0xFF : ms->smp.threads;
-t->thread_count2 = cpu_to_le16(ms->smp.threads);
 
 t->processor_characteristics = cpu_to_le16(0x02); /* Unknown */
 t->processor_family2 = cpu_to_le16(0x01); /* Other */
 
+if (smbios_ep_type == SMBIOS_ENTRY_POINT_TYPE_64) {
+t->core_count2 = t->core_enabled2 = cpu_to_le16(ms->smp.cores);
+t->thread_count2 = cpu_to_le16(ms->smp.threads);
+}
+
 SMBIOS_BUILD_TABLE_POST;
 smbios_type4_count++;
 }
-- 
2.38.1
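The interplay of the saturating 8-bit count fields and the 64-bit-entry-point-only 16-bit fields can be illustrated standalone (a simplified stand-in, not the SMBIOS code itself):

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of the type-4 logic after the fix: the legacy 8-bit count
 * fields saturate at 0xFF, while the 16-bit *_2 fields carry the real
 * value and are only written for the 64-bit entry point, so a shorter
 * 2.x-sized table is never written past its end. Simplified
 * illustration only. */
static uint8_t smbios_count8(unsigned count)
{
    return count > 255 ? 0xFF : (uint8_t)count;
}
```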




Re: [PATCH v2 03/20] vfio/migration: Add VFIO migration pre-copy support

2023-02-22 Thread Alex Williamson
On Wed, 22 Feb 2023 19:48:58 +0200
Avihai Horon  wrote:

> Pre-copy support allows the VFIO device data to be transferred while the
> VM is running. This helps to accommodate VFIO devices that have a large
> amount of data that needs to be transferred, and it can reduce migration
> downtime.
> 
> Pre-copy support is optional in VFIO migration protocol v2.
> Implement pre-copy of VFIO migration protocol v2 and use it for devices
> that support it. Full description of it can be found here [1].
> 
> [1]
> https://lore.kernel.org/kvm/20221206083438.37807-3-yish...@nvidia.com/
> 
> Signed-off-by: Avihai Horon 
> ---
>  docs/devel/vfio-migration.rst |  35 +--
>  include/hw/vfio/vfio-common.h |   3 +
>  hw/vfio/common.c  |   6 +-
>  hw/vfio/migration.c   | 175 --
>  hw/vfio/trace-events  |   4 +-
>  5 files changed, 201 insertions(+), 22 deletions(-)
> 
> diff --git a/docs/devel/vfio-migration.rst b/docs/devel/vfio-migration.rst
> index c214c73e28..ba80b9150d 100644
> --- a/docs/devel/vfio-migration.rst
> +++ b/docs/devel/vfio-migration.rst
> @@ -7,12 +7,14 @@ the guest is running on source host and restoring this 
> saved state on the
>  destination host. This document details how saving and restoring of VFIO
>  devices is done in QEMU.
>  
> -Migration of VFIO devices currently consists of a single stop-and-copy phase.
> -During the stop-and-copy phase the guest is stopped and the entire VFIO 
> device
> -data is transferred to the destination.
> -
> -The pre-copy phase of migration is currently not supported for VFIO devices.
> -Support for VFIO pre-copy will be added later on.
> +Migration of VFIO devices consists of two phases: the optional pre-copy 
> phase,
> +and the stop-and-copy phase. The pre-copy phase is iterative and allows to
> +accommodate VFIO devices that have a large amount of data that needs to be
> +transferred. The iterative pre-copy phase of migration allows for the guest 
> to
> +continue whilst the VFIO device state is transferred to the destination, this
> +helps to reduce the total downtime of the VM. VFIO devices can choose to skip
> +the pre-copy phase of migration by not reporting the VFIO_MIGRATION_PRE_COPY
> +flag in VFIO_DEVICE_FEATURE_MIGRATION ioctl.

Or alternatively for the last sentence,

  VFIO devices opt-in to pre-copy support by reporting the
  VFIO_MIGRATION_PRE_COPY flag in the VFIO_DEVICE_FEATURE_MIGRATION
  ioctl.


>  Note that currently VFIO migration is supported only for a single device. 
> This
>  is due to VFIO migration's lack of P2P support. However, P2P support is 
> planned
> @@ -29,10 +31,20 @@ VFIO implements the device hooks for the iterative 
> approach as follows:
>  * A ``load_setup`` function that sets the VFIO device on the destination in
>_RESUMING state.
>  
> +* A ``state_pending_estimate`` function that reports an estimate of the
> +  remaining pre-copy data that the vendor driver has yet to save for the VFIO
> +  device.
> +
>  * A ``state_pending_exact`` function that reads pending_bytes from the vendor
>driver, which indicates the amount of data that the vendor driver has yet 
> to
>save for the VFIO device.
>  
> +* An ``is_active_iterate`` function that indicates ``save_live_iterate`` is
> +  active only when the VFIO device is in pre-copy states.
> +
> +* A ``save_live_iterate`` function that reads the VFIO device's data from the
> +  vendor driver during iterative pre-copy phase.
> +
>  * A ``save_state`` function to save the device config space if it is present.
>  
>  * A ``save_live_complete_precopy`` function that sets the VFIO device in
> @@ -95,8 +107,10 @@ Flow of state changes during Live migration
>  ===
>  
>  Below is the flow of state change during live migration.
> -The values in the brackets represent the VM state, the migration state, and
> +The values in the parentheses represent the VM state, the migration state, 
> and
>  the VFIO device state, respectively.
> +The text in the square brackets represents the flow if the VFIO device 
> supports
> +pre-copy.
>  
>  Live migration save path
>  
> @@ -108,11 +122,12 @@ Live migration save path
>|
>   migrate_init spawns migration_thread
>  Migration thread then calls each device's .save_setup()
> -   (RUNNING, _SETUP, _RUNNING)
> +  (RUNNING, _SETUP, _RUNNING [_PRE_COPY])
>|
> -  (RUNNING, _ACTIVE, _RUNNING)
> - If device is active, get pending_bytes by .state_pending_exact()
> +  (RUNNING, _ACTIVE, _RUNNING [_PRE_COPY])
> +  If device is active, get pending_bytes by 
> .state_pending_{estimate,exact}()
>If total pending_bytes >= threshold_size, call .save_live_iterate()
> +  [Data of VFIO device for pre-copy phase is c

Re: [PATCH v2 5/7] target/arm: Inform helpers whether a PAC instruction is 'combined'

2023-02-22 Thread Richard Henderson

On 2/22/23 09:35, Aaron Lindsay wrote:

  static uint64_t pauth_auth(CPUARMState *env, uint64_t ptr, uint64_t modifier,
-   ARMPACKey *key, bool data, int keynumber)
+   ARMPACKey *key, bool data, int keynumber,
+   bool is_combined)


Add 'ra' argument here at the same time, to avoid modifying calls to this function in 
successive patches.


Otherwise,
Reviewed-by: Richard Henderson 


r~



Re: [PATCH v2 00/20] vfio: Add migration pre-copy support and device dirty tracking

2023-02-22 Thread Alex Williamson


There are various errors running this through the CI on gitlab.

This one seems bogus but needs to be resolved regardless:

https://gitlab.com/alex.williamson/qemu/-/jobs/3817940731
FAILED: libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o 
s390x-linux-gnu-gcc -m64 -Ilibqemu-aarch64-softmmu.fa.p -I. -I.. 
-Itarget/arm -I../target/arm -Iqapi -Itrace -Iui -Iui/shader 
-I/usr/include/pixman-1 -I/usr/include/capstone -I/usr/include/glib-2.0 
-I/usr/lib/s390x-linux-gnu/glib-2.0/include -fdiagnostics-color=auto -Wall 
-Winvalid-pch -Werror -std=gnu11 -O2 -g -isystem 
/builds/alex.williamson/qemu/linux-headers -isystem linux-headers -iquote . 
-iquote /builds/alex.williamson/qemu -iquote 
/builds/alex.williamson/qemu/include -iquote 
/builds/alex.williamson/qemu/tcg/s390x -pthread -U_FORTIFY_SOURCE 
-D_FORTIFY_SOURCE=2 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE 
-fno-strict-aliasing -fno-common -fwrapv -Wundef -Wwrite-strings 
-Wmissing-prototypes -Wstrict-prototypes -Wredundant-decls 
-Wold-style-declaration -Wold-style-definition -Wtype-limits -Wformat-security 
-Wformat-y2k -Winit-self -Wignored-qualifiers -Wempty-body -Wnested-externs 
-Wendif-labels -Wexpansion-to-defined -Wimplicit-fallthrough=2 
-Wmissing-format-attribute -Wno-missing-include-dirs -Wno-shift-negative-value 
-Wno-psabi -fstack-protector-strong -fPIE -isystem../linux-headers 
-isystemlinux-headers -DNEED_CPU_H 
'-DCONFIG_TARGET="aarch64-softmmu-config-target.h"' 
'-DCONFIG_DEVICES="aarch64-softmmu-config-devices.h"' -MD -MQ 
libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o -MF 
libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o.d -o 
libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o -c ../hw/vfio/common.c
../hw/vfio/common.c: In function ‘vfio_listener_log_global_start’:
../hw/vfio/common.c:1772:8: error: ‘ret’ may be used uninitialized in this 
function [-Werror=maybe-uninitialized]
 1772 | if (ret) {
      |^

32-bit builds have some actual errors though:

https://gitlab.com/alex.williamson/qemu/-/jobs/3817940719
FAILED: libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o 
cc -m32 -Ilibqemu-aarch64-softmmu.fa.p -I. -I.. -Itarget/arm 
-I../target/arm -Iqapi -Itrace -Iui -Iui/shader -I/usr/include/pixman-1 
-I/usr/include/glib-2.0 -I/usr/lib/glib-2.0/include -I/usr/include/sysprof-4 
-fdiagnostics-color=auto -Wall -Winvalid-pch -Werror -std=gnu11 -O2 -g -isystem 
/builds/alex.williamson/qemu/linux-headers -isystem linux-headers -iquote . 
-iquote /builds/alex.williamson/qemu -iquote 
/builds/alex.williamson/qemu/include -iquote 
/builds/alex.williamson/qemu/tcg/i386 -pthread -U_FORTIFY_SOURCE 
-D_FORTIFY_SOURCE=2 -D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE 
-fno-strict-aliasing -fno-common -fwrapv -Wundef -Wwrite-strings 
-Wmissing-prototypes -Wstrict-prototypes -Wredundant-decls 
-Wold-style-declaration -Wold-style-definition -Wtype-limits -Wformat-security 
-Wformat-y2k -Winit-self -Wignored-qualifiers -Wempty-body -Wnested-externs 
-Wendif-labels -Wexpansion-to-defined -Wimplicit-fallthrough=2 
-Wmissing-format-attribute -Wno-missing-include-dirs -Wno-shift-negative-value 
-Wno-psabi -fstack-protector-strong -fPIE -isystem../linux-headers 
-isystemlinux-headers -DNEED_CPU_H 
'-DCONFIG_TARGET="aarch64-softmmu-config-target.h"' 
'-DCONFIG_DEVICES="aarch64-softmmu-config-devices.h"' -MD -MQ 
libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o -MF 
libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o.d -o 
libqemu-aarch64-softmmu.fa.p/hw_vfio_common.c.o -c ../hw/vfio/common.c
../hw/vfio/common.c: In function 
'vfio_device_feature_dma_logging_start_create':
../hw/vfio/common.c:1572:27: error: cast from pointer to integer of 
different size [-Werror=pointer-to-int-cast]
 1572 | control->ranges = (uint64_t)ranges;
      |   ^
../hw/vfio/common.c:1596:23: error: cast from pointer to integer of 
different size [-Werror=pointer-to-int-cast]
 1596 | control->ranges = (uint64_t)ranges;
      |   ^
../hw/vfio/common.c: In function 
'vfio_device_feature_dma_logging_start_destroy':
../hw/vfio/common.c:1620:9: error: cast to pointer from integer of 
different size [-Werror=int-to-pointer-cast]
 1620 | (struct vfio_device_feature_dma_logging_range 
*)control->ranges;
      | ^
../hw/vfio/common.c: In function 'vfio_device_dma_logging_report':
../hw/vfio/common.c:1810:22: error: cast from pointer to integer of 
different size [-Werror=pointer-to-int-cast]
 1810 | report->bitmap = (uint64_t)bitmap;
      |  ^

Thanks,
Alex
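One portable way to resolve the pointer/integer cast errors quoted above is to round-trip through uintptr_t; a minimal sketch with an illustrative stand-in for the vfio_device_feature structs:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-in for a UAPI struct that carries a userspace
 * pointer in a fixed-width 64-bit field. Casting through uintptr_t
 * avoids the pointer-to-int size mismatch on 32-bit hosts. */
struct demo_control {
    uint64_t ranges;
};

static void demo_set_ranges(struct demo_control *c, void *ranges)
{
    c->ranges = (uintptr_t)ranges; /* widen via uintptr_t, not a raw cast */
}

static void *demo_get_ranges(const struct demo_control *c)
{
    return (void *)(uintptr_t)c->ranges; /* narrow the same way */
}
```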




Re: [PATCH v2 4/7] target/arm: Implement v8.3 Pauth2

2023-02-22 Thread Richard Henderson

On 2/22/23 09:35, Aaron Lindsay wrote:

+result = ((ptr ^ pac) & xor_mask) | (ptr & ~xor_mask);


Simplifies to

result = ptr ^ (pac & xor_mask);

which, IMO is also clearer.
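The equivalence is easy to convince oneself of: inside the mask both forms yield ptr ^ pac, and outside it pac contributes nothing. A quick standalone check (illustrative helper names, not the target/arm code itself):

```c
#include <assert.h>
#include <stdint.h>

/* Both forms splice the PAC bits into ptr under the mask. */
static uint64_t combine_long(uint64_t ptr, uint64_t pac, uint64_t mask)
{
    return ((ptr ^ pac) & mask) | (ptr & ~mask);
}

static uint64_t combine_short(uint64_t ptr, uint64_t pac, uint64_t mask)
{
    return ptr ^ (pac & mask);
}
```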

Otherwise,
Reviewed-by: Richard Henderson 


r~



Re: [PATCH v3 1/1] vhost-user-fs: add migration type property

2023-02-22 Thread Anton Kuchin

On 22/02/2023 22:21, Michael S. Tsirkin wrote:

On Wed, Feb 22, 2023 at 08:25:19PM +0200, Anton Kuchin wrote:

On 22/02/2023 19:12, Michael S. Tsirkin wrote:

On Wed, Feb 22, 2023 at 07:05:47PM +0200, Anton Kuchin wrote:

On 22/02/2023 18:51, Michael S. Tsirkin wrote:

On Wed, Feb 22, 2023 at 06:49:10PM +0200, Anton Kuchin wrote:

On 22/02/2023 17:14, Vladimir Sementsov-Ogievskiy wrote:

On 22.02.23 17:25, Anton Kuchin wrote:

+static int vhost_user_fs_pre_save(void *opaque)
+{
+    VHostUserFS *fs = opaque;
+    g_autofree char *path = object_get_canonical_path(OBJECT(fs));
+
+    switch (fs->migration_type) {
+    case VHOST_USER_MIGRATION_TYPE_NONE:
+    error_report("Migration is blocked by device %s", path);
+    break;
+    case VHOST_USER_MIGRATION_TYPE_EXTERNAL:
+    return 0;
+    default:
+    error_report("Migration type '%s' is not supported by device %s",
+ VhostUserMigrationType_str(fs->migration_type), path);
+    break;
+    }
+
+    return -1;
+}

Should we also add this as .pre_load, to force user select
correct migration_type on target too?

In fact, I would claim we only want pre_load.
When qemu is started on destination we know where it's migrated
from so this flag can be set.
When qemu is started on source we generally do not yet know so
we don't know whether it's safe to set this flag.

But destination is a "source" for next migration, so there shouldn't be
real difference.
The new property has ".realized_set_allowed = true", so, as I understand
it may be changed at any time, so that's not a problem.

Yes, exactly. So destination's property sets not how it will handle this
incoming
migration but the future outgoing one.

How do you know where you are going to migrate though?
I think you don't.
Setting it on source is better since we know where we
are migrating from.

Yes, I don't know where I'm going to migrate to. This is why property
affects only how source saves state on outgoing migration.

Um. I don't get the logic.

For this feature to work we need orchestrator to manage the migration. And we
generally assume that it is responsibility of orchestrator to ensure matching
properties on source and destination.
As orchestrator manages both sides of migration it can set option (and we can
check it) on either source or destination. Now it's not important which side we
select, because now the option is essentially binary allow/deny (but IMHO it is
much better to refuse source to migrate than find later that state can't be
loaded by destination, in case of file migration this becomes especially
painful).

But there are plans to add internal migration option (extract FUSE state from
backend and transfer it in QEMU migration stream), and that's where
setting/checking on source becomes important because it will rely on this
property to decide if extra state from backend needs to be put in the migration
stream subsection.


If we do internal migration that will be a different property
which has to match on source *and* destination.


I'm not sure we need another property. The initial idea was to allow the
orchestrator to set up which part of the state qemu should put into the
stream, sufficient to restore the VM on the destination.
But this depends on how external migration will be implemented.





If you are concerned about the orchestrator breaking the assumption of
matching properties on source and destination, that is not really
supported AFAIK, but I don't think we need to punish it for this; maybe
it has its reasons. I can imagine a scenario where the orchestrator could
want to migrate from a source with 'migration=external' to a destination
with 'migration=none' to ensure that the destination can't be migrated
further.

No. I am concerned about a simple practical matter:
- I decide to restart qemu on the same host, so I need to enable
  migration
- Later I decide to migrate qemu to another host; this should be
  blocked

A property on the source does not satisfy both at the same time.
A property on the destination does.


If the destination QEMUs on the local and remote hosts have the same
properties, how can we write a check that passes on the same host and
fails on the remote one?
Sorry, I don't understand how qemu can help handle this. It knows nothing
about the hosts, so it is the responsibility of the management software
to know where it can migrate and to configure it appropriately.

Maybe I didn't understand your scenario or what you propose to check on
the destination. Could you explain a bit more?








This property selects whether the VM can migrate and, if it can, what
qemu should put into the migration stream. So we select on the source
what type of migration is allowed for this VM; the destination can't
check anything at load time.

OK, so the new field "migration" regulates only outgoing migration and
does nothing for incoming. On incoming migration the migration stream
itself defines the type of device migration.
Worth mentioning in the doc?

Good point. I don't think this deserves a respin, but if I have to send
v4 I'll include a clarification in it.

Re: [PATCH v2 3/7] target/arm: Implement v8.3 EnhancedPAC

2023-02-22 Thread Richard Henderson

On 2/22/23 09:35, Aaron Lindsay wrote:

+if (cpu_isar_feature(aa64_pauth_epac, env_archcpu(env))) {


It might be cleaner, especially later, to have

ARMCPU *cpu = env_archcpu(env);

at the top of the function.


r~



Re: [PATCH v2 3/7] target/arm: Implement v8.3 EnhancedPAC

2023-02-22 Thread Richard Henderson

On 2/22/23 09:35, Aaron Lindsay wrote:

Signed-off-by: Aaron Lindsay
Reviewed-by: Peter Maydell
---
  target/arm/pauth_helper.c | 14 +-
  1 file changed, 9 insertions(+), 5 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH v2 2/7] target/arm: Implement v8.3 QARMA3 PAC cipher

2023-02-22 Thread Richard Henderson

On 2/22/23 09:35, Aaron Lindsay wrote:

-workingval = pac_sub(workingval);
+if (isqarma3)
+workingval = pac_sub1(workingval);
+else
+workingval = pac_sub(workingval);


Braces required for all if+else.  Multiple instances.
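For reference, QEMU's coding style requires braces even around single-statement bodies, so the hunk would end up looking something like this (pac_sub/pac_sub1 here are hypothetical stand-ins, just enough to make the shape concrete and compilable):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the two substitution layers, purely to
 * illustrate the required brace style. */
static uint64_t pac_sub(uint64_t v)  { return v ^ 0x1111; }
static uint64_t pac_sub1(uint64_t v) { return v ^ 0x2222; }

static uint64_t apply_sub(uint64_t workingval, bool isqarma3)
{
    /* Braces on every if/else branch, even single statements. */
    if (isqarma3) {
        workingval = pac_sub1(workingval);
    } else {
        workingval = pac_sub(workingval);
    }
    return workingval;
}
```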

Otherwise,
Reviewed-by: Richard Henderson 


r~



Re: [PATCH v2 1/7] target/arm: v8.3 PAC ID_AA64ISAR[12] feature-detection

2023-02-22 Thread Richard Henderson

On 2/22/23 09:35, Aaron Lindsay wrote:

+static inline bool isar_feature_aa64_pauth_arch_qarma3(const ARMISARegisters 
*id)
+{
+/*
+ * Return true if pauth is enabled with the architected QARMA3 algorithm.
+ * QEMU will always set APA3+GPA3 to the same value.
+ */


This language isn't quite right, since GPA3 only defines values 0 and 1.
Perhaps "to the same result"?


+static inline uint8_t isar_feature_pauth_get_features(const ARMISARegisters 
*id)


'int' is a better generic result, as 'uint8_t' is 'unsigned char' to the debugger and 
generally printed as such.



+if (isar_feature_aa64_pauth_arch_qarma5(id))
+return FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, APA);
+else if (isar_feature_aa64_pauth_arch_qarma3(id))
+return FIELD_EX64(id->id_aa64isar2, ID_AA64ISAR2, APA3);
+else
+return FIELD_EX64(id->id_aa64isar1, ID_AA64ISAR1, API);


Braces with if+else, always.

That said, exactly one of these fields is allowed to be non-zero, so we can just 
unconditionally OR them all together.
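A minimal sketch of that suggestion, assuming the architected field offsets (APA = ISAR1[7:4], API = ISAR1[11:8], APA3 = ISAR2[15:12]) and using a local stand-in for QEMU's extract64() helper:

```c
#include <assert.h>
#include <stdint.h>

/* Since the architecture allows at most one of APA/API/APA3 to be
 * non-zero, the three conditional reads collapse into a single
 * unconditional OR. */
static uint64_t extract64(uint64_t value, int start, int length)
{
    return (value >> start) & (~0ULL >> (64 - length));
}

static int pauth_features(uint64_t id_aa64isar1, uint64_t id_aa64isar2)
{
    return extract64(id_aa64isar1, 4, 4)    /* APA  */
         | extract64(id_aa64isar1, 8, 4)    /* API  */
         | extract64(id_aa64isar2, 12, 4);  /* APA3 */
}
```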



+static inline bool isar_feature_aa64_pauth_epac(const ARMISARegisters *id)
+{
+/*
+ * Note that unlike most AArch64 features, EPAC is treated (in the ARM
+ * pseudocode, at least) as not being implemented by larger values of this
+ * field. Our usage of '>=' rather than '==' here causes our implementation
+ * of PAC logic to diverge slightly from ARM pseudocode.
+ */


I find this comment scary -- "diverge slightly"?

All I need is one sentence to indicate how this is mitigated (by testing pauth2 first 
where required?), or "See function_foo" (where there is more commentary), or something.



diff --git a/target/arm/helper.c b/target/arm/helper.c
index 72b37b7cf1..448ebf8301 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -8028,11 +8028,11 @@ void register_cp_regs_for_features(ARMCPU *cpu)
.access = PL1_R, .type = ARM_CP_CONST,
.accessfn = access_aa64_tid3,
.resetvalue = cpu->isar.id_aa64isar1 },
-{ .name = "ID_AA64ISAR2_EL1_RESERVED", .state = ARM_CP_STATE_AA64,
+{ .name = "ID_AA64ISAR2_EL1", .state = ARM_CP_STATE_AA64,
.opc0 = 3, .opc1 = 0, .crn = 0, .crm = 6, .opc2 = 2,
.access = PL1_R, .type = ARM_CP_CONST,
.accessfn = access_aa64_tid3,
-  .resetvalue = 0 },
+  .resetvalue = cpu->isar.id_aa64isar2 },


All the code adding aa64isar2 should be a separate patch.

You've missed initializing it in kvm_arm_get_host_cpu_features and 
hvf_arm_get_host_cpu_features.



r~



Re: [PATCH v4 3/9] hw/i386/pc_q35: Reuse machine parameter

2023-02-22 Thread Michael S. Tsirkin
On Wed, Feb 22, 2023 at 06:52:02PM +0100, Bernhard Beschow wrote:
> Am 22. Februar 2023 11:03:38 UTC schrieb "Philippe Mathieu-Daudé"
> :
> >On 13/2/23 17:19, Bernhard Beschow wrote:
> >> Signed-off-by: Bernhard Beschow 
> >> Reviewed-by: Thomas Huth 
> >> ---
> >>   hw/i386/pc_q35.c | 2 +-
> >>   1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
> >> index 66cd718b70..dee2b38474 100644
> >> --- a/hw/i386/pc_q35.c
> >> +++ b/hw/i386/pc_q35.c
> >> @@ -218,7 +218,7 @@ static void pc_q35_init(MachineState *machine)
> >>   pc_memory_init(pcms, get_system_memory(), rom_memory, &ram_memory,
> >>  pci_hole64_size);
> >>   -object_property_add_child(qdev_get_machine(), "q35", 
> >> OBJECT(q35_host));
> >> +object_property_add_child(OBJECT(machine), "q35", OBJECT(q35_host));
> >>   object_property_set_link(OBJECT(q35_host), MCH_HOST_PROP_RAM_MEM,
> >>OBJECT(ram_memory), NULL);
> >>   object_property_set_link(OBJECT(q35_host), MCH_HOST_PROP_PCI_MEM,
> >
> >Reviewed-by: Philippe Mathieu-Daudé 
> >
> >Long term we should duplicate/extract Q35MachineState from
> >PCMachineState and add a Q35PCIHost field, then use object_initialize_child; 
> >removing this object_property_add_child()
> >call.
> 
> The Q35 and PC machines duplicate a lot of code indeed. So I was
> thinking more along the lines of consolidating with pc_piix ;)

The reason is that we are trying to limit changes to pc_piix and
focus development on Q35.

> The
> idea would be to get a sneak preview of a configuration-driven future
> where the PCI host bridges (Q35 or I440FX) are dynamically
> instantiated based on configuration data. They would also be
> configured through their QOM interfaces only.
> 
> I've submitted a series where the Q35 host bridge gets a QOM cleanup
> [1] and I've got a series locally resolving i440fx_init(). Both series
> combined bring these two device models close together regarding their
> QOM interface. I've not submitted the i440fx series yet since it is
> blocked by this series.
> 
> One further step for pc_q35 and pc_piix consolidation would be to
> factor ICH9 PCI devices (not functions!) into self-contained models,
> like is underway with PIIX(3). I've started with ICH9 cleanup already
> [2] and I'm waiting for the PIIX consolidation to land in order to be
> able to make more progress here.
> 
> Note that pc_q35 and pc_piix consolidation is just an idea for now
> which could also turn out to be a bad one. If the two machines just
> ended up sharing more code that could IMO be considered a success as
> well.
> 
> Best regards,
> Bernhard
> 
> [1] https://patchew.org/QEMU/20230214131441.101760-1-shen...@gmail.com/
> [2] https://patchew.org/QEMU/20230213173033.98762-1-shen...@gmail.com/




Re: [PATCH 5/5] hw: Remove mentions of NDEBUG

2023-02-22 Thread Michael S. Tsirkin
On Wed, Feb 22, 2023 at 08:43:35AM -1000, Richard Henderson wrote:
> On 2/22/23 06:28, Michael S. Tsirkin wrote:
> > On Wed, Feb 22, 2023 at 05:11:36PM +0100, Philippe Mathieu-Daudé wrote:
> > > On 22/2/23 13:05, Michael S. Tsirkin wrote:
> > > > On Wed, Feb 22, 2023 at 12:25:20AM +0100, Philippe Mathieu-Daudé wrote:
> > > > > Since commit 262a69f428 ("osdep.h: Prohibit disabling
> > > > > assert() in supported builds") 'NDEBUG' can not be defined.
> > > > > 
> > > > > Signed-off-by: Philippe Mathieu-Daudé 
> > > > 
> > > > this exactly says NDEBUG is not allowed. why are you removing this?
> > > 
> > > The project can not be built with NDEBUG. There is no point in
> > > mentioning it in each individual function.
> > 
> > the reason we mention it is because there are security implications
> > if we don't.
> 
> Yes.  However that's not what the text being removed suggests:
> 
> > > > > - * This is just one thing (there are probably more) that must be
> > > > > - * fixed before we can allow NDEBUG compilation.
> 
> This suggests that we *will* allow NDEBUG, once a few things are fixed.
> 
> I strongly approve of this text being removed.
> 
> 
> r~


OK I think it's a good idea to replace it with something like

/* Note: Do not remove this assertion, doing so will break qemu security! */

-- 
MST




Re: [PATCH v3 1/1] vhost-user-fs: add migration type property

2023-02-22 Thread Michael S. Tsirkin
On Wed, Feb 22, 2023 at 08:25:19PM +0200, Anton Kuchin wrote:
> On 22/02/2023 19:12, Michael S. Tsirkin wrote:
> > On Wed, Feb 22, 2023 at 07:05:47PM +0200, Anton Kuchin wrote:
> > > On 22/02/2023 18:51, Michael S. Tsirkin wrote:
> > > > On Wed, Feb 22, 2023 at 06:49:10PM +0200, Anton Kuchin wrote:
> > > > > On 22/02/2023 17:14, Vladimir Sementsov-Ogievskiy wrote:
> > > > > > On 22.02.23 17:25, Anton Kuchin wrote:
> > > > > > > > > > +static int vhost_user_fs_pre_save(void *opaque)
> > > > > > > > > > +{
> > > > > > > > > > +    VHostUserFS *fs = opaque;
> > > > > > > > > > +    g_autofree char *path = 
> > > > > > > > > > object_get_canonical_path(OBJECT(fs));
> > > > > > > > > > +
> > > > > > > > > > +    switch (fs->migration_type) {
> > > > > > > > > > +    case VHOST_USER_MIGRATION_TYPE_NONE:
> > > > > > > > > > +    error_report("Migration is blocked by device %s", 
> > > > > > > > > > path);
> > > > > > > > > > +    break;
> > > > > > > > > > +    case VHOST_USER_MIGRATION_TYPE_EXTERNAL:
> > > > > > > > > > +    return 0;
> > > > > > > > > > +    default:
> > > > > > > > > > +    error_report("Migration type '%s' is not
> > > > > > > > > > supported by device %s",
> > > > > > > > > > + VhostUserMigrationType_str(fs->migration_type), path);
> > > > > > > > > > +    break;
> > > > > > > > > > +    }
> > > > > > > > > > +
> > > > > > > > > > +    return -1;
> > > > > > > > > > +}
> > > > > > > > > Should we also add this as .pre_load, to force user select
> > > > > > > > > correct migration_type on target too?
> > > > > > > > In fact, I would claim we only want pre_load.
> > > > > > > > When qemu is started on destination we know where it's migrated
> > > > > > > > from so this flag can be set.
> > > > > > > > When qemu is started on source we generally do not yet know so
> > > > > > > > we don't know whether it's safe to set this flag.
> > > > > > But destination is a "source" for next migration, so there 
> > > > > > shouldn't be
> > > > > > real difference.
> > > > > > The new property has ".realized_set_allowed = true", so, as I 
> > > > > > understand
> > > > > > it may be changed at any time, so that's not a problem.
> > > > > Yes, exactly. So destination's property sets not how it will handle 
> > > > > this
> > > > > incoming
> > > > > migration but the future outgoing one.
> > > > How do you know where you are going to migrate though?
> > > > I think you don't.
> > > > Setting it on source is better since we know where we
> > > > are migrating from.
> > > Yes, I don't know where I'm going to migrate to. This is why property
> > > affects only how source saves state on outgoing migration.
> > Um. I don't get the logic.
> 
> For this feature to work we need orchestrator to manage the migration. And
> we
> generally assume that it is responsibility of orchestrator to ensure
> matching
> properties on source and destination.
> As orchestrator manages both sides of migration it can set option (and we
> can
> check it) on either source or destination. Now it's not important which side
> we
> select, because now the option is essentially binary allow/deny (but IMHO it
> is much better to refuse source to migrate than find later that state can't
> be
> loaded by destination, in case of file migration this becomes especially
> painful).
> 
> But there are plans to add internal migration option (extract FUSE state
> from
> backend and transfer it in QEMU migration stream), and that's where
> setting/checking
> on source becomes important because it will rely on this property to decide
> if
> extra state from the backend needs to be put in the migration stream subsection.


If we do internal migration that will be a different property
which has to match on source *and* destination.


> If you are concerned about orchestrator breaking assumption of matching
> properties
> on source and destination this is not really supported AFAIK but I don't
> think we
> need to punish it for this, maybe it has its reasons: I can imagine scenario
> where orchestrator could want to migrate from source with
> 'migration=external'
> to destination with 'migration=none' to ensure that destination can't be
> migrated further.

No. I am concerned about a simple practical matter:
- I decide to restart qemu on the same host - so I need to enable
  migration
- Later I decide to migrate qemu to another host - this should be
  blocked


Property on source does not satisfy both at the same time.
Property on destination does.



> > 
> > 
> > > > > > > This property selects if VM can migrate and if it can what should
> > > > > > > qemu put
> > > > > > > to the migration stream. So we select on source what type of
> > > > > > > migration is
> > > > > > > allowed for this VM, destination can't check anything at load 
> > > > > > > time.
> > > > > > OK, so the new field "migration" regulates only outgoing migration 
> > > > > > and
> > > > > > do nothing for incoming. On incoming migration the migration stream
> > > > >

Re: [PATCH v6 0/3] block/rbd: Add support for layered encryption

2023-02-22 Thread Ilya Dryomov
On Mon, Jan 30, 2023 at 2:16 PM Ilya Dryomov  wrote:
>
> On Sun, Jan 29, 2023 at 12:31 PM o...@il.ibm.com
>  wrote:
> >
> > v6: nit fixes
> > v5: nit fixes
> > v4: split to multiple commits
> > add support for more than just luks-any in layered encryption
> > nit fixes
> > v3: further nit fixes suggested by @idryomov
> > v2: nit fixes suggested by @idryomov
> >
> > Or Ozeri (3):
> >   block/rbd: Remove redundant stack variable passphrase_len
> >   block/rbd: Add luks-any encryption opening option
> >   block/rbd: Add support for layered encryption
> >
> >  block/rbd.c  | 188 ---
> >  qapi/block-core.json |  31 ++-
> >  2 files changed, 205 insertions(+), 14 deletions(-)
> >
> > --
> > 2.25.1
> >
>
> Reviewed-by: Ilya Dryomov 

Hi Kevin, Hanna,

What is the status of this set?  I see it on patchew and also see that
my review got picked up but it's not clear whether there is something
else to do here:

https://patchew.org/QEMU/20230129113120.722708-1-...@oro.sl.cloud9.ibm.com/

I'm CCing Daniel who commented on previous postings of this set in case
an additional review is needed.

Thanks,

Ilya



Re: [PATCH v1 3/3] contrib/elf2dmp: add PE name check and Windows Server 2022 support

2023-02-22 Thread Viktor Prutyanov
On Wed, Feb 22, 2023 at 10:07 PM Annie.li  wrote:
>
>
> On 11/29/2022 7:03 PM, Viktor Prutyanov wrote:
> > Since its inception elf2dmp has checked MZ signatures within an
> > address space above IDT[0] interrupt vector and took first PE image
> > found as Windows Kernel.
> > But in Windows Server 2022 memory dump this address space range is
> > full of invalid PE fragments and the tool must check that PE image
> > is 'ntoskrnl.exe' actually.
> > So, introduce additional validation by checking image name from
> > Export Directory against 'ntoskrnl.exe'.
> >
> > Signed-off-by: Viktor Prutyanov 
> > Tested-by: Yuri Benditovich 
> > ---
> >   contrib/elf2dmp/main.c | 28 ++--
> >   contrib/elf2dmp/pe.h   | 15 +++
> >   2 files changed, 41 insertions(+), 2 deletions(-)
> >
> > diff --git a/contrib/elf2dmp/main.c b/contrib/elf2dmp/main.c
> > index f3052b3c64..f7de82a03e 100644
> > --- a/contrib/elf2dmp/main.c
> > +++ b/contrib/elf2dmp/main.c
> > @@ -17,6 +17,7 @@
> >
> >   #define SYM_URL_BASE"https://msdl.microsoft.com/download/symbols/";
> >   #define PDB_NAME"ntkrnlmp.pdb"
> > +#define PE_NAME "ntoskrnl.exe"
>
> As what has been clarified earlier in the meeting, this elf2dmp is only for
> 64-bit systems, so the name "ntoskrnl.exe" suffices here. Otherwise, it
> won't work for 32-bit PAE systems.
>
> A question about elf2dmp on ARM platform, has it been validated there?
>
> Thanks
>
> Annie
>
> >
> >   #define INITIAL_MXCSR   0x1f80
> >
> > @@ -400,6 +401,25 @@ static int write_dump(struct pa_space *ps,
> >   return fclose(dmp_file);
> >   }
> >
> > +static bool pe_check_export_name(uint64_t base, void *start_addr,
> > +struct va_space *vs)
> > +{
> > +IMAGE_EXPORT_DIRECTORY export_dir;
> > +const char *pe_name;
> > +
> > +if (pe_get_data_dir_entry(base, start_addr, 
> > IMAGE_FILE_EXPORT_DIRECTORY,
> > +&export_dir, sizeof(export_dir), vs)) {
> > +return false;
> > +}
> > +
> > +pe_name = va_space_resolve(vs, base + export_dir.Name);
> > +if (!pe_name) {
> > +return false;
> > +}
> > +
> > +return !strcmp(pe_name, PE_NAME);
> > +}
> > +
> >   static int pe_get_pdb_symstore_hash(uint64_t base, void *start_addr,
> >   char *hash, struct va_space *vs)
> >   {
> > @@ -484,6 +504,7 @@ int main(int argc, char *argv[])
> >   uint64_t KdDebuggerDataBlock;
> >   KDDEBUGGER_DATA64 *kdbg;
> >   uint64_t KdVersionBlock;
> > +bool kernel_found = false;
> >
> >   if (argc != 3) {
> >   eprintf("usage:\n\t%s elf_file dmp_file\n", argv[0]);
> > @@ -531,11 +552,14 @@ int main(int argc, char *argv[])
> >   }
> >
> >   if (*(uint16_t *)nt_start_addr == 0x5a4d) { /* MZ */
> > -break;
> > +if (pe_check_export_name(KernBase, nt_start_addr, &vs)) {
> > +kernel_found = true;
> > +break;
> > +}
> >   }
> >   }
> >
> > -if (!nt_start_addr) {
> > +if (!kernel_found) {
> >   eprintf("Failed to find NT kernel image\n");
> >   err = 1;
> >   goto out_ps;
> > diff --git a/contrib/elf2dmp/pe.h b/contrib/elf2dmp/pe.h
> > index 807d006364..71126af1ac 100644
> > --- a/contrib/elf2dmp/pe.h
> > +++ b/contrib/elf2dmp/pe.h
> > @@ -88,6 +88,20 @@ typedef struct IMAGE_NT_HEADERS64 {
> >   IMAGE_OPTIONAL_HEADER64 OptionalHeader;
> >   } __attribute__ ((packed)) IMAGE_NT_HEADERS64;
> >
> > +typedef struct IMAGE_EXPORT_DIRECTORY {
> > +uint32_tCharacteristics;
> > +uint32_tTimeDateStamp;
> > +uint16_tMajorVersion;
> > +uint16_tMinorVersion;
> > +uint32_tName;
> > +uint32_tBase;
> > +uint32_tNumberOfFunctions;
> > +uint32_tNumberOfNames;
> > +uint32_tAddressOfFunctions;
> > +uint32_tAddressOfNames;
> > +uint32_tAddressOfNameOrdinals;
> > +} __attribute__ ((packed)) IMAGE_EXPORT_DIRECTORY;
> > +
> >   typedef struct IMAGE_DEBUG_DIRECTORY {
> >   uint32_t Characteristics;
> >   uint32_t TimeDateStamp;
> > @@ -102,6 +116,7 @@ typedef struct IMAGE_DEBUG_DIRECTORY {
> >   #define IMAGE_DEBUG_TYPE_CODEVIEW   2
> >   #endif
> >
> > +#define IMAGE_FILE_EXPORT_DIRECTORY 0
> >   #define IMAGE_FILE_DEBUG_DIRECTORY  6
> >
> >   typedef struct guid_t {

Hi Annie,

Thank you for the review!
At the moment, elf2dmp only addresses the x86_64 platform.

Best regards,
Viktor Prutyanov



[PATCH v2 6/7] target/arm: Implement v8.3 FPAC and FPACCOMBINE

2023-02-22 Thread Aaron Lindsay
Signed-off-by: Aaron Lindsay 
---
 target/arm/pauth_helper.c | 35 ++-
 target/arm/syndrome.h |  7 +++
 2 files changed, 37 insertions(+), 5 deletions(-)

diff --git a/target/arm/pauth_helper.c b/target/arm/pauth_helper.c
index 96770d7860..db6cf9b5bc 100644
--- a/target/arm/pauth_helper.c
+++ b/target/arm/pauth_helper.c
@@ -388,9 +388,24 @@ static uint64_t pauth_original_ptr(uint64_t ptr, 
ARMVAParameters param)
 return deposit64(ptr, bot_pac_bit, top_pac_bit - bot_pac_bit, extfield);
 }
 
+static G_NORETURN
+void pauth_fail_exception(CPUARMState *env, bool data, int keynumber, 
uintptr_t ra)
+{
+int target_el = arm_current_el(env);
+if (target_el == 0) {
+uint64_t hcr = arm_hcr_el2_eff(env);
+if (arm_is_el2_enabled(env) && (hcr & HCR_TGE))
+target_el = 2;
+else
+target_el = 1;
+}
+
+raise_exception_ra(env, EXCP_UDEF, syn_pacfail(data, keynumber), 
target_el, ra);
+}
+
 static uint64_t pauth_auth(CPUARMState *env, uint64_t ptr, uint64_t modifier,
ARMPACKey *key, bool data, int keynumber,
-   bool is_combined)
+   uintptr_t ra, bool is_combined)
 {
 ARMMMUIdx mmu_idx = arm_stage1_mmu_idx(env);
 ARMVAParameters param = aa64_va_parameters(env, ptr, mmu_idx, data);
@@ -406,6 +421,16 @@ static uint64_t pauth_auth(CPUARMState *env, uint64_t ptr, 
uint64_t modifier,
 uint64_t xor_mask = MAKE_64BIT_MASK(bot_bit, top_bit - bot_bit + 1) &
 ~MAKE_64BIT_MASK(55, 1);
 result = ((ptr ^ pac) & xor_mask) | (ptr & ~xor_mask);
+if (cpu_isar_feature(aa64_fpac_combine, env_archcpu(env)) ||
+(cpu_isar_feature(aa64_fpac, env_archcpu(env)) &&
+ !is_combined)) {
+int fpac_top = param.tbi ? 55 : 64;
+uint64_t fpac_mask = MAKE_64BIT_MASK(bot_bit, fpac_top - bot_bit);
+test = (result ^ sextract64(result, 55, 1)) & fpac_mask;
+if (unlikely(test)) {
+pauth_fail_exception(env, data, keynumber, ra);
+}
+}
 } else {
 test = (pac ^ ptr) & ~MAKE_64BIT_MASK(55, 1);
 if (unlikely(extract64(test, bot_bit, top_bit - bot_bit))) {
@@ -519,7 +544,7 @@ static uint64_t pauth_autia(CPUARMState *env, uint64_t x, 
uint64_t y,
 return x;
 }
 pauth_check_trap(env, el, ra);
-return pauth_auth(env, x, y, &env->keys.apia, false, 0, is_combined);
+return pauth_auth(env, x, y, &env->keys.apia, false, 0, ra, is_combined);
 }
 
 uint64_t HELPER(autia)(CPUARMState *env, uint64_t x, uint64_t y)
@@ -540,7 +565,7 @@ static uint64_t pauth_autib(CPUARMState *env, uint64_t x, 
uint64_t y,
 return x;
 }
 pauth_check_trap(env, el, ra);
-return pauth_auth(env, x, y, &env->keys.apib, false, 1, is_combined);
+return pauth_auth(env, x, y, &env->keys.apib, false, 1, ra, is_combined);
 }
 
 uint64_t HELPER(autib)(CPUARMState *env, uint64_t x, uint64_t y)
@@ -561,7 +586,7 @@ static uint64_t pauth_autda(CPUARMState *env, uint64_t x, 
uint64_t y,
 return x;
 }
 pauth_check_trap(env, el, ra);
-return pauth_auth(env, x, y, &env->keys.apda, true, 0, is_combined);
+return pauth_auth(env, x, y, &env->keys.apda, true, 0, ra, is_combined);
 }
 
 uint64_t HELPER(autda)(CPUARMState *env, uint64_t x, uint64_t y)
@@ -582,7 +607,7 @@ static uint64_t pauth_autdb(CPUARMState *env, uint64_t x, 
uint64_t y,
 return x;
 }
 pauth_check_trap(env, el, ra);
-return pauth_auth(env, x, y, &env->keys.apdb, true, 1, is_combined);
+return pauth_auth(env, x, y, &env->keys.apdb, true, 1, ra, is_combined);
 }
 
 uint64_t HELPER(autdb)(CPUARMState *env, uint64_t x, uint64_t y)
diff --git a/target/arm/syndrome.h b/target/arm/syndrome.h
index 73df5e3793..99ed4c7d3d 100644
--- a/target/arm/syndrome.h
+++ b/target/arm/syndrome.h
@@ -48,6 +48,7 @@ enum arm_exception_class {
 EC_AA64_SMC   = 0x17,
 EC_SYSTEMREGISTERTRAP = 0x18,
 EC_SVEACCESSTRAP  = 0x19,
+EC_PACFAIL= 0x1c,
 EC_SMETRAP= 0x1d,
 EC_INSNABORT  = 0x20,
 EC_INSNABORT_SAME_EL  = 0x21,
@@ -221,6 +222,12 @@ static inline uint32_t syn_smetrap(SMEExceptionType etype, 
bool is_16bit)
 | (is_16bit ? 0 : ARM_EL_IL) | etype;
 }
 
+static inline uint32_t syn_pacfail(bool data, int keynumber)
+{
+int error_code = ((data ? 1 : 0) << 1) | (keynumber);
+return (EC_PACFAIL << ARM_EL_EC_SHIFT) | ARM_EL_IL | error_code;
+}
+
 static inline uint32_t syn_pactrap(void)
 {
 return EC_PACTRAP << ARM_EL_EC_SHIFT;
-- 
2.25.1
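As a sanity check, the new syndrome encoding can be modelled standalone (EC_PACFAIL = 0x1c from the hunk above; the EC shift of 26 and the IL bit at 25 match QEMU's existing target/arm definitions):

```c
#include <assert.h>
#include <stdint.h>

/* Standalone model of syn_pacfail(): EC_PACFAIL in the EC field, the
 * IL bit set, and the error code (data << 1) | keynumber in the low
 * bits. */
#define ARM_EL_EC_SHIFT 26
#define ARM_EL_IL       (1u << 25)
#define EC_PACFAIL      0x1c

static uint32_t syn_pacfail(int data, int keynumber)
{
    uint32_t error_code = ((data ? 1u : 0u) << 1) | (uint32_t)keynumber;
    return ((uint32_t)EC_PACFAIL << ARM_EL_EC_SHIFT) | ARM_EL_IL | error_code;
}
```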



