Re: [PATCH v3 2/5] vdpa: add vhost_vdpa_reset_status_fd

2023-05-16 Thread Jason Wang
On Wed, May 17, 2023 at 1:46 PM Eugenio Perez Martin wrote:
>
> On Wed, May 17, 2023 at 5:14 AM Jason Wang  wrote:
> >
> > On Tue, May 9, 2023 at 11:44 PM Eugenio Pérez  wrote:
> > >
> > > This allows resetting a vhost-vdpa device from external subsystems like
> > > vhost-net, since it does not have any struct vhost_dev by the time we
> > > need to use it.
> > >
> > > It is used in subsequent patches to negotiate features
> > > and probe for CVQ ASID isolation.
> > >
> > > Reviewed-by: Stefano Garzarella 
> > > Signed-off-by: Eugenio Pérez 
> > > ---
> > >  include/hw/virtio/vhost-vdpa.h |  1 +
> > >  hw/virtio/vhost-vdpa.c | 58 +++---
> > >  2 files changed, 41 insertions(+), 18 deletions(-)
> > >
> > > diff --git a/include/hw/virtio/vhost-vdpa.h 
> > > b/include/hw/virtio/vhost-vdpa.h
> > > index c278a2a8de..28de7da91e 100644
> > > --- a/include/hw/virtio/vhost-vdpa.h
> > > +++ b/include/hw/virtio/vhost-vdpa.h
> > > @@ -54,6 +54,7 @@ typedef struct vhost_vdpa {
> > >  VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
> > >  } VhostVDPA;
> > >
> > > +void vhost_vdpa_reset_status_fd(int fd);
> > >  int vhost_vdpa_get_iova_range(int fd, struct vhost_vdpa_iova_range 
> > > *iova_range);
> > >
> > >  int vhost_vdpa_dma_map(struct vhost_vdpa *v, uint32_t asid, hwaddr iova,
> > > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> > > index bbabea18f3..7a2053b8d9 100644
> > > --- a/hw/virtio/vhost-vdpa.c
> > > +++ b/hw/virtio/vhost-vdpa.c
> > > @@ -335,38 +335,45 @@ static const MemoryListener 
> > > vhost_vdpa_memory_listener = {
> > >  .region_del = vhost_vdpa_listener_region_del,
> > >  };
> > >
> > > -static int vhost_vdpa_call(struct vhost_dev *dev, unsigned long int 
> > > request,
> > > - void *arg)
> > > +static int vhost_vdpa_dev_fd(const struct vhost_dev *dev)
> > >  {
> > >  struct vhost_vdpa *v = dev->opaque;
> > > -int fd = v->device_fd;
> > > -int ret;
> > >
> > >  assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_VDPA);
> > > +return v->device_fd;
> > > +}
> >
> > Nit: unless the vhost_dev structure is opaque to the upper layer, I
> > don't see any advantage for having a dedicated indirect helper to get
> > device_fd.
> >
>
> The purpose was to not duplicate the assert, but sure it's not mandatory.

Ok, but I think for new code, we'd better avoid assert as much as possible.

>
> > > +
> > > +static int vhost_vdpa_call_fd(int fd, unsigned long int request, void 
> > > *arg)
> > > +{
> > > +int ret = ioctl(fd, request, arg);
> > >
> > > -ret = ioctl(fd, request, arg);
> > >  return ret < 0 ? -errno : ret;
> > >  }
> > >
> > > -static int vhost_vdpa_add_status(struct vhost_dev *dev, uint8_t status)
> > > +static int vhost_vdpa_call(struct vhost_dev *dev, unsigned long int 
> > > request,
> > > +   void *arg)
> > > +{
> > > +return vhost_vdpa_call_fd(vhost_vdpa_dev_fd(dev), request, arg);
> > > +}
> > > +
> > > +static int vhost_vdpa_add_status_fd(int fd, uint8_t status)
> > >  {
> > >  uint8_t s;
> > >  int ret;
> > >
> > > -trace_vhost_vdpa_add_status(dev, status);
> > > -ret = vhost_vdpa_call(dev, VHOST_VDPA_GET_STATUS, &s);
> > > +ret = vhost_vdpa_call_fd(fd, VHOST_VDPA_GET_STATUS, &s);
> > >  if (ret < 0) {
> > >  return ret;
> > >  }
> > >
> > >  s |= status;
> > >
> > > -ret = vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, &s);
> > > +ret = vhost_vdpa_call_fd(fd, VHOST_VDPA_SET_STATUS, &s);
> > >  if (ret < 0) {
> > >  return ret;
> > >  }
> > >
> > > -ret = vhost_vdpa_call(dev, VHOST_VDPA_GET_STATUS, &s);
> > > +ret = vhost_vdpa_call_fd(fd, VHOST_VDPA_GET_STATUS, &s);
> > >  if (ret < 0) {
> > >  return ret;
> > >  }
> > > @@ -378,6 +385,12 @@ static int vhost_vdpa_add_status(struct vhost_dev 
> > > *dev, uint8_t status)
> > >  return 0;
> > >  }
> > >
> > > +static int vhost_vdpa_add_status(struct vhost_dev *dev, uint8_t status)
> > > +{
> > > +trace_vhost_vdpa_add_status(dev, status);
> > > +return vhost_vdpa_add_status_fd(vhost_vdpa_dev_fd(dev), status);
> > > +}
> > > +
> > >  int vhost_vdpa_get_iova_range(int fd, struct vhost_vdpa_iova_range 
> > > *iova_range)
> > >  {
> > >  int ret = ioctl(fd, VHOST_VDPA_GET_IOVA_RANGE, iova_range);
> > > @@ -709,16 +722,20 @@ static int vhost_vdpa_get_device_id(struct 
> > > vhost_dev *dev,
> > >  return ret;
> > >  }
> > >
> > > +static int vhost_vdpa_reset_device_fd(int fd)
> > > +{
> > > +uint8_t status = 0;
> > > +
> > > +return vhost_vdpa_call_fd(fd, VHOST_VDPA_SET_STATUS, &status);
> > > +}
> > > +
> > >  static int vhost_vdpa_reset_device(struct vhost_dev *dev)
> > >  {
> > >  struct vhost_vdpa *v = dev->opaque;
> > > -int ret;
> > > -uint8_t status = 0;
> > >
> > > -ret = vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, &status);
> > > -trace_vhost_vdpa_reset_device(dev);
> > >  

Re: [PATCH v3 2/5] vdpa: add vhost_vdpa_reset_status_fd

2023-05-16 Thread Eugenio Perez Martin
On Wed, May 17, 2023 at 5:14 AM Jason Wang  wrote:
>
> On Tue, May 9, 2023 at 11:44 PM Eugenio Pérez  wrote:
> >
> > This allows resetting a vhost-vdpa device from external subsystems like
> > vhost-net, since it does not have any struct vhost_dev by the time we
> > need to use it.
> >
> > It is used in subsequent patches to negotiate features
> > and probe for CVQ ASID isolation.
> >
> > Reviewed-by: Stefano Garzarella 
> > Signed-off-by: Eugenio Pérez 
> > ---
> >  include/hw/virtio/vhost-vdpa.h |  1 +
> >  hw/virtio/vhost-vdpa.c | 58 +++---
> >  2 files changed, 41 insertions(+), 18 deletions(-)
> >
> > diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> > index c278a2a8de..28de7da91e 100644
> > --- a/include/hw/virtio/vhost-vdpa.h
> > +++ b/include/hw/virtio/vhost-vdpa.h
> > @@ -54,6 +54,7 @@ typedef struct vhost_vdpa {
> >  VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
> >  } VhostVDPA;
> >
> > +void vhost_vdpa_reset_status_fd(int fd);
> >  int vhost_vdpa_get_iova_range(int fd, struct vhost_vdpa_iova_range 
> > *iova_range);
> >
> >  int vhost_vdpa_dma_map(struct vhost_vdpa *v, uint32_t asid, hwaddr iova,
> > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> > index bbabea18f3..7a2053b8d9 100644
> > --- a/hw/virtio/vhost-vdpa.c
> > +++ b/hw/virtio/vhost-vdpa.c
> > @@ -335,38 +335,45 @@ static const MemoryListener 
> > vhost_vdpa_memory_listener = {
> >  .region_del = vhost_vdpa_listener_region_del,
> >  };
> >
> > -static int vhost_vdpa_call(struct vhost_dev *dev, unsigned long int 
> > request,
> > - void *arg)
> > +static int vhost_vdpa_dev_fd(const struct vhost_dev *dev)
> >  {
> >  struct vhost_vdpa *v = dev->opaque;
> > -int fd = v->device_fd;
> > -int ret;
> >
> >  assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_VDPA);
> > +return v->device_fd;
> > +}
>
> Nit: unless the vhost_dev structure is opaque to the upper layer, I
> don't see any advantage for having a dedicated indirect helper to get
> device_fd.
>

The purpose was to not duplicate the assert, but sure it's not mandatory.

> > +
> > +static int vhost_vdpa_call_fd(int fd, unsigned long int request, void *arg)
> > +{
> > +int ret = ioctl(fd, request, arg);
> >
> > -ret = ioctl(fd, request, arg);
> >  return ret < 0 ? -errno : ret;
> >  }
> >
> > -static int vhost_vdpa_add_status(struct vhost_dev *dev, uint8_t status)
> > +static int vhost_vdpa_call(struct vhost_dev *dev, unsigned long int 
> > request,
> > +   void *arg)
> > +{
> > +return vhost_vdpa_call_fd(vhost_vdpa_dev_fd(dev), request, arg);
> > +}
> > +
> > +static int vhost_vdpa_add_status_fd(int fd, uint8_t status)
> >  {
> >  uint8_t s;
> >  int ret;
> >
> > -trace_vhost_vdpa_add_status(dev, status);
> > -ret = vhost_vdpa_call(dev, VHOST_VDPA_GET_STATUS, &s);
> > +ret = vhost_vdpa_call_fd(fd, VHOST_VDPA_GET_STATUS, &s);
> >  if (ret < 0) {
> >  return ret;
> >  }
> >
> >  s |= status;
> >
> > -ret = vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, &s);
> > +ret = vhost_vdpa_call_fd(fd, VHOST_VDPA_SET_STATUS, &s);
> >  if (ret < 0) {
> >  return ret;
> >  }
> >
> > -ret = vhost_vdpa_call(dev, VHOST_VDPA_GET_STATUS, &s);
> > +ret = vhost_vdpa_call_fd(fd, VHOST_VDPA_GET_STATUS, &s);
> >  if (ret < 0) {
> >  return ret;
> >  }
> > @@ -378,6 +385,12 @@ static int vhost_vdpa_add_status(struct vhost_dev 
> > *dev, uint8_t status)
> >  return 0;
> >  }
> >
> > +static int vhost_vdpa_add_status(struct vhost_dev *dev, uint8_t status)
> > +{
> > +trace_vhost_vdpa_add_status(dev, status);
> > +return vhost_vdpa_add_status_fd(vhost_vdpa_dev_fd(dev), status);
> > +}
> > +
> >  int vhost_vdpa_get_iova_range(int fd, struct vhost_vdpa_iova_range 
> > *iova_range)
> >  {
> >  int ret = ioctl(fd, VHOST_VDPA_GET_IOVA_RANGE, iova_range);
> > @@ -709,16 +722,20 @@ static int vhost_vdpa_get_device_id(struct vhost_dev 
> > *dev,
> >  return ret;
> >  }
> >
> > +static int vhost_vdpa_reset_device_fd(int fd)
> > +{
> > +uint8_t status = 0;
> > +
> > +return vhost_vdpa_call_fd(fd, VHOST_VDPA_SET_STATUS, &status);
> > +}
> > +
> >  static int vhost_vdpa_reset_device(struct vhost_dev *dev)
> >  {
> >  struct vhost_vdpa *v = dev->opaque;
> > -int ret;
> > -uint8_t status = 0;
> >
> > -ret = vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, &status);
> > -trace_vhost_vdpa_reset_device(dev);
> >  v->suspended = false;
> > -return ret;
> > +trace_vhost_vdpa_reset_device(dev);
> > +return vhost_vdpa_reset_device_fd(vhost_vdpa_dev_fd(dev));
> >  }
> >
> >  static int vhost_vdpa_get_vq_index(struct vhost_dev *dev, int idx)
> > @@ -1170,6 +1187,13 @@ static int vhost_vdpa_dev_start(struct vhost_dev 
> > *dev, bool started)
> >  return 0;
> >  }
> >
> > +void vhost_vdpa_reset_status_fd(int fd)
> > +{
> > +

Re: [PATCH v2 2/2] vdpa: send CVQ state load commands in parallel

2023-05-16 Thread Jason Wang
On Sat, May 6, 2023 at 10:07 PM Hawkins Jiawei  wrote:
>
> This patch introduces the vhost_vdpa_net_cvq_add() and
> refactors the vhost_vdpa_net_load*(), so that QEMU can
> send CVQ state load commands in parallel.
>
> To be more specific, this patch introduces vhost_vdpa_net_cvq_add()
> to add SVQ control commands to SVQ and kick the device,
> but does not poll the device used buffers. QEMU will not
> poll and check the device used buffers in vhost_vdpa_net_load()
> until all CVQ state load commands have been sent to the device.
>
> What's more, in order to avoid buffer overwriting caused by
> using `svq->cvq_cmd_out_buffer` and `svq->status` as the
> buffer for all CVQ state load commands when sending
> CVQ state load commands in parallel, this patch introduces
> `out_cursor` and `in_cursor` in vhost_vdpa_net_load(),
> pointing to the available buffer for in descriptor and
> out descriptor, so that different CVQ state load commands can
> use their unique buffer.
>
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1578
> Signed-off-by: Hawkins Jiawei 
> ---
>  net/vhost-vdpa.c | 152 +--
>  1 file changed, 120 insertions(+), 32 deletions(-)
>
> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> index 10804c7200..14e31ca5c5 100644
> --- a/net/vhost-vdpa.c
> +++ b/net/vhost-vdpa.c
> @@ -590,6 +590,44 @@ static void vhost_vdpa_net_cvq_stop(NetClientState *nc)
>  vhost_vdpa_net_client_stop(nc);
>  }
>
> +/**
> + * vhost_vdpa_net_cvq_add() adds SVQ control commands to SVQ,
> + * kicks the device but does not poll the device used buffers.
> + *
> + * Return the number of elements added to SVQ if success.
> + */
> +static int vhost_vdpa_net_cvq_add(VhostVDPAState *s,
> +void **out_cursor, size_t out_len,

Can we track things like cursors in e.g VhostVDPAState ?
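For illustration only (this is not part of Hawkins' patch, and the names below
are made up): tracking the cursors in VhostVDPAState could look roughly like
this, reusing the existing s->cvq_cmd_out_buffer and s->status buffers:

```
/* Sketch only.  Hypothetical members that could be added to VhostVDPAState
 * so the cursors do not have to be threaded through every helper:
 *
 *     void *cvq_out_cursor;                next free byte in cvq_cmd_out_buffer
 *     virtio_net_ctrl_ack *cvq_in_cursor;  next free ack slot in status
 */

/* Rewind the cursors before queueing a new batch of CVQ load commands. */
static void vhost_vdpa_net_cvq_rewind_cursors(VhostVDPAState *s)
{
    s->cvq_out_cursor = s->cvq_cmd_out_buffer;
    s->cvq_in_cursor = s->status;
}
```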

> +virtio_net_ctrl_ack **in_cursor, size_t 
> in_len)
> +{
> +/* Buffers for the device */
> +const struct iovec out = {
> +.iov_base = *out_cursor,
> +.iov_len = out_len,
> +};
> +const struct iovec in = {
> +.iov_base = *in_cursor,
> +.iov_len = sizeof(virtio_net_ctrl_ack),
> +};
> +VhostShadowVirtqueue *svq = g_ptr_array_index(s->vhost_vdpa.shadow_vqs, 
> 0);
> +int r;
> +
> +r = vhost_svq_add(svq, &out, 1, &in, 1, NULL);
> +if (unlikely(r != 0)) {
> +if (unlikely(r == -ENOSPC)) {
> +qemu_log_mask(LOG_GUEST_ERROR, "%s: No space on device queue\n",
> +  __func__);
> +}
> +return r;
> +}
> +
> +/* Update the cursor */
> +*out_cursor += out_len;
> +*in_cursor += 1;
> +
> +return 1;
> +}
> +
>  /**
>   * vhost_vdpa_net_cvq_add_and_wait() adds SVQ control commands to SVQ,
>   * kicks the device and polls the device used buffers.
> @@ -628,69 +666,82 @@ static ssize_t 
> vhost_vdpa_net_cvq_add_and_wait(VhostVDPAState *s,
>  return vhost_svq_poll(svq);
>  }
>
> -static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s, uint8_t class,
> -   uint8_t cmd, const void *data,
> -   size_t data_size)
> +
> +/**
> + * vhost_vdpa_net_load_cmd() restores the NIC state through SVQ.
> + *
> + * Return the number of elements added to SVQ if success.
> + */
> +static int vhost_vdpa_net_load_cmd(VhostVDPAState *s,
> +void **out_cursor, uint8_t class, uint8_t 
> cmd,
> +const void *data, size_t data_size,
> +virtio_net_ctrl_ack **in_cursor)
>  {
>  const struct virtio_net_ctrl_hdr ctrl = {
>  .class = class,
>  .cmd = cmd,
>  };
>
> -assert(data_size < vhost_vdpa_net_cvq_cmd_page_len() - sizeof(ctrl));
> +assert(sizeof(ctrl) < vhost_vdpa_net_cvq_cmd_page_len() -
> +  (*out_cursor - s->cvq_cmd_out_buffer));
> +assert(data_size < vhost_vdpa_net_cvq_cmd_page_len() - sizeof(ctrl) -
> +   (*out_cursor - s->cvq_cmd_out_buffer));
>
> -memcpy(s->cvq_cmd_out_buffer, &ctrl, sizeof(ctrl));
> -memcpy(s->cvq_cmd_out_buffer + sizeof(ctrl), data, data_size);
> +memcpy(*out_cursor, &ctrl, sizeof(ctrl));
> +memcpy(*out_cursor + sizeof(ctrl), data, data_size);
>
> -return vhost_vdpa_net_cvq_add_and_wait(s, sizeof(ctrl) + data_size,
> -  sizeof(virtio_net_ctrl_ack));
> +return vhost_vdpa_net_cvq_add(s, out_cursor, sizeof(ctrl) + data_size,
> +  in_cursor, sizeof(virtio_net_ctrl_ack));
>  }
>
> -static int vhost_vdpa_net_load_mac(VhostVDPAState *s, const VirtIONet *n)
> +/**
> + * vhost_vdpa_net_load_mac() restores the NIC mac through SVQ.
> + *
> + * Return the number of elements added to SVQ if success.
> + */
> +static int vhost_vdpa_net_load_mac(VhostVDPAState *s, const VirtIONet *n,
> +
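The rest of the patch is cut off in this archive. As a rough sketch only (not
Hawkins' actual code), the deferred polling that the commit message describes
-- add and kick every command first, then poll once per queued command -- could
end up looking something like this at the end of vhost_vdpa_net_load():

```
/*
 * Sketch only, assumed shape: `cmds_in_flight` is a hypothetical counter of
 * successful vhost_vdpa_net_cvq_add() calls and `svq` is the control SVQ.
 * The ack values written by the device would still be checked against
 * VIRTIO_NET_OK afterwards.
 */
for (int i = 0; i < cmds_in_flight; ++i) {
    ssize_t dev_written = vhost_svq_poll(svq);

    if (unlikely(dev_written < sizeof(virtio_net_ctrl_ack))) {
        return -EIO;
    }
}
```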

Re: [PATCH v2] hw/riscv: virt: Assume M-mode FW in pflash0 only when "-bios none"

2023-05-16 Thread Sunil V L
On Wed, May 17, 2023 at 02:57:12PM +1000, Alistair Francis wrote:
> On Mon, May 8, 2023 at 9:45 PM Andrea Bolognani  wrote:
> >
> > On Mon, May 08, 2023 at 04:53:46PM +0530, Sunil V L wrote:
> > > On Mon, May 08, 2023 at 03:00:02AM -0700, Andrea Bolognani wrote:
> > > > I think that it's more important to align with other architectures.
> 
> That's true, ideally we want to match what people are already doing.
> 
> > > >
> > > > The number of people currently running edk2 on RISC-V is probably
> > > > vanishingly small, and in my opinion requiring them to tweak their
> > > > command lines a bit is a fair price to pay to avoid having to carry a
> > > > subtle difference between architectures for years to come.
> > >
> > > It is not just tweaking the command line. The current EDK2 will not work
> > > anymore if code is moved to pflash0 since EDK2 assumed its entry point
> > > is in pflash1. I agree there may not be too many users but if we have
> > > to align with other archs, there will be combinations of qemu and
> > > edk2 versions which won't work.
> >
> > Right.
> >
> > > > With that in mind, my preference would be to go back to v1.
> > >
> > > Thanks! If this is the preference, we can request people to use proper
> > > versions of EDK2 with different qemu versions.
> >
> > Yeah, in the (not so) long run this will just not matter, as the
> > versions of edk2 and QEMU available to people will all implement the
> > new behavior. Better to optimize for the long future ahead of us
> > rather than causing ongoing pain for the sake of the few users of a
> > work-in-progress board.
> >
> > > > Taking a step back, what is even the use case for having M-mode code
> > > > in pflash0? If you want to use an M-mode firmware, can't you just use
> > > > -bios instead? In other words, can we change the behavior so that
> > > > pflash being present always mean loading S-mode firmware off it?
> 
> It was originally added to support Oreboot (the Rust version of
> Coreboot). The idea was that Oreboot (ROM) would be in flash and then
> go from there.
> 
> It also applies to other ROM code that a user might want to test that
> runs before OpenSBI.
> 
> > >
> > > TBH, I don't know. I am sure Alistair would know since it was added in
> > > https://github.com/qemu/qemu/commit/1c20d3ff6004b600336c52cbef9f134fad3ccd94
> > > I don't think opensbi can be launched from pflash. So, it may be some
> > > other use case which I am not aware of.
> > >
> > > I will be happy if this can be avoided by using -bios.
> >
> > The actual commit would be [1], from late 2019. Things might have
> > changed in the intervening ~3.5 years. Let's wait to hear from
> > Alistair :)
> 
> Overall for this patch I don't feel strongly about following what ARM
> does or continuing with what we already have. I would prefer to match
> other archs if we can though.
> 
> Also, either way we should update the documentation in
> docs/system/riscv/virt.rst to describe what happens.
> 
Thanks! Alistair. My reminder mail was sent just before seeing this
response. Sorry about that.

Let me go back to v1 and also update the virt.rst and send v3.

Thanks!
Sunil



Re: [PATCH v2] hw/riscv: virt: Assume M-mode FW in pflash0 only when "-bios none"

2023-05-16 Thread Sunil V L
On Mon, May 08, 2023 at 04:44:22AM -0700, Andrea Bolognani wrote:
> On Mon, May 08, 2023 at 04:53:46PM +0530, Sunil V L wrote:
> > On Mon, May 08, 2023 at 03:00:02AM -0700, Andrea Bolognani wrote:
> > > I think that it's more important to align with other architectures.
> > >
> > > The number of people currently running edk2 on RISC-V is probably
> > > vanishingly small, and in my opinion requiring them to tweak their
> > > command lines a bit is a fair price to pay to avoid having to carry a
> > > subtle difference between architectures for years to come.
> >
> > It is not just tweaking the command line. The current EDK2 will not work
> > anymore if code is moved to pflash0 since EDK2 assumed its entry point
> > is in pflash1. I agree there may not be too many users but if we have
> > to align with other archs, there will be combinations of qemu and
> > edk2 versions which won't work.
> 
> Right.
> 
> > > With that in mind, my preference would be to go back to v1.
> >
> > Thanks! If this is the preference, we can request people to use proper
> > versions of EDK2 with different qemu versions.
> 
> Yeah, in the (not so) long run this will just not matter, as the
> versions of edk2 and QEMU available to people will all implement the
> new behavior. Better to optimize for the long future ahead of us
> rather than causing ongoing pain for the sake of the few users of a
> work-in-progress board.
> 
> > > Taking a step back, what is even the use case for having M-mode code
> > > in pflash0? If you want to use an M-mode firmware, can't you just use
> > > -bios instead? In other words, can we change the behavior so that
> > > pflash being present always mean loading S-mode firmware off it?
> >
> > TBH, I don't know. I am sure Alistair would know since it was added in
> > https://github.com/qemu/qemu/commit/1c20d3ff6004b600336c52cbef9f134fad3ccd94
> > I don't think opensbi can be launched from pflash. So, it may be some
> > other use case which I am not aware of.
> >
> > I will be happy if this can be avoided by using -bios.
> 
> The actual commit would be [1], from late 2019. Things might have
> changed in the intervening ~3.5 years. Let's wait to hear from
> Alistair :)
> 
> 
> [1] 
> https://github.com/qemu/qemu/commit/2738b3b555efaf206b814677966e8e3510c64a8a
> -- 
Hi Alistair,

Could you please provide your inputs on whether we can remove this
pflash0 check completely and assume pflash will always have S-mode
payload? 

I realized you responded to a similar patch from Yong at [1], which I
missed since qemu-riscv was not copied. My v2 patch is similar to Yong's
patch, but the feedback from distro experts is that it is better we align
with other architectures.

Based on your feedback, I will modify the patch and send v3.

[1] - https://lists.gnu.org/archive/html/qemu-devel/2023-05/msg04023.html

Thanks
Sunil



Re: [PATCH v4] hw/riscv: qemu crash when NUMA nodes exceed available CPUs

2023-05-16 Thread Alistair Francis
On Tue, May 16, 2023 at 10:55 AM Yin Wang  wrote:
>
> Command "qemu-system-riscv64 -machine virt
> -m 2G -smp 1 -numa node,mem=1G -numa node,mem=1G"
> would trigger this problem. Backtrace:
>  #0  0x55b5b1a4 in riscv_numa_get_default_cpu_node_id  at 
> ../hw/riscv/numa.c:211
>  #1  0x558ce510 in machine_numa_finish_cpu_init  at 
> ../hw/core/machine.c:1230
>  #2  0x558ce9d3 in machine_run_board_init  at 
> ../hw/core/machine.c:1346
>  #3  0x55aaedc3 in qemu_init_board  at ../softmmu/vl.c:2513
>  #4  0x55aaf064 in qmp_x_exit_preconfig  at ../softmmu/vl.c:2609
>  #5  0x55ab1916 in qemu_init  at ../softmmu/vl.c:3617
>  #6  0x5585463b in main  at ../softmmu/main.c:47
> This commit fixes the issue by adding parameter checks.
>
> Reviewed-by: Daniel Henrique Barboza 
> Reviewed-by: LIU Zhiwei 
> Reviewed-by: Weiwei Li 
> Signed-off-by: Yin Wang 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  hw/riscv/numa.c | 6 ++
>  1 file changed, 6 insertions(+)
>
> diff --git a/hw/riscv/numa.c b/hw/riscv/numa.c
> index 4720102561..e0414d5b1b 100644
> --- a/hw/riscv/numa.c
> +++ b/hw/riscv/numa.c
> @@ -207,6 +207,12 @@ int64_t riscv_numa_get_default_cpu_node_id(const 
> MachineState *ms, int idx)
>  {
>  int64_t nidx = 0;
>
> +if (ms->numa_state->num_nodes > ms->smp.cpus) {
> +error_report("Number of NUMA nodes (%d)"
> + " cannot exceed the number of available CPUs (%d).",
> + ms->numa_state->num_nodes, ms->smp.max_cpus);
> +exit(EXIT_FAILURE);
> +}
>  if (ms->numa_state->num_nodes) {
>  nidx = idx / (ms->smp.cpus / ms->numa_state->num_nodes);
>  if (ms->numa_state->num_nodes <= nidx) {
> --
> 2.34.1
>
>



Re: [PATCH v2] hw/riscv: virt: Assume M-mode FW in pflash0 only when "-bios none"

2023-05-16 Thread Alistair Francis
On Mon, May 8, 2023 at 9:45 PM Andrea Bolognani  wrote:
>
> On Mon, May 08, 2023 at 04:53:46PM +0530, Sunil V L wrote:
> > On Mon, May 08, 2023 at 03:00:02AM -0700, Andrea Bolognani wrote:
> > > I think that it's more important to align with other architectures.

That's true, ideally we want to match what people are already doing.

> > >
> > > The number of people currently running edk2 on RISC-V is probably
> > > vanishingly small, and in my opinion requiring them to tweak their
> > > command lines a bit is a fair price to pay to avoid having to carry a
> > > subtle difference between architectures for years to come.
> >
> > It is not just tweaking the command line. The current EDK2 will not work
> > anymore if code is moved to pflash0 since EDK2 assumed its entry point
> > is in pflash1. I agree there may not be too many users but if we have
> > to align with other archs, there will be combinations of qemu and
> > edk2 versions which won't work.
>
> Right.
>
> > > With that in mind, my preference would be to go back to v1.
> >
> > Thanks! If this is the preference, we can request people to use proper
> > versions of EDK2 with different qemu versions.
>
> Yeah, in the (not so) long run this will just not matter, as the
> versions of edk2 and QEMU available to people will all implement the
> new behavior. Better to optimize for the long future ahead of us
> rather than causing ongoing pain for the sake of the few users of a
> work-in-progress board.
>
> > > Taking a step back, what is even the use case for having M-mode code
> > > in pflash0? If you want to use an M-mode firmware, can't you just use
> > > -bios instead? In other words, can we change the behavior so that
> > > pflash being present always mean loading S-mode firmware off it?

It was originally added to support Oreboot (the Rust version of
Coreboot). The idea was that Oreboot (ROM) would be in flash and then
go from there.

It also applies to other ROM code that a user might want to test that
runs before OpenSBI.

> >
> > TBH, I don't know. I am sure Alistair would know since it was added in
> > https://github.com/qemu/qemu/commit/1c20d3ff6004b600336c52cbef9f134fad3ccd94
> > I don't think opensbi can be launched from pflash. So, it may be some
> > other use case which I am not aware of.
> >
> > I will be happy if this can be avoided by using -bios.
>
> The actual commit would be [1], from late 2019. Things might have
> changed in the intervening ~3.5 years. Let's wait to hear from
> Alistair :)

Overall for this patch I don't feel strongly about following what ARM
does or continuing with what we already have. I would prefer to match
other archs if we can though.

Also, either way we should update the documentation in
docs/system/riscv/virt.rst to describe what happens.

Alistair

>
>
> [1] 
> https://github.com/qemu/qemu/commit/2738b3b555efaf206b814677966e8e3510c64a8a
> --
> Andrea Bolognani / Red Hat / Virtualization
>
>



Re: [PATCH v8 11/11] target/riscv: rework write_misa()

2023-05-16 Thread Alistair Francis
On Fri, May 12, 2023 at 10:42 PM Daniel Henrique Barboza wrote:
>
> Alistair,
>
>
> Since this is the only patch that is being contested, is it possible to
> apply all the other ones to riscv-to-apply.next?
>
> I'm asking because I'm going to send more code that will affect cpu_init() and
> riscv_cpu_realize() functions, and it would be easier if the cleanups were
> already in 'next'. Otherwise I'll either have to base the new code on top of
> this pending series or I'll base them under riscv-to-apply.next and we'll have
> to deal with conflicts.

Urgh, that's fair.

I just replied to your other thread. Let's throw a guest error print
in here and then we can merge it

Alistair

>
>
> Thanks,
>
> Daniel
>
> On 4/21/23 10:27, Daniel Henrique Barboza wrote:
> > write_misa() must use as much common logic as possible. We want to open
> > code just the bits that are exclusive to the CSR write operation and TCG
> > internals.
> >
> > Our validation is done with riscv_cpu_validate_set_extensions(), but we
> > need a small tweak first. When enabling RVG we're doing:
> >
> >  env->misa_ext |= RVI | RVM | RVA | RVF | RVD;
> >  env->misa_ext_mask = env->misa_ext;
> >
> > This works fine for realize() time but this can potentially overwrite
> > env->misa_ext_mask if we reutilize the function for write_misa().
> >
> > Instead of doing misa_ext_mask = misa_ext, sum up the RVG extensions in
> > misa_ext_mask as well. This won't change realize() time behavior
> > (misa_ext_mask will be == misa_ext) and will ensure that write_misa()
> > won't change misa_ext_mask by accident.
> >
> > After that, rewrite write_misa() to work as follows:
> >
> > - mask the write using misa_ext_mask to avoid enabling unsupported
> >extensions;
> >
> > - suppress RVC if the next insn isn't aligned;
> >
> > - disable RVG if any of RVG dependencies are being disabled by the user;
> >
> > - assign env->misa_ext and run riscv_cpu_validate_set_extensions(). On
> >error, rollback env->misa_ext to its original value;
> >
> > - handle RVF and MSTATUS_FS and continue as usual.
> >
> > Let's keep write_misa() as experimental for now until this logic gains
> > enough mileage.
> >
> > Signed-off-by: Daniel Henrique Barboza 
> > Reviewed-by: Weiwei Li 
> > ---
> >   target/riscv/cpu.c |  4 ++--
> >   target/riscv/cpu.h |  1 +
> >   target/riscv/csr.c | 47 --
> >   3 files changed, 23 insertions(+), 29 deletions(-)
> >
> > diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> > index 7d407321aa..4fa720a39d 100644
> > --- a/target/riscv/cpu.c
> > +++ b/target/riscv/cpu.c
> > @@ -944,7 +944,7 @@ static void riscv_cpu_validate_misa_mxl(RISCVCPU *cpu, 
> > Error **errp)
> >* Check consistency between chosen extensions while setting
> >* cpu->cfg accordingly.
> >*/
> > -static void riscv_cpu_validate_set_extensions(RISCVCPU *cpu, Error **errp)
> > +void riscv_cpu_validate_set_extensions(RISCVCPU *cpu, Error **errp)
> >   {
> >   CPURISCVState *env = &cpu->env;
> >   Error *local_err = NULL;
> > @@ -960,7 +960,7 @@ static void riscv_cpu_validate_set_extensions(RISCVCPU 
> > *cpu, Error **errp)
> >   cpu->cfg.ext_ifencei = true;
> >
> >   env->misa_ext |= RVI | RVM | RVA | RVF | RVD;
> > -env->misa_ext_mask = env->misa_ext;
> > +env->misa_ext_mask |= RVI | RVM | RVA | RVF | RVD;
> >   }
> >
> >   if (riscv_has_ext(env, RVI) && riscv_has_ext(env, RVE)) {
> > diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
> > index 15423585d0..1f39edc687 100644
> > --- a/target/riscv/cpu.h
> > +++ b/target/riscv/cpu.h
> > @@ -548,6 +548,7 @@ bool riscv_cpu_tlb_fill(CPUState *cs, vaddr address, 
> > int size,
> >   bool probe, uintptr_t retaddr);
> >   char *riscv_isa_string(RISCVCPU *cpu);
> >   void riscv_cpu_list(void);
> > +void riscv_cpu_validate_set_extensions(RISCVCPU *cpu, Error **errp);
> >
> >   #define cpu_list riscv_cpu_list
> >   #define cpu_mmu_index riscv_cpu_mmu_index
> > diff --git a/target/riscv/csr.c b/target/riscv/csr.c
> > index 4451bd1263..4a3c57ea6f 100644
> > --- a/target/riscv/csr.c
> > +++ b/target/riscv/csr.c
> > @@ -1387,39 +1387,18 @@ static RISCVException read_misa(CPURISCVState *env, 
> > int csrno,
> >   static RISCVException write_misa(CPURISCVState *env, int csrno,
> >target_ulong val)
> >   {
> > +RISCVCPU *cpu = env_archcpu(env);
> > +uint32_t orig_misa_ext = env->misa_ext;
> > +Error *local_err = NULL;
> > +
> >   if (!riscv_cpu_cfg(env)->misa_w) {
> >   /* drop write to misa */
> >   return RISCV_EXCP_NONE;
> >   }
> >
> > -/* 'I' or 'E' must be present */
> > -if (!(val & (RVI | RVE))) {
> > -/* It is not, drop write to misa */
> > -return RISCV_EXCP_NONE;
> > -}
> > -
> > -/* 'E' excludes all other extensions */
> > -if (val & RVE) {
> > -/*
> > - * when we 
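The quoted diff breaks off here. Purely as an illustration of the "assign,
validate, rollback on error" step listed in the commit message above (this is
not the actual patch), the core of the rewritten write_misa() would be along
these lines:

```
/*
 * Sketch only: assign the masked value, reuse the common validation, and
 * roll back on failure.  cpu, orig_misa_ext and local_err are the locals
 * shown at the top of the rewritten write_misa().
 */
env->misa_ext = val;
riscv_cpu_validate_set_extensions(cpu, &local_err);
if (local_err != NULL) {
    /* Validation failed: restore the previous extension set. */
    error_free(local_err);
    env->misa_ext = orig_misa_ext;
    return RISCV_EXCP_NONE;
}
```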

Re: [PATCH v8 11/11] target/riscv: rework write_misa()

2023-05-16 Thread Alistair Francis
On Mon, May 8, 2023 at 8:29 PM Daniel Henrique Barboza wrote:
>
>
>
> On 5/7/23 20:25, Alistair Francis wrote:
> > On Fri, Apr 21, 2023 at 11:29 PM Daniel Henrique Barboza wrote:
> >>
> >> write_misa() must use as much common logic as possible. We want to open
> >> code just the bits that are exclusive to the CSR write operation and TCG
> >> internals.
> >>
> >> Our validation is done with riscv_cpu_validate_set_extensions(), but we
> >> need a small tweak first. When enabling RVG we're doing:
> >>
> >>  env->misa_ext |= RVI | RVM | RVA | RVF | RVD;
> >>  env->misa_ext_mask = env->misa_ext;
> >>
> >> This works fine for realize() time but this can potentially overwrite
> >> env->misa_ext_mask if we reutilize the function for write_misa().
> >>
> >> Instead of doing misa_ext_mask = misa_ext, sum up the RVG extensions in
> >> misa_ext_mask as well. This won't change realize() time behavior
> >> (misa_ext_mask will be == misa_ext) and will ensure that write_misa()
> >> won't change misa_ext_mask by accident.
> >>
> >> After that, rewrite write_misa() to work as follows:
> >>
> >> - mask the write using misa_ext_mask to avoid enabling unsupported
> >>extensions;
> >>
> >> - suppress RVC if the next insn isn't aligned;
> >>
> >> - disable RVG if any of RVG dependencies are being disabled by the user;
> >>
> >> - assign env->misa_ext and run riscv_cpu_validate_set_extensions(). On
> >>error, rollback env->misa_ext to its original value;
> >>
> >> - handle RVF and MSTATUS_FS and continue as usual.
> >>
> >> Let's keep write_misa() as experimental for now until this logic gains
> >> enough mileage.
> >>
> >> Signed-off-by: Daniel Henrique Barboza 
> >> Reviewed-by: Weiwei Li 
> >> ---
> >>   target/riscv/cpu.c |  4 ++--
> >>   target/riscv/cpu.h |  1 +
> >>   target/riscv/csr.c | 47 --
> >>   3 files changed, 23 insertions(+), 29 deletions(-)
> >>
> >> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> >> index 7d407321aa..4fa720a39d 100644
> >> --- a/target/riscv/cpu.c
> >> +++ b/target/riscv/cpu.c
> >> @@ -944,7 +944,7 @@ static void riscv_cpu_validate_misa_mxl(RISCVCPU *cpu, 
> >> Error **errp)
> >>* Check consistency between chosen extensions while setting
> >>* cpu->cfg accordingly.
> >>*/
> >> -static void riscv_cpu_validate_set_extensions(RISCVCPU *cpu, Error **errp)
> >> +void riscv_cpu_validate_set_extensions(RISCVCPU *cpu, Error **errp)
> >>   {
> >>   CPURISCVState *env = &cpu->env;
> >>   Error *local_err = NULL;
> >> @@ -960,7 +960,7 @@ static void riscv_cpu_validate_set_extensions(RISCVCPU 
> >> *cpu, Error **errp)
> >>   cpu->cfg.ext_ifencei = true;
> >>
> >>   env->misa_ext |= RVI | RVM | RVA | RVF | RVD;
> >> -env->misa_ext_mask = env->misa_ext;
> >> +env->misa_ext_mask |= RVI | RVM | RVA | RVF | RVD;
> >>   }
> >>
> >>   if (riscv_has_ext(env, RVI) && riscv_has_ext(env, RVE)) {
> >> diff --git a/target/riscv/cpu.h b/target/riscv/cpu.h
> >> index 15423585d0..1f39edc687 100644
> >> --- a/target/riscv/cpu.h
> >> +++ b/target/riscv/cpu.h
> >> @@ -548,6 +548,7 @@ bool riscv_cpu_tlb_fill(CPUState *cs, vaddr address, 
> >> int size,
> >>   bool probe, uintptr_t retaddr);
> >>   char *riscv_isa_string(RISCVCPU *cpu);
> >>   void riscv_cpu_list(void);
> >> +void riscv_cpu_validate_set_extensions(RISCVCPU *cpu, Error **errp);
> >>
> >>   #define cpu_list riscv_cpu_list
> >>   #define cpu_mmu_index riscv_cpu_mmu_index
> >> diff --git a/target/riscv/csr.c b/target/riscv/csr.c
> >> index 4451bd1263..4a3c57ea6f 100644
> >> --- a/target/riscv/csr.c
> >> +++ b/target/riscv/csr.c
> >> @@ -1387,39 +1387,18 @@ static RISCVException read_misa(CPURISCVState 
> >> *env, int csrno,
> >>   static RISCVException write_misa(CPURISCVState *env, int csrno,
> >>target_ulong val)
> >>   {
> >> +RISCVCPU *cpu = env_archcpu(env);
> >> +uint32_t orig_misa_ext = env->misa_ext;
> >> +Error *local_err = NULL;
> >> +
> >>   if (!riscv_cpu_cfg(env)->misa_w) {
> >>   /* drop write to misa */
> >>   return RISCV_EXCP_NONE;
> >>   }
> >>
> >> -/* 'I' or 'E' must be present */
> >> -if (!(val & (RVI | RVE))) {
> >> -/* It is not, drop write to misa */
> >> -return RISCV_EXCP_NONE;
> >> -}
> >> -
> >> -/* 'E' excludes all other extensions */
> >> -if (val & RVE) {
> >> -/*
> >> - * when we support 'E' we can do "val = RVE;" however
> >> - * for now we just drop writes if 'E' is present.
> >> - */
> >> -return RISCV_EXCP_NONE;
> >> -}
> >> -
> >> -/*
> >> - * misa.MXL writes are not supported by QEMU.
> >> - * Drop writes to those bits.
> >> - */
> >> -
> >>   /* Mask extensions that are not supported by this hart */
> >>   val &= env->misa_ext_mask;
> >>
> >> -/* 'D' depends on 'F', so clear 'D' if 'F' is 

Re: [PATCH v2 1/2] vdpa: rename vhost_vdpa_net_cvq_add()

2023-05-16 Thread Jason Wang
On Sat, May 6, 2023 at 10:07 PM Hawkins Jiawei  wrote:
>
> We want to introduce a new version of vhost_vdpa_net_cvq_add() that
> does not poll immediately after forwarding custom buffers
> to the device, so that QEMU can send all the SVQ control commands
> in parallel instead of serialized.
>
> Signed-off-by: Hawkins Jiawei 
> ---
>  net/vhost-vdpa.c | 15 +++
>  1 file changed, 11 insertions(+), 4 deletions(-)
>
> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> index 99904a0da7..10804c7200 100644
> --- a/net/vhost-vdpa.c
> +++ b/net/vhost-vdpa.c
> @@ -590,8 +590,14 @@ static void vhost_vdpa_net_cvq_stop(NetClientState *nc)
>  vhost_vdpa_net_client_stop(nc);
>  }
>
> -static ssize_t vhost_vdpa_net_cvq_add(VhostVDPAState *s, size_t out_len,
> -  size_t in_len)
> +/**
> + * vhost_vdpa_net_cvq_add_and_wait() adds SVQ control commands to SVQ,
> + * kicks the device and polls the device used buffers.
> + *
> + * Return the length written by the device.
> + */
> +static ssize_t vhost_vdpa_net_cvq_add_and_wait(VhostVDPAState *s,

Nit: is it better to use "poll" or "sync" rather than "wait"?

Other than this:

Acked-by: Jason Wang 

Thanks

> +size_t out_len, size_t in_len)
>  {
>  /* Buffers for the device */
>  const struct iovec out = {
> @@ -636,7 +642,7 @@ static ssize_t vhost_vdpa_net_load_cmd(VhostVDPAState *s, 
> uint8_t class,
>  memcpy(s->cvq_cmd_out_buffer, &ctrl, sizeof(ctrl));
>  memcpy(s->cvq_cmd_out_buffer + sizeof(ctrl), data, data_size);
>
> -return vhost_vdpa_net_cvq_add(s, sizeof(ctrl) + data_size,
> +return vhost_vdpa_net_cvq_add_and_wait(s, sizeof(ctrl) + data_size,
>sizeof(virtio_net_ctrl_ack));
>  }
>
> @@ -753,7 +759,8 @@ static int 
> vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
>  dev_written = sizeof(status);
>  *s->status = VIRTIO_NET_OK;
>  } else {
> -dev_written = vhost_vdpa_net_cvq_add(s, out.iov_len, sizeof(status));
> +dev_written = vhost_vdpa_net_cvq_add_and_wait(s, out.iov_len,
> +  sizeof(status));
>  if (unlikely(dev_written < 0)) {
>  goto out;
>  }
> --
> 2.25.1
>




bios-tables-test and iasl

2023-05-16 Thread Ani Sinha
So I was working late scratching my head yesterday on why I was not getting my 
ASL diff on mismatched blobs! Turns out there were two things:

(a) iasl was not installed and I completely forgot about it because my old 
setup is gone and the new box is, well, new, and I did not get to mess around 
with tables up until now. This was easy to figure out.
(b) I recalled that previously all I had to do was install iasl and the binary 
would just produce the diff because it would discover iasl in the PATH. Now it 
seemed that no matter what I did I was not able to get the diff until I rebased 
and triggered a clean build of the qemu tree. Turns out, we now check for iasl 
from meson and CONFIG_IASL is set at build time (if it is not set, *iasl is 
NULL and no diff is generated).

So I wonder if we have made our lives any easier? Why should we need to rebuild 
the entire tree if all we wanted was to debug a test breakage? Sure we could 
always run iasl manually but isn’t it easier to simply run "V=1 make 
check-qtest” and let the test spit it out for us?

A
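One possible direction (just a sketch, not existing code): have the test look
iasl up at run time instead of relying on the configure-time CONFIG_IASL
switch, e.g. via glib:

```
#include <glib.h>

/*
 * Sketch: resolve iasl from PATH when the test runs, so installing it later
 * does not require reconfiguring and rebuilding the tree.
 */
static const char *find_iasl(void)
{
    static gchar *iasl_path;

    if (!iasl_path) {
        iasl_path = g_find_program_in_path("iasl"); /* NULL if not installed */
    }
    return iasl_path;
}
```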



Re: [PATCH v2] vhost: fix possible wrap in SVQ descriptor ring

2023-05-16 Thread Jason Wang



On 2023/5/9 16:48, Hawkins Jiawei wrote:

QEMU invokes vhost_svq_add() when adding a guest's element
into SVQ. In vhost_svq_add(), it uses vhost_svq_available_slots()
to check whether QEMU can add the element into SVQ. If there is
enough space, then QEMU combines some out descriptors and some
in descriptors into one descriptor chain, and adds it into
`svq->vring.desc` by vhost_svq_vring_write_descs().

Yet the problem is that, `svq->shadow_avail_idx - svq->shadow_used_idx`
in vhost_svq_available_slots() returns the number of occupied elements,
or the number of descriptor chains, instead of the number of occupied
descriptors, which may cause wrapping in SVQ descriptor ring.

Here is an example. In vhost_handle_guest_kick(), QEMU forwards
as many available buffers to device by virtqueue_pop() and
vhost_svq_add_element(). virtqueue_pop() returns a guest's element,
and then this element is added into SVQ by vhost_svq_add_element(),
a wrapper to vhost_svq_add(). If QEMU invokes virtqueue_pop() and
vhost_svq_add_element() `svq->vring.num` times,
vhost_svq_available_slots() thinks QEMU just ran out of slots and
everything should work fine. But in fact, virtqueue_pop() returns
`svq->vring.num` elements or descriptor chains, more than
`svq->vring.num` descriptors due to guest memory fragmentation,
and this causes wrapping in SVQ descriptor ring.

This bug is valid even before marking the descriptors used.
If the guest memory is fragmented, SVQ must add chains
so it can try to add more descriptors than possible.

This patch solves it by adding `num_free` field in
VhostShadowVirtqueue structure and updating this field
in vhost_svq_add() and vhost_svq_get_buf(), to record
the number of free descriptors.

Fixes: 100890f7ca ("vhost: Shadow virtqueue buffers forwarding")
Signed-off-by: Hawkins Jiawei 
Acked-by: Eugenio Pérez 



Acked-by: Jason Wang 

Thanks



---
v2:
   - update the commit message
   - remove the unnecessary comment
   - add the Acked-by tag

v1: https://lists.gnu.org/archive/html/qemu-devel/2023-05/msg01727.html

  hw/virtio/vhost-shadow-virtqueue.c | 5 -
  hw/virtio/vhost-shadow-virtqueue.h | 3 +++
  2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/vhost-shadow-virtqueue.c 
b/hw/virtio/vhost-shadow-virtqueue.c
index 8361e70d1b..bd7c12b6d3 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -68,7 +68,7 @@ bool vhost_svq_valid_features(uint64_t features, Error **errp)
   */
  static uint16_t vhost_svq_available_slots(const VhostShadowVirtqueue *svq)
  {
-return svq->vring.num - (svq->shadow_avail_idx - svq->shadow_used_idx);
+return svq->num_free;
  }
  
  /**

@@ -263,6 +263,7 @@ int vhost_svq_add(VhostShadowVirtqueue *svq, const struct 
iovec *out_sg,
  return -EINVAL;
  }
  
+svq->num_free -= ndescs;

  svq->desc_state[qemu_head].elem = elem;
  svq->desc_state[qemu_head].ndescs = ndescs;
  vhost_svq_kick(svq);
@@ -449,6 +450,7 @@ static VirtQueueElement 
*vhost_svq_get_buf(VhostShadowVirtqueue *svq,
  last_used_chain = vhost_svq_last_desc_of_chain(svq, num, used_elem.id);
  svq->desc_next[last_used_chain] = svq->free_head;
  svq->free_head = used_elem.id;
+svq->num_free += num;
  
  *len = used_elem.len;

  return g_steal_pointer(&svq->desc_state[used_elem.id].elem);
@@ -659,6 +661,7 @@ void vhost_svq_start(VhostShadowVirtqueue *svq, 
VirtIODevice *vdev,
  svq->iova_tree = iova_tree;
  
  svq->vring.num = virtio_queue_get_num(vdev, virtio_get_queue_index(vq));

+svq->num_free = svq->vring.num;
  driver_size = vhost_svq_driver_area_size(svq);
  device_size = vhost_svq_device_area_size(svq);
  svq->vring.desc = qemu_memalign(qemu_real_host_page_size(), driver_size);
diff --git a/hw/virtio/vhost-shadow-virtqueue.h 
b/hw/virtio/vhost-shadow-virtqueue.h
index 926a4897b1..6efe051a70 100644
--- a/hw/virtio/vhost-shadow-virtqueue.h
+++ b/hw/virtio/vhost-shadow-virtqueue.h
@@ -107,6 +107,9 @@ typedef struct VhostShadowVirtqueue {
  
  /* Next head to consume from the device */

  uint16_t last_used_idx;
+
+/* Size of SVQ vring free descriptors */
+uint16_t num_free;
  } VhostShadowVirtqueue;
  
  bool vhost_svq_valid_features(uint64_t features, Error **errp);





Re: [PATCH v3 5/5] vdpa: move CVQ isolation check to net_init_vhost_vdpa

2023-05-16 Thread Jason Wang
On Tue, May 9, 2023 at 11:44 PM Eugenio Pérez  wrote:
>
> Evaluating it at start time instead of initialization time may make the
> guest capable of dynamically adding or removing migration blockers.
>
> Also, moving to initialization reduces the number of ioctls in the
> migration, reducing failure possibilities.
>
> As a drawback we need to check for CVQ isolation twice: one time with no
> MQ negotiated and another one acking it, as long as the device supports
> it.  This is because Vring ASID / group management is based on vq
> indexes, but we don't know the index of CVQ before negotiating MQ.
>
> Signed-off-by: Eugenio Pérez 
> ---
> v2: Take out the reset of the device from vhost_vdpa_cvq_is_isolated
> v3: Only record cvq_isolated, true if the device have cvq isolated in
> both !MQ and MQ configurations.
> ---
>  net/vhost-vdpa.c | 178 +++
>  1 file changed, 135 insertions(+), 43 deletions(-)
>
> diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> index 3fb833fe76..29054b77a9 100644
> --- a/net/vhost-vdpa.c
> +++ b/net/vhost-vdpa.c
> @@ -43,6 +43,10 @@ typedef struct VhostVDPAState {
>
>  /* The device always have SVQ enabled */
>  bool always_svq;
> +
> +/* The device can isolate CVQ in its own ASID */
> +bool cvq_isolated;
> +
>  bool started;
>  } VhostVDPAState;
>
> @@ -362,15 +366,8 @@ static NetClientInfo net_vhost_vdpa_info = {
>  .check_peer_type = vhost_vdpa_check_peer_type,
>  };
>
> -/**
> - * Get vring virtqueue group
> - *
> - * @device_fd  vdpa device fd
> - * @vq_index   Virtqueue index
> - *
> - * Return -errno in case of error, or vq group if success.
> - */
> -static int64_t vhost_vdpa_get_vring_group(int device_fd, unsigned vq_index)
> +static int64_t vhost_vdpa_get_vring_group(int device_fd, unsigned vq_index,
> +  Error **errp)
>  {
>  struct vhost_vring_state state = {
>  .index = vq_index,
> @@ -379,8 +376,7 @@ static int64_t vhost_vdpa_get_vring_group(int device_fd, 
> unsigned vq_index)
>
>  if (unlikely(r < 0)) {
>  r = -errno;
> -error_report("Cannot get VQ %u group: %s", vq_index,
> - g_strerror(errno));
> +error_setg_errno(errp, errno, "Cannot get VQ %u group", vq_index);
>  return r;
>  }
>
> @@ -480,9 +476,9 @@ static int vhost_vdpa_net_cvq_start(NetClientState *nc)
>  {
>  VhostVDPAState *s, *s0;
>  struct vhost_vdpa *v;
> -uint64_t backend_features;
>  int64_t cvq_group;
> -int cvq_index, r;
> +int r;
> +Error *err = NULL;
>
>  assert(nc->info->type == NET_CLIENT_DRIVER_VHOST_VDPA);
>
> @@ -502,41 +498,22 @@ static int vhost_vdpa_net_cvq_start(NetClientState *nc)
>  /*
>   * If we early return in these cases SVQ will not be enabled. The 
> migration
>   * will be blocked as long as vhost-vdpa backends will not offer _F_LOG.
> - *
> - * Calling VHOST_GET_BACKEND_FEATURES as they are not available in v->dev
> - * yet.
>   */
> -r = ioctl(v->device_fd, VHOST_GET_BACKEND_FEATURES, &backend_features);
> -if (unlikely(r < 0)) {
> -error_report("Cannot get vdpa backend_features: %s(%d)",
> -g_strerror(errno), errno);
> -return -1;
> +if (!vhost_vdpa_net_valid_svq_features(v->dev->features, NULL)) {
> +return 0;
>  }
> -if (!(backend_features & BIT_ULL(VHOST_BACKEND_F_IOTLB_ASID)) ||
> -!vhost_vdpa_net_valid_svq_features(v->dev->features, NULL)) {
> +
> +if (!s->cvq_isolated) {
>  return 0;
>  }
>
> -/*
> - * Check if all the virtqueues of the virtio device are in a different vq
> - * than the last vq. VQ group of last group passed in cvq_group.
> - */
> -cvq_index = v->dev->vq_index_end - 1;
> -cvq_group = vhost_vdpa_get_vring_group(v->device_fd, cvq_index);
> +cvq_group = vhost_vdpa_get_vring_group(v->device_fd,
> +   v->dev->vq_index_end - 1,
> +   &err);
>  if (unlikely(cvq_group < 0)) {
> +error_report_err(err);
>  return cvq_group;
>  }
> -for (int i = 0; i < cvq_index; ++i) {
> -int64_t group = vhost_vdpa_get_vring_group(v->device_fd, i);
> -
> -if (unlikely(group < 0)) {
> -return group;
> -}
> -
> -if (group == cvq_group) {
> -return 0;
> -}
> -}
>
>  r = vhost_vdpa_set_address_space_id(v, cvq_group, 
> VHOST_VDPA_NET_CVQ_ASID);
>  if (unlikely(r < 0)) {
> @@ -799,6 +776,111 @@ static const VhostShadowVirtqueueOps 
> vhost_vdpa_net_svq_ops = {
>  .avail_handler = vhost_vdpa_net_handle_ctrl_avail,
>  };
>
> +/**
> + * Probe the device to check control virtqueue is isolated.
> + *
> + * @device_fd vhost-vdpa file descriptor
> + * @features features to negotiate
> + * @cvq_index Control vq index
> + *
> + * Returns -1 in case of error, 0 if false 
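The function body is cut off here. Based only on the doc comment above (so a
sketch, not necessarily Eugenio's implementation), the probe would negotiate
the given features and then compare the CVQ group against the data virtqueue
groups:

```
/*
 * Sketch inferred from the comment: returns 1 if CVQ sits in its own vq
 * group, 0 if it shares a group with some data vq, -1 on error.
 */
static int vhost_vdpa_probe_cvq_isolation(int device_fd, uint64_t features,
                                          unsigned cvq_index, Error **errp)
{
    int64_t cvq_group;
    int r = ioctl(device_fd, VHOST_SET_FEATURES, &features);

    if (unlikely(r < 0)) {
        error_setg_errno(errp, errno, "Cannot set device features");
        return -1;
    }

    cvq_group = vhost_vdpa_get_vring_group(device_fd, cvq_index, errp);
    if (unlikely(cvq_group < 0)) {
        return -1;
    }

    for (unsigned i = 0; i < cvq_index; ++i) {
        int64_t group = vhost_vdpa_get_vring_group(device_fd, i, errp);

        if (unlikely(group < 0)) {
            return -1;
        }
        if (group == cvq_group) {
            return 0;   /* a data vq shares CVQ's group: not isolated */
        }
    }

    return 1;
}
```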

Re: [PATCH v5 3/3] hw/riscv: Validate cluster and NUMA node boundary

2023-05-16 Thread Alistair Francis
On Tue, May 9, 2023 at 10:29 AM Gavin Shan  wrote:
>
> There are two NUMA-aware RISC-V machines: 'virt' and 'spike'. Both of
> them are required to follow the cluster-NUMA-node boundary. Enable the
> validation to warn about the irregular configuration where multiple
> CPUs in one cluster have been associated with multiple NUMA nodes.
>
> Signed-off-by: Gavin Shan 
> Reviewed-by: Daniel Henrique Barboza 
> Acked-by: Igor Mammedov 

Acked-by: Alistair Francis 

Alistair

> ---
>  hw/riscv/spike.c | 2 ++
>  hw/riscv/virt.c  | 2 ++
>  2 files changed, 4 insertions(+)
>
> diff --git a/hw/riscv/spike.c b/hw/riscv/spike.c
> index 2c5546560a..81f7e53aed 100644
> --- a/hw/riscv/spike.c
> +++ b/hw/riscv/spike.c
> @@ -354,6 +354,8 @@ static void spike_machine_class_init(ObjectClass *oc, 
> void *data)
>  mc->cpu_index_to_instance_props = riscv_numa_cpu_index_to_props;
>  mc->get_default_cpu_node_id = riscv_numa_get_default_cpu_node_id;
>  mc->numa_mem_supported = true;
> +/* platform instead of architectural choice */
> +mc->cpu_cluster_has_numa_boundary = true;
>  mc->default_ram_id = "riscv.spike.ram";
>  object_class_property_add_str(oc, "signature", NULL, 
> spike_set_signature);
>  object_class_property_set_description(oc, "signature",
> diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
> index 4e3efbee16..84a2bca460 100644
> --- a/hw/riscv/virt.c
> +++ b/hw/riscv/virt.c
> @@ -1678,6 +1678,8 @@ static void virt_machine_class_init(ObjectClass *oc, 
> void *data)
>  mc->cpu_index_to_instance_props = riscv_numa_cpu_index_to_props;
>  mc->get_default_cpu_node_id = riscv_numa_get_default_cpu_node_id;
>  mc->numa_mem_supported = true;
> +/* platform instead of architectural choice */
> +mc->cpu_cluster_has_numa_boundary = true;
>  mc->default_ram_id = "riscv_virt_board.ram";
>  assert(!mc->get_hotplug_handler);
>  mc->get_hotplug_handler = virt_machine_get_hotplug_handler;
> --
> 2.23.0
>
>



Re: [PATCH v3 2/5] vdpa: add vhost_vdpa_reset_status_fd

2023-05-16 Thread Jason Wang
On Tue, May 9, 2023 at 11:44 PM Eugenio Pérez  wrote:
>
> This allows resetting a vhost-vdpa device from external subsystems like
> vhost-net, since it does not have any struct vhost_dev by the time we
> need to use it.
>
> It is used in subsequent patches to negotiate features
> and probe for CVQ ASID isolation.
>
> Reviewed-by: Stefano Garzarella 
> Signed-off-by: Eugenio Pérez 
> ---
>  include/hw/virtio/vhost-vdpa.h |  1 +
>  hw/virtio/vhost-vdpa.c | 58 +++---
>  2 files changed, 41 insertions(+), 18 deletions(-)
>
> diff --git a/include/hw/virtio/vhost-vdpa.h b/include/hw/virtio/vhost-vdpa.h
> index c278a2a8de..28de7da91e 100644
> --- a/include/hw/virtio/vhost-vdpa.h
> +++ b/include/hw/virtio/vhost-vdpa.h
> @@ -54,6 +54,7 @@ typedef struct vhost_vdpa {
>  VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX];
>  } VhostVDPA;
>
> +void vhost_vdpa_reset_status_fd(int fd);
>  int vhost_vdpa_get_iova_range(int fd, struct vhost_vdpa_iova_range 
> *iova_range);
>
>  int vhost_vdpa_dma_map(struct vhost_vdpa *v, uint32_t asid, hwaddr iova,
> diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
> index bbabea18f3..7a2053b8d9 100644
> --- a/hw/virtio/vhost-vdpa.c
> +++ b/hw/virtio/vhost-vdpa.c
> @@ -335,38 +335,45 @@ static const MemoryListener vhost_vdpa_memory_listener 
> = {
>  .region_del = vhost_vdpa_listener_region_del,
>  };
>
> -static int vhost_vdpa_call(struct vhost_dev *dev, unsigned long int request,
> - void *arg)
> +static int vhost_vdpa_dev_fd(const struct vhost_dev *dev)
>  {
>  struct vhost_vdpa *v = dev->opaque;
> -int fd = v->device_fd;
> -int ret;
>
>  assert(dev->vhost_ops->backend_type == VHOST_BACKEND_TYPE_VDPA);
> +return v->device_fd;
> +}

Nit: unless the vhost_dev structure is opaque to the upper layer, I
don't see any advantage for having a dedicated indirect helper to get
device_fd.

> +
> +static int vhost_vdpa_call_fd(int fd, unsigned long int request, void *arg)
> +{
> +int ret = ioctl(fd, request, arg);
>
> -ret = ioctl(fd, request, arg);
>  return ret < 0 ? -errno : ret;
>  }
>
> -static int vhost_vdpa_add_status(struct vhost_dev *dev, uint8_t status)
> +static int vhost_vdpa_call(struct vhost_dev *dev, unsigned long int request,
> +   void *arg)
> +{
> +return vhost_vdpa_call_fd(vhost_vdpa_dev_fd(dev), request, arg);
> +}
> +
> +static int vhost_vdpa_add_status_fd(int fd, uint8_t status)
>  {
>  uint8_t s;
>  int ret;
>
> -trace_vhost_vdpa_add_status(dev, status);
> -ret = vhost_vdpa_call(dev, VHOST_VDPA_GET_STATUS, &s);
> +ret = vhost_vdpa_call_fd(fd, VHOST_VDPA_GET_STATUS, &s);
>  if (ret < 0) {
>  return ret;
>  }
>
>  s |= status;
>
> -ret = vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, &s);
> +ret = vhost_vdpa_call_fd(fd, VHOST_VDPA_SET_STATUS, &s);
>  if (ret < 0) {
>  return ret;
>  }
>
> -ret = vhost_vdpa_call(dev, VHOST_VDPA_GET_STATUS, &s);
> +ret = vhost_vdpa_call_fd(fd, VHOST_VDPA_GET_STATUS, &s);
>  if (ret < 0) {
>  return ret;
>  }
> @@ -378,6 +385,12 @@ static int vhost_vdpa_add_status(struct vhost_dev *dev, 
> uint8_t status)
>  return 0;
>  }
>
> +static int vhost_vdpa_add_status(struct vhost_dev *dev, uint8_t status)
> +{
> +trace_vhost_vdpa_add_status(dev, status);
> +return vhost_vdpa_add_status_fd(vhost_vdpa_dev_fd(dev), status);
> +}
> +
>  int vhost_vdpa_get_iova_range(int fd, struct vhost_vdpa_iova_range 
> *iova_range)
>  {
>  int ret = ioctl(fd, VHOST_VDPA_GET_IOVA_RANGE, iova_range);
> @@ -709,16 +722,20 @@ static int vhost_vdpa_get_device_id(struct vhost_dev 
> *dev,
>  return ret;
>  }
>
> +static int vhost_vdpa_reset_device_fd(int fd)
> +{
> +uint8_t status = 0;
> +
> +return vhost_vdpa_call_fd(fd, VHOST_VDPA_SET_STATUS, &status);
> +}
> +
>  static int vhost_vdpa_reset_device(struct vhost_dev *dev)
>  {
>  struct vhost_vdpa *v = dev->opaque;
> -int ret;
> -uint8_t status = 0;
>
> -ret = vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, &status);
> -trace_vhost_vdpa_reset_device(dev);
>  v->suspended = false;
> -return ret;
> +trace_vhost_vdpa_reset_device(dev);
> +return vhost_vdpa_reset_device_fd(vhost_vdpa_dev_fd(dev));
>  }
>
>  static int vhost_vdpa_get_vq_index(struct vhost_dev *dev, int idx)
> @@ -1170,6 +1187,13 @@ static int vhost_vdpa_dev_start(struct vhost_dev *dev, 
> bool started)
>  return 0;
>  }
>
> +void vhost_vdpa_reset_status_fd(int fd)
> +{
> +vhost_vdpa_reset_device_fd(fd);
> +vhost_vdpa_add_status_fd(fd, VIRTIO_CONFIG_S_ACKNOWLEDGE |
> + VIRTIO_CONFIG_S_DRIVER);

I would like to rename this function since it does more than just reset.

Thanks

> +}
> +
>  static void vhost_vdpa_reset_status(struct vhost_dev *dev)
>  {
>  struct vhost_vdpa *v = dev->opaque;
> @@ -1178,9 +1202,7 @@ static void vhost_vdpa_reset_status(struct 

Re: [PATCH v4 1/3] target/riscv: smstateen check for fcsr

2023-05-16 Thread Alistair Francis
On Tue, May 2, 2023 at 12:00 AM Mayuresh Chitale wrote:
>
> If smstateen is implemented and smstateen0.fcsr is clear and misa.F
> is off then the floating point operations must return illegal
> instruction exception or virtual instruction trap, if relevant.

Do you mind re-wording this commit message? I can't get my head around
it. You talk about returning an illegal instruction exception, but
most of this patch is just adding SMSTATEEN0_FCSR to the write mask if
floating point is disabled.

It looks to me like you are returning an exception trying to access a
floating point register if FP is off and SMSTATEEN0_FCSR is not set
(which you describe) but also then only allow changing SMSTATEEN0_FCSR
if the RVF is not enabled, which is where I'm confused.

Your patch seems to be correct, I think the commit message and title
just needs a small tweak. Maybe something like this:

```
target/riscv: smstateen add support for fcsr bit

If smstateen is implemented and SMSTATEEN0.FCSR is zero floating point
CSR access should raise an illegal instruction exception or virtual
equivalent as required.

We also allow the guest to set/unset the FCSR bit, but only if misa.F
== 0, as defined in the spec.
```

Alistair

>
> Signed-off-by: Mayuresh Chitale 
> Reviewed-by: Weiwei Li 
> ---
>  target/riscv/csr.c | 15 +++
>  1 file changed, 15 insertions(+)
>
> diff --git a/target/riscv/csr.c b/target/riscv/csr.c
> index 4451bd1263..3f6b824bd2 100644
> --- a/target/riscv/csr.c
> +++ b/target/riscv/csr.c
> @@ -82,6 +82,10 @@ static RISCVException fs(CPURISCVState *env, int csrno)
>  !riscv_cpu_cfg(env)->ext_zfinx) {
>  return RISCV_EXCP_ILLEGAL_INST;
>  }
> +
> +if (!env->debugger && !riscv_cpu_fp_enabled(env)) {
> +return smstateen_acc_ok(env, 0, SMSTATEEN0_FCSR);
> +}
>  #endif
>  return RISCV_EXCP_NONE;
>  }
> @@ -2100,6 +2104,9 @@ static RISCVException write_mstateen0(CPURISCVState 
> *env, int csrno,
>target_ulong new_val)
>  {
>  uint64_t wr_mask = SMSTATEEN_STATEEN | SMSTATEEN0_HSENVCFG;
> +if (!riscv_has_ext(env, RVF)) {
> +wr_mask |= SMSTATEEN0_FCSR;
> +}
>
>  return write_mstateen(env, csrno, wr_mask, new_val);
>  }
> @@ -2173,6 +2180,10 @@ static RISCVException write_hstateen0(CPURISCVState 
> *env, int csrno,
>  {
>  uint64_t wr_mask = SMSTATEEN_STATEEN | SMSTATEEN0_HSENVCFG;
>
> +if (!riscv_has_ext(env, RVF)) {
> +wr_mask |= SMSTATEEN0_FCSR;
> +}
> +
>  return write_hstateen(env, csrno, wr_mask, new_val);
>  }
>
> @@ -2259,6 +2270,10 @@ static RISCVException write_sstateen0(CPURISCVState 
> *env, int csrno,
>  {
>  uint64_t wr_mask = SMSTATEEN_STATEEN | SMSTATEEN0_HSENVCFG;
>
> +if (!riscv_has_ext(env, RVF)) {
> +wr_mask |= SMSTATEEN0_FCSR;
> +}
> +
>  return write_sstateen(env, csrno, wr_mask, new_val);
>  }
>
> --
> 2.34.1
>



[PATCH v2] vfio/pci: Fix a use-after-free issue

2023-05-16 Thread Zhenzhong Duan
vbasedev->name is wrongly freed, which leads to garbage in the VFIO trace log.
Fix it by allocating a dup of vbasedev->name and then freeing the dup.

Fixes: 2dca1b37a7 ("vfio/pci: add support for VF token")
Suggested-by: Alex Williamson 
Signed-off-by: Zhenzhong Duan 
---
v2: "toke" -> "token", Cedric
Update with Alex suggested change

 hw/vfio/pci.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index bf27a3990564..73874a94de12 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -2994,7 +2994,7 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
  qemu_uuid_unparse(&vdev->vf_token, uuid);
 name = g_strdup_printf("%s vf_token=%s", vbasedev->name, uuid);
 } else {
-name = vbasedev->name;
+name = g_strdup(vbasedev->name);
 }
 
 ret = vfio_get_device(group, name, vbasedev, errp);
-- 
2.34.1
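
The pattern behind the one-line change, as a standalone sketch (the helper and
its names are illustrative, not taken from hw/vfio/pci.c):

```c
#include <glib.h>

/* Both branches now return a heap-allocated string, so the caller's single
 * g_free() is always safe and never releases the device's own name. */
static char *make_device_name(const char *base, const char *token_uuid)
{
    if (token_uuid != NULL) {
        return g_strdup_printf("%s vf_token=%s", base, token_uuid);
    }
    return g_strdup(base);   /* previously the caller aliased 'base' here */
}
```

The caller then does name = make_device_name(...); ...; g_free(name);.  Before
the fix, the else path handed back vbasedev->name itself, so the later g_free()
released the device's name string and the VFIO trace points printed freed
memory.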




Re: [PATCH] hw/riscv/virt: Fix the boot logic if pflash0 is specified

2023-05-16 Thread Alistair Francis
On Sun, Apr 23, 2023 at 11:39 PM Yong Li  wrote:
>
> The firmware may be specified with -bios
> and the pflash0 device with the option -drive if=pflash.
> If both options are given, it is intended that pflash0 will
> store the secure variables and the firmware will be the one specified
> by -bios. Explicitly specify "-bios none" if you choose to boot from
> pflash0.

This seems like the right approach.

Can you update the docs/system/riscv/virt.rst docs to include this information?

Alistair

>
> Signed-off-by: Yong Li 
> Cc: "Zhiwei Liu" 
> ---
>  hw/riscv/virt.c | 8 +---
>  1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
> index 4e3efbee16..b38b41e685 100644
> --- a/hw/riscv/virt.c
> +++ b/hw/riscv/virt.c
> @@ -1296,10 +1296,12 @@ static void virt_machine_done(Notifier *notifier, 
> void *data)
>  kernel_entry = 0;
>  }
>
> -if (drive_get(IF_PFLASH, 0, 0)) {
> +if (drive_get(IF_PFLASH, 0, 0) &&
> +!strcmp(machine->firmware, "none")) {
>  /*
> - * Pflash was supplied, let's overwrite the address we jump to after
> - * reset to the base of the flash.
> + * If pflash (unit 0) was supplied and at the same time the -bios
> + * is not specified, then let's overwrite the address we jump to
> + * after reset to the base of the flash.
>   */
>  start_addr = virt_memmap[VIRT_FLASH].base;
>  }
> --
> 2.25.1
>
>
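
Concretely, the two configurations would look something like this (illustrative
command lines; file names are placeholders):

  # Firmware comes from -bios; pflash unit 0 only backs the variable store:
  $ qemu-system-riscv64 -M virt -bios firmware.bin \
      -drive if=pflash,unit=0,format=raw,file=vars.img

  # Boot from pflash unit 0 itself; "-bios none" must be given explicitly:
  $ qemu-system-riscv64 -M virt -bios none \
      -drive if=pflash,unit=0,format=raw,file=flash0.img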



Re: [PATCH] target/riscv: Move zc* out of the experimental properties

2023-05-16 Thread Alistair Francis
On Wed, May 10, 2023 at 1:02 PM Weiwei Li  wrote:
>
> Zc* extensions (version 1.0) are ratified.
>
> Signed-off-by: Weiwei Li 
> Signed-off-by: Junqiang Wang 

Thanks!

Applied to riscv-to-apply.next

Alistair

> ---
>  target/riscv/cpu.c | 16 
>  1 file changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> index db0875fb43..99ed9cb80e 100644
> --- a/target/riscv/cpu.c
> +++ b/target/riscv/cpu.c
> @@ -1571,6 +1571,14 @@ static Property riscv_cpu_extensions[] = {
>
>  DEFINE_PROP_BOOL("zmmul", RISCVCPU, cfg.ext_zmmul, false),
>
> +DEFINE_PROP_BOOL("zca", RISCVCPU, cfg.ext_zca, false),
> +DEFINE_PROP_BOOL("zcb", RISCVCPU, cfg.ext_zcb, false),
> +DEFINE_PROP_BOOL("zcd", RISCVCPU, cfg.ext_zcd, false),
> +DEFINE_PROP_BOOL("zce", RISCVCPU, cfg.ext_zce, false),
> +DEFINE_PROP_BOOL("zcf", RISCVCPU, cfg.ext_zcf, false),
> +DEFINE_PROP_BOOL("zcmp", RISCVCPU, cfg.ext_zcmp, false),
> +DEFINE_PROP_BOOL("zcmt", RISCVCPU, cfg.ext_zcmt, false),
> +
>  /* Vendor-specific custom extensions */
>  DEFINE_PROP_BOOL("xtheadba", RISCVCPU, cfg.ext_xtheadba, false),
>  DEFINE_PROP_BOOL("xtheadbb", RISCVCPU, cfg.ext_xtheadbb, false),
> @@ -1588,14 +1596,6 @@ static Property riscv_cpu_extensions[] = {
>  /* These are experimental so mark with 'x-' */
>  DEFINE_PROP_BOOL("x-zicond", RISCVCPU, cfg.ext_zicond, false),
>
> -DEFINE_PROP_BOOL("x-zca", RISCVCPU, cfg.ext_zca, false),
> -DEFINE_PROP_BOOL("x-zcb", RISCVCPU, cfg.ext_zcb, false),
> -DEFINE_PROP_BOOL("x-zcd", RISCVCPU, cfg.ext_zcd, false),
> -DEFINE_PROP_BOOL("x-zce", RISCVCPU, cfg.ext_zce, false),
> -DEFINE_PROP_BOOL("x-zcf", RISCVCPU, cfg.ext_zcf, false),
> -DEFINE_PROP_BOOL("x-zcmp", RISCVCPU, cfg.ext_zcmp, false),
> -DEFINE_PROP_BOOL("x-zcmt", RISCVCPU, cfg.ext_zcmt, false),
> -
>  /* ePMP 0.9.3 */
>  DEFINE_PROP_BOOL("x-epmp", RISCVCPU, cfg.epmp, false),
>  DEFINE_PROP_BOOL("x-smaia", RISCVCPU, cfg.ext_smaia, false),
> --
> 2.25.1
>
>



Re: [PATCH v5 13/13] target/riscv: Deny access if access is partially inside the PMP entry

2023-05-16 Thread Alistair Francis
On Sat, Apr 29, 2023 at 12:38 AM Weiwei Li  wrote:
>
> Access will fail if it is partially inside the PMP entry.
> However, only setting ret = false doesn't really mean a pmp violation,
> since pmp_hart_has_privs_default() may return true at the end of
> pmp_hart_has_privs().
>
> Signed-off-by: Weiwei Li 
> Signed-off-by: Junqiang Wang 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  target/riscv/pmp.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/target/riscv/pmp.c b/target/riscv/pmp.c
> index 317c28ba73..1ee8899d04 100644
> --- a/target/riscv/pmp.c
> +++ b/target/riscv/pmp.c
> @@ -327,8 +327,8 @@ bool pmp_hart_has_privs(CPURISCVState *env, target_ulong 
> addr,
>  if ((s + e) == 1) {
>  qemu_log_mask(LOG_GUEST_ERROR,
>"pmp violation - access is partially inside\n");
> -ret = false;
> -break;
> +*allowed_privs = 0;
> +return false;
>  }
>
>  /* fully inside */
> --
> 2.25.1
>
>



Re: [PATCH v5 12/13] target/riscv: Separate pmp_update_rule() in pmpcfg_csr_write

2023-05-16 Thread Alistair Francis
On Sat, Apr 29, 2023 at 12:38 AM Weiwei Li  wrote:
>
> Use pmp_update_rule_addr() and pmp_update_rule_nums() separately to
> update the rule nums only once for each pmpcfg_csr_write. Then remove
> pmp_update_rule() since it becomes unused.
>
> Signed-off-by: Weiwei Li 
> Signed-off-by: Junqiang Wang 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  target/riscv/pmp.c | 16 ++--
>  1 file changed, 2 insertions(+), 14 deletions(-)
>
> diff --git a/target/riscv/pmp.c b/target/riscv/pmp.c
> index 330f61b0f1..317c28ba73 100644
> --- a/target/riscv/pmp.c
> +++ b/target/riscv/pmp.c
> @@ -29,7 +29,6 @@
>  static bool pmp_write_cfg(CPURISCVState *env, uint32_t addr_index,
>uint8_t val);
>  static uint8_t pmp_read_cfg(CPURISCVState *env, uint32_t addr_index);
> -static void pmp_update_rule(CPURISCVState *env, uint32_t pmp_index);
>
>  /*
>   * Accessor method to extract address matching type 'a field' from cfg reg
> @@ -121,7 +120,7 @@ static bool pmp_write_cfg(CPURISCVState *env, uint32_t 
> pmp_index, uint8_t val)
>  qemu_log_mask(LOG_GUEST_ERROR, "ignoring pmpcfg write - 
> locked\n");
>  } else if (env->pmp_state.pmp[pmp_index].cfg_reg != val) {
>  env->pmp_state.pmp[pmp_index].cfg_reg = val;
> -pmp_update_rule(env, pmp_index);
> +pmp_update_rule_addr(env, pmp_index);
>  return true;
>  }
>  } else {
> @@ -209,18 +208,6 @@ void pmp_update_rule_nums(CPURISCVState *env)
>  }
>  }
>
> -/*
> - * Convert cfg/addr reg values here into simple 'sa' --> start address and 
> 'ea'
> - *   end address values.
> - *   This function is called relatively infrequently whereas the check that
> - *   an address is within a pmp rule is called often, so optimise that one
> - */
> -static void pmp_update_rule(CPURISCVState *env, uint32_t pmp_index)
> -{
> -pmp_update_rule_addr(env, pmp_index);
> -pmp_update_rule_nums(env);
> -}
> -
>  static int pmp_is_in_range(CPURISCVState *env, int pmp_index,
> target_ulong addr)
>  {
> @@ -481,6 +468,7 @@ void pmpcfg_csr_write(CPURISCVState *env, uint32_t 
> reg_index,
>
>  /* If PMP permission of any addr has been changed, flush TLB pages. */
>  if (modified) {
> +pmp_update_rule_nums(env);
>  tlb_flush(env_cpu(env));
>  }
>  }
> --
> 2.25.1
>
>



Re: [PATCH v5 08/13] target/riscv: Update the next rule addr in pmpaddr_csr_write()

2023-05-16 Thread Alistair Francis
On Sat, Apr 29, 2023 at 12:38 AM Weiwei Li  wrote:
>
> Currently only the rule addr at the same index as the pmpaddr is updated
> when a pmpaddr CSR is modified. However, the rule addr of the next PMP entry
> may also be affected if its A field is PMP_AMATCH_TOR. So we should
> also update it in this case.
>
> A write to a pmpaddr CSR will not affect the rule nums, so we needn't
> call pmp_update_rule_nums() in pmpaddr_csr_write().
>
> Signed-off-by: Weiwei Li 
> Signed-off-by: Junqiang Wang 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  target/riscv/pmp.c | 10 +++---
>  1 file changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/target/riscv/pmp.c b/target/riscv/pmp.c
> index 80889a1185..3af2caff31 100644
> --- a/target/riscv/pmp.c
> +++ b/target/riscv/pmp.c
> @@ -507,6 +507,7 @@ void pmpaddr_csr_write(CPURISCVState *env, uint32_t 
> addr_index,
> target_ulong val)
>  {
>  trace_pmpaddr_csr_write(env->mhartid, addr_index, val);
> +bool is_next_cfg_tor = false;
>
>  if (addr_index < MAX_RISCV_PMPS) {
>  /*
> @@ -515,9 +516,9 @@ void pmpaddr_csr_write(CPURISCVState *env, uint32_t 
> addr_index,
>   */
>  if (addr_index + 1 < MAX_RISCV_PMPS) {
>  uint8_t pmp_cfg = env->pmp_state.pmp[addr_index + 1].cfg_reg;
> +is_next_cfg_tor = PMP_AMATCH_TOR == pmp_get_a_field(pmp_cfg);
>
> -if (pmp_cfg & PMP_LOCK &&
> -PMP_AMATCH_TOR == pmp_get_a_field(pmp_cfg)) {
> +if (pmp_cfg & PMP_LOCK && is_next_cfg_tor) {
>  qemu_log_mask(LOG_GUEST_ERROR,
>"ignoring pmpaddr write - pmpcfg + 1 
> locked\n");
>  return;
> @@ -526,7 +527,10 @@ void pmpaddr_csr_write(CPURISCVState *env, uint32_t 
> addr_index,
>
>  if (!pmp_is_locked(env, addr_index)) {
>  env->pmp_state.pmp[addr_index].addr_reg = val;
> -pmp_update_rule(env, addr_index);
> +pmp_update_rule_addr(env, addr_index);
> +if (is_next_cfg_tor) {
> +pmp_update_rule_addr(env, addr_index + 1);
> +}
>  } else {
>  qemu_log_mask(LOG_GUEST_ERROR,
>"ignoring pmpaddr write - locked\n");
> --
> 2.25.1
>
>
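
The TOR dependency described in the commit message can be sketched standalone
(types and names here are illustrative, not from pmp.c):

```c
#include <stdint.h>

typedef struct { uint64_t sa, ea; } PmpRange;   /* decoded byte range */

/* A TOR entry i covers [pmpaddr[i-1] << 2, pmpaddr[i] << 2), i.e. it starts
 * where the previous entry's address ends (0 for entry 0).  So writing
 * pmpaddr[i] changes not only entry i's decoded range but also entry i+1's
 * range whenever entry i+1 is configured as TOR. */
static PmpRange tor_range(const uint64_t *pmpaddr, int i)
{
    uint64_t prev = (i == 0) ? 0 : pmpaddr[i - 1];

    return (PmpRange){ .sa = prev << 2, .ea = (pmpaddr[i] << 2) - 1 };
}
```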



Re: [PATCH v5 05/13] target/riscv: Make RLB/MML/MMWP bits writable only when Smepmp is enabled

2023-05-16 Thread Alistair Francis
On Sat, Apr 29, 2023 at 12:37 AM Weiwei Li  wrote:
>
> The RLB/MML/MMWP bits in the mseccfg CSR are introduced by the Smepmp extension.
> So they can only be writable and set to 1 when cfg.epmp is true.
> Then we also needn't check epmp in pmp_hart_has_privs_default().
>
> Signed-off-by: Weiwei Li 
> Signed-off-by: Junqiang Wang 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  target/riscv/pmp.c | 50 --
>  1 file changed, 26 insertions(+), 24 deletions(-)
>
> diff --git a/target/riscv/pmp.c b/target/riscv/pmp.c
> index b5808538aa..e745842973 100644
> --- a/target/riscv/pmp.c
> +++ b/target/riscv/pmp.c
> @@ -243,30 +243,28 @@ static bool pmp_hart_has_privs_default(CPURISCVState 
> *env, target_ulong addr,
>  {
>  bool ret;
>
> -if (riscv_cpu_cfg(env)->epmp) {
> -if (MSECCFG_MMWP_ISSET(env)) {
> -/*
> - * The Machine Mode Whitelist Policy (mseccfg.MMWP) is set
> - * so we default to deny all, even for M-mode.
> - */
> +if (MSECCFG_MMWP_ISSET(env)) {
> +/*
> + * The Machine Mode Whitelist Policy (mseccfg.MMWP) is set
> + * so we default to deny all, even for M-mode.
> + */
> +*allowed_privs = 0;
> +return false;
> +} else if (MSECCFG_MML_ISSET(env)) {
> +/*
> + * The Machine Mode Lockdown (mseccfg.MML) bit is set
> + * so we can only execute code in M-mode with an applicable
> + * rule. Other modes are disabled.
> + */
> +if (mode == PRV_M && !(privs & PMP_EXEC)) {
> +ret = true;
> +*allowed_privs = PMP_READ | PMP_WRITE;
> +} else {
> +ret = false;
>  *allowed_privs = 0;
> -return false;
> -} else if (MSECCFG_MML_ISSET(env)) {
> -/*
> - * The Machine Mode Lockdown (mseccfg.MML) bit is set
> - * so we can only execute code in M-mode with an applicable
> - * rule. Other modes are disabled.
> - */
> -if (mode == PRV_M && !(privs & PMP_EXEC)) {
> -ret = true;
> -*allowed_privs = PMP_READ | PMP_WRITE;
> -} else {
> -ret = false;
> -*allowed_privs = 0;
> -}
> -
> -return ret;
>  }
> +
> +return ret;
>  }
>
>  if (!riscv_cpu_cfg(env)->pmp || (mode == PRV_M)) {
> @@ -580,8 +578,12 @@ void mseccfg_csr_write(CPURISCVState *env, target_ulong 
> val)
>  }
>  }
>
> -/* Sticky bits */
> -val |= (env->mseccfg & (MSECCFG_MMWP | MSECCFG_MML));
> +if (riscv_cpu_cfg(env)->epmp) {
> +/* Sticky bits */
> +val |= (env->mseccfg & (MSECCFG_MMWP | MSECCFG_MML));
> +} else {
> +val &= ~(MSECCFG_MMWP | MSECCFG_MML | MSECCFG_RLB);
> +}
>
>  env->mseccfg = val;
>  }
> --
> 2.25.1
>
>



Re: [PATCH v5 07/13] target/riscv: Flush TLB when MMWP or MML bits are changed

2023-05-16 Thread Alistair Francis
On Sat, Apr 29, 2023 at 12:39 AM Weiwei Li  wrote:
>
> The MMWP and MML bits may affect the allowed privs of PMP entries and the
> default privs, both of which may change the allowed privs of existing
> TLB entries. So we need to flush the TLB when they are changed.
>
> Signed-off-by: Weiwei Li 
> Signed-off-by: Junqiang Wang 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  target/riscv/pmp.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/target/riscv/pmp.c b/target/riscv/pmp.c
> index d2d8429277..80889a1185 100644
> --- a/target/riscv/pmp.c
> +++ b/target/riscv/pmp.c
> @@ -578,6 +578,9 @@ void mseccfg_csr_write(CPURISCVState *env, target_ulong 
> val)
>  if (riscv_cpu_cfg(env)->epmp) {
>  /* Sticky bits */
>  val |= (env->mseccfg & (MSECCFG_MMWP | MSECCFG_MML));
> +if ((val ^ env->mseccfg) & (MSECCFG_MMWP | MSECCFG_MML)) {
> +tlb_flush(env_cpu(env));
> +}
>  } else {
>  val &= ~(MSECCFG_MMWP | MSECCFG_MML | MSECCFG_RLB);
>  }
> --
> 2.25.1
>
>



Re: [PATCH v5 06/13] target/riscv: Remove unused paramters in pmp_hart_has_privs_default()

2023-05-16 Thread Alistair Francis
On Sat, Apr 29, 2023 at 12:38 AM Weiwei Li  wrote:
>
> The addr and size parameters in pmp_hart_has_privs_default() are unused.
>
> Signed-off-by: Weiwei Li 
> Signed-off-by: Junqiang Wang 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  target/riscv/pmp.c | 9 +++--
>  1 file changed, 3 insertions(+), 6 deletions(-)
>
> diff --git a/target/riscv/pmp.c b/target/riscv/pmp.c
> index e745842973..d2d8429277 100644
> --- a/target/riscv/pmp.c
> +++ b/target/riscv/pmp.c
> @@ -236,8 +236,7 @@ static int pmp_is_in_range(CPURISCVState *env, int 
> pmp_index,
>  /*
>   * Check if the address has required RWX privs when no PMP entry is matched.
>   */
> -static bool pmp_hart_has_privs_default(CPURISCVState *env, target_ulong addr,
> -   target_ulong size, pmp_priv_t privs,
> +static bool pmp_hart_has_privs_default(CPURISCVState *env, pmp_priv_t privs,
> pmp_priv_t *allowed_privs,
> target_ulong mode)
>  {
> @@ -309,8 +308,7 @@ bool pmp_hart_has_privs(CPURISCVState *env, target_ulong 
> addr,
>
>  /* Short cut if no rules */
>  if (0 == pmp_get_num_rules(env)) {
> -return pmp_hart_has_privs_default(env, addr, size, privs,
> -  allowed_privs, mode);
> +return pmp_hart_has_privs_default(env, privs, allowed_privs, mode);
>  }
>
>  if (size == 0) {
> @@ -454,8 +452,7 @@ bool pmp_hart_has_privs(CPURISCVState *env, target_ulong 
> addr,
>
>  /* No rule matched */
>  if (!ret) {
> -ret = pmp_hart_has_privs_default(env, addr, size, privs,
> - allowed_privs, mode);
> +ret = pmp_hart_has_privs_default(env, privs, allowed_privs, mode);
>  }
>
>  return ret;
> --
> 2.25.1
>
>



Re: [PATCH v5 04/13] target/riscv: Change the return type of pmp_hart_has_privs() to bool

2023-05-16 Thread Alistair Francis
On Sat, Apr 29, 2023 at 12:38 AM Weiwei Li  wrote:
>
> We no longer need the pmp_index for matched PMP entry now.
>
> Signed-off-by: Weiwei Li 
> Signed-off-by: Junqiang Wang 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  target/riscv/cpu_helper.c |  8 
>  target/riscv/pmp.c| 32 +---
>  target/riscv/pmp.h|  8 
>  3 files changed, 21 insertions(+), 27 deletions(-)
>
> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> index 83c9699a6d..1868766082 100644
> --- a/target/riscv/cpu_helper.c
> +++ b/target/riscv/cpu_helper.c
> @@ -685,16 +685,16 @@ static int get_physical_address_pmp(CPURISCVState *env, 
> int *prot, hwaddr addr,
>  int mode)
>  {
>  pmp_priv_t pmp_priv;
> -int pmp_index = -1;
> +bool pmp_has_privs;
>
>  if (!riscv_cpu_cfg(env)->pmp) {
>  *prot = PAGE_READ | PAGE_WRITE | PAGE_EXEC;
>  return TRANSLATE_SUCCESS;
>  }
>
> -pmp_index = pmp_hart_has_privs(env, addr, size, 1 << access_type,
> -   &pmp_priv, mode);
> -if (pmp_index < 0) {
> +pmp_has_privs = pmp_hart_has_privs(env, addr, size, 1 << access_type,
> +   &pmp_priv, mode);
> +if (!pmp_has_privs) {
>  *prot = 0;
>  return TRANSLATE_PMP_FAIL;
>  }
> diff --git a/target/riscv/pmp.c b/target/riscv/pmp.c
> index 86abe1e7cd..b5808538aa 100644
> --- a/target/riscv/pmp.c
> +++ b/target/riscv/pmp.c
> @@ -296,27 +296,23 @@ static bool pmp_hart_has_privs_default(CPURISCVState 
> *env, target_ulong addr,
>
>  /*
>   * Check if the address has required RWX privs to complete desired operation
> - * Return PMP rule index if a pmp rule match
> - * Return MAX_RISCV_PMPS if default match
> - * Return negtive value if no match
> + * Return true if a pmp rule match or default match
> + * Return false if no match
>   */
> -int pmp_hart_has_privs(CPURISCVState *env, target_ulong addr,
> -   target_ulong size, pmp_priv_t privs,
> -   pmp_priv_t *allowed_privs, target_ulong mode)
> +bool pmp_hart_has_privs(CPURISCVState *env, target_ulong addr,
> +target_ulong size, pmp_priv_t privs,
> +pmp_priv_t *allowed_privs, target_ulong mode)
>  {
>  int i = 0;
> -int ret = -1;
> +bool ret = false;
>  int pmp_size = 0;
>  target_ulong s = 0;
>  target_ulong e = 0;
>
>  /* Short cut if no rules */
>  if (0 == pmp_get_num_rules(env)) {
> -if (pmp_hart_has_privs_default(env, addr, size, privs,
> -   allowed_privs, mode)) {
> -ret = MAX_RISCV_PMPS;
> -}
> -return ret;
> +return pmp_hart_has_privs_default(env, addr, size, privs,
> +  allowed_privs, mode);
>  }
>
>  if (size == 0) {
> @@ -345,7 +341,7 @@ int pmp_hart_has_privs(CPURISCVState *env, target_ulong 
> addr,
>  if ((s + e) == 1) {
>  qemu_log_mask(LOG_GUEST_ERROR,
>"pmp violation - access is partially inside\n");
> -ret = -1;
> +ret = false;
>  break;
>  }
>
> @@ -453,17 +449,15 @@ int pmp_hart_has_privs(CPURISCVState *env, target_ulong 
> addr,
>   * defined with PMP must be used. We shouldn't fallback on
>   * finding default privileges.
>   */
> -ret = i;
> +ret = true;
>  break;
>  }
>  }
>
>  /* No rule matched */
> -if (ret == -1) {
> -if (pmp_hart_has_privs_default(env, addr, size, privs,
> -   allowed_privs, mode)) {
> -ret = MAX_RISCV_PMPS;
> -}
> +if (!ret) {
> +ret = pmp_hart_has_privs_default(env, addr, size, privs,
> + allowed_privs, mode);
>  }
>
>  return ret;
> diff --git a/target/riscv/pmp.h b/target/riscv/pmp.h
> index 0a7e24750b..cf5c99f8e6 100644
> --- a/target/riscv/pmp.h
> +++ b/target/riscv/pmp.h
> @@ -72,10 +72,10 @@ target_ulong mseccfg_csr_read(CPURISCVState *env);
>  void pmpaddr_csr_write(CPURISCVState *env, uint32_t addr_index,
> target_ulong val);
>  target_ulong pmpaddr_csr_read(CPURISCVState *env, uint32_t addr_index);
> -int pmp_hart_has_privs(CPURISCVState *env, target_ulong addr,
> -   target_ulong size, pmp_priv_t privs,
> -   pmp_priv_t *allowed_privs,
> -   target_ulong mode);
> +bool pmp_hart_has_privs(CPURISCVState *env, target_ulong addr,
> +target_ulong size, pmp_priv_t privs,
> +pmp_priv_t *allowed_privs,
> +target_ulong mode);
>  target_ulong pmp_get_tlb_size(CPURISCVState *env, target_ulong addr);
>  void 

Re: [PATCH v5 03/13] target/riscv: Make the short cut really work in pmp_hart_has_privs

2023-05-16 Thread Alistair Francis
On Sat, Apr 29, 2023 at 12:38 AM Weiwei Li  wrote:
>
> Return the result directly for the short cut, since we needn't do the
> following checks on the PMP entries if there are no PMP rules.
>
> Signed-off-by: Weiwei Li 
> Signed-off-by: Junqiang Wang 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  target/riscv/pmp.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/target/riscv/pmp.c b/target/riscv/pmp.c
> index ad20a319c1..86abe1e7cd 100644
> --- a/target/riscv/pmp.c
> +++ b/target/riscv/pmp.c
> @@ -316,6 +316,7 @@ int pmp_hart_has_privs(CPURISCVState *env, target_ulong 
> addr,
> allowed_privs, mode)) {
>  ret = MAX_RISCV_PMPS;
>  }
> +return ret;
>  }
>
>  if (size == 0) {
> --
> 2.25.1
>
>



Re: [PATCH v5 02/13] target/riscv: Move pmp_get_tlb_size apart from get_physical_address_pmp

2023-05-16 Thread Alistair Francis
On Sat, Apr 29, 2023 at 12:38 AM Weiwei Li  wrote:
>
> pmp_get_tlb_size can be separated from get_physical_address_pmp and is only
> needed when ret == TRANSLATE_SUCCESS.
>
> Signed-off-by: Weiwei Li 
> Signed-off-by: Junqiang Wang 
> Reviewed-by: LIU Zhiwei 

Reviewed-by: Alistair Francis 

Alistair

> ---
>  target/riscv/cpu_helper.c | 16 ++--
>  1 file changed, 6 insertions(+), 10 deletions(-)
>
> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> index 075fc0538a..83c9699a6d 100644
> --- a/target/riscv/cpu_helper.c
> +++ b/target/riscv/cpu_helper.c
> @@ -676,14 +676,11 @@ void riscv_cpu_set_mode(CPURISCVState *env, 
> target_ulong newpriv)
>   *
>   * @env: CPURISCVState
>   * @prot: The returned protection attributes
> - * @tlb_size: TLB page size containing addr. It could be modified after PMP
> - *permission checking. NULL if not set TLB page for addr.
>   * @addr: The physical address to be checked permission
>   * @access_type: The type of MMU access
>   * @mode: Indicates current privilege level.
>   */
> -static int get_physical_address_pmp(CPURISCVState *env, int *prot,
> -target_ulong *tlb_size, hwaddr addr,
> +static int get_physical_address_pmp(CPURISCVState *env, int *prot, hwaddr 
> addr,
>  int size, MMUAccessType access_type,
>  int mode)
>  {
> @@ -703,9 +700,6 @@ static int get_physical_address_pmp(CPURISCVState *env, 
> int *prot,
>  }
>
>  *prot = pmp_priv_to_page_prot(pmp_priv);
> -if (tlb_size != NULL) {
> -*tlb_size = pmp_get_tlb_size(env, addr);
> -}
>
>  return TRANSLATE_SUCCESS;
>  }
> @@ -905,7 +899,7 @@ restart:
>  }
>
>  int pmp_prot;
> -int pmp_ret = get_physical_address_pmp(env, &pmp_prot, NULL, 
> pte_addr,
> +int pmp_ret = get_physical_address_pmp(env, &pmp_prot, pte_addr,
> sizeof(target_ulong),
> MMU_DATA_LOAD, PRV_S);
>  if (pmp_ret != TRANSLATE_SUCCESS) {
> @@ -1300,8 +1294,9 @@ bool riscv_cpu_tlb_fill(CPUState *cs, vaddr address, 
> int size,
>  prot &= prot2;
>
>  if (ret == TRANSLATE_SUCCESS) {
> -ret = get_physical_address_pmp(env, &prot_pmp, &tlb_size, pa,
> +ret = get_physical_address_pmp(env, &prot_pmp, pa,
> size, access_type, mode);
> +tlb_size = pmp_get_tlb_size(env, pa);
>
>  qemu_log_mask(CPU_LOG_MMU,
>"%s PMP address=" HWADDR_FMT_plx " ret %d prot"
> @@ -1333,8 +1328,9 @@ bool riscv_cpu_tlb_fill(CPUState *cs, vaddr address, 
> int size,
>__func__, address, ret, pa, prot);
>
>  if (ret == TRANSLATE_SUCCESS) {
> -ret = get_physical_address_pmp(env, &prot_pmp, &tlb_size, pa,
> +ret = get_physical_address_pmp(env, &prot_pmp, pa,
> size, access_type, mode);
> +tlb_size = pmp_get_tlb_size(env, pa);
>
>  qemu_log_mask(CPU_LOG_MMU,
>"%s PMP address=" HWADDR_FMT_plx " ret %d prot"
> --
> 2.25.1
>
>



Re: [PATCH v5 01/13] target/riscv: Update pmp_get_tlb_size()

2023-05-16 Thread Alistair Francis
On Sat, Apr 29, 2023 at 12:37 AM Weiwei Li  wrote:
>
> PMP entries before the matched PMP entry (including the matched PMP entry)
> may only partially cover the TLB page, which may make different regions in
> that page allow different RWX privs. For example, with PMP0 (0x8008~0x800F,
> R) and PMP1 (0x8000~0x8FFF, RWX), a write access to 0x8000 will
> match PMP1. However we cannot cache the translation result in the TLB, since
> this would make a write access to 0x8008 bypass the check of PMP0. So we
> should check all of them instead of only the matched PMP entry in
> pmp_get_tlb_size()
> and set the tlb_size to 1 in this case.
> Set tlb_size to TARGET_PAGE_SIZE if PMP is not supported or there are no PMP
> rules.
>
> Signed-off-by: Weiwei Li 
> Signed-off-by: Junqiang Wang 
> Reviewed-by: LIU Zhiwei 
> ---
>  target/riscv/cpu_helper.c |  7 ++---
>  target/riscv/pmp.c| 64 ++-
>  target/riscv/pmp.h|  3 +-
>  3 files changed, 52 insertions(+), 22 deletions(-)
>
> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> index 433ea529b0..075fc0538a 100644
> --- a/target/riscv/cpu_helper.c
> +++ b/target/riscv/cpu_helper.c
> @@ -703,11 +703,8 @@ static int get_physical_address_pmp(CPURISCVState *env, 
> int *prot,
>  }
>
>  *prot = pmp_priv_to_page_prot(pmp_priv);
> -if ((tlb_size != NULL) && pmp_index != MAX_RISCV_PMPS) {
> -target_ulong tlb_sa = addr & ~(TARGET_PAGE_SIZE - 1);
> -target_ulong tlb_ea = tlb_sa + TARGET_PAGE_SIZE - 1;
> -
> -*tlb_size = pmp_get_tlb_size(env, pmp_index, tlb_sa, tlb_ea);
> +if (tlb_size != NULL) {
> +*tlb_size = pmp_get_tlb_size(env, addr);
>  }
>
>  return TRANSLATE_SUCCESS;
> diff --git a/target/riscv/pmp.c b/target/riscv/pmp.c
> index 1f5aca42e8..ad20a319c1 100644
> --- a/target/riscv/pmp.c
> +++ b/target/riscv/pmp.c
> @@ -601,28 +601,62 @@ target_ulong mseccfg_csr_read(CPURISCVState *env)
>  }
>
>  /*
> - * Calculate the TLB size if the start address or the end address of
> - * PMP entry is presented in the TLB page.
> + * Calculate the TLB size. If the PMP rules may make different regions in
> + * the TLB page of 'addr' allow different RWX privs, set the size to 1
> + * (to make the translation result uncached in the TLB and only be used for
> + * a single translation). Set the size to TARGET_PAGE_SIZE otherwise.

I think this could be clearer, something like:

Calculate the TLB size.
If the matching PMP rule only matches a subset of the TLB, it's
possible that earlier higher priority PMP regions will match other
parts of the TLB.
For example if PMP0 is (0x8008~0x800F, R) and PMP1 is
(0x8000~0x8FFF, RWX) a write access to 0x8000 will match
PMP1. However we cannot cache the translation result in the TLB since
this will make the write access to 0x8008 bypass the check of
PMP0.
To avoid this we return a size of 1 (which means no caching) if the
PMP region does not cover the entire TLB.

>   */
> -target_ulong pmp_get_tlb_size(CPURISCVState *env, int pmp_index,
> -  target_ulong tlb_sa, target_ulong tlb_ea)
> +target_ulong pmp_get_tlb_size(CPURISCVState *env, target_ulong addr)
>  {
> -target_ulong pmp_sa = env->pmp_state.addr[pmp_index].sa;
> -target_ulong pmp_ea = env->pmp_state.addr[pmp_index].ea;
> +target_ulong pmp_sa;
> +target_ulong pmp_ea;
> +target_ulong tlb_sa = addr & ~(TARGET_PAGE_SIZE - 1);
> +target_ulong tlb_ea = tlb_sa + TARGET_PAGE_SIZE - 1;
> +int i;
>
> -if (pmp_sa <= tlb_sa && pmp_ea >= tlb_ea) {
> +/*
> + * If PMP is not supported or there is no PMP rule, which means the 
> allowed
> + * RWX privs of the page will not affected by PMP or PMP will provide the
> + * same option (disallow accesses or allow default RWX privs) for all
> + * addresses, set the size to TARGET_PAGE_SIZE.

Same here:

If PMP is not supported or there are no PMP rules, the permissions of
the page will not be affected by PMP, so we set the size to
TARGET_PAGE_SIZE.

Otherwise:

Reviewed-by: Alistair Francis 

Alistair

> + */
> +if (!riscv_cpu_cfg(env)->pmp || !pmp_get_num_rules(env)) {
>  return TARGET_PAGE_SIZE;
> -} else {
> +}
> +
> +for (i = 0; i < MAX_RISCV_PMPS; i++) {
> +if (pmp_get_a_field(env->pmp_state.pmp[i].cfg_reg) == 
> PMP_AMATCH_OFF) {
> +continue;
> +}
> +
> +pmp_sa = env->pmp_state.addr[i].sa;
> +pmp_ea = env->pmp_state.addr[i].ea;
> +
>  /*
> - * At this point we have a tlb_size that is the smallest possible 
> size
> - * That fits within a TARGET_PAGE_SIZE and the PMP region.
> - *
> - * If the size is less then TARGET_PAGE_SIZE we drop the size to 1.
> - * This means the result isn't cached in the TLB and is only used for
> - * a single translation.
> + * Only the first PMP entry that covers (whole or 

Re: [PATCH 11/11] tcg/riscv: Support CTZ, CLZ from Zbb

2023-05-16 Thread Alistair Francis
On Wed, May 3, 2023 at 6:57 PM Richard Henderson
 wrote:
>
> Signed-off-by: Richard Henderson 

Acked-by: Alistair Francis 

Alistair

> ---
>  tcg/riscv/tcg-target-con-set.h |  1 +
>  tcg/riscv/tcg-target.h |  8 
>  tcg/riscv/tcg-target.c.inc | 35 ++
>  3 files changed, 40 insertions(+), 4 deletions(-)
>
> diff --git a/tcg/riscv/tcg-target-con-set.h b/tcg/riscv/tcg-target-con-set.h
> index a5cadd303f..aac5ceee2b 100644
> --- a/tcg/riscv/tcg-target-con-set.h
> +++ b/tcg/riscv/tcg-target-con-set.h
> @@ -18,5 +18,6 @@ C_O1_I2(r, r, rI)
>  C_O1_I2(r, r, rJ)
>  C_O1_I2(r, rZ, rN)
>  C_O1_I2(r, rZ, rZ)
> +C_N1_I2(r, r, rM)
>  C_O1_I4(r, r, rI, rM, rM)
>  C_O2_I4(r, r, rZ, rZ, rM, rM)
> diff --git a/tcg/riscv/tcg-target.h b/tcg/riscv/tcg-target.h
> index e9e84be9a5..cff5de5c9e 100644
> --- a/tcg/riscv/tcg-target.h
> +++ b/tcg/riscv/tcg-target.h
> @@ -125,8 +125,8 @@ extern bool have_zbb;
>  #define TCG_TARGET_HAS_eqv_i32  have_zbb
>  #define TCG_TARGET_HAS_nand_i32 0
>  #define TCG_TARGET_HAS_nor_i32  0
> -#define TCG_TARGET_HAS_clz_i32  0
> -#define TCG_TARGET_HAS_ctz_i32  0
> +#define TCG_TARGET_HAS_clz_i32  1
> +#define TCG_TARGET_HAS_ctz_i32  1
>  #define TCG_TARGET_HAS_ctpop_i32have_zbb
>  #define TCG_TARGET_HAS_brcond2  1
>  #define TCG_TARGET_HAS_setcond2 1
> @@ -159,8 +159,8 @@ extern bool have_zbb;
>  #define TCG_TARGET_HAS_eqv_i64  have_zbb
>  #define TCG_TARGET_HAS_nand_i64 0
>  #define TCG_TARGET_HAS_nor_i64  0
> -#define TCG_TARGET_HAS_clz_i64  0
> -#define TCG_TARGET_HAS_ctz_i64  0
> +#define TCG_TARGET_HAS_clz_i64  1
> +#define TCG_TARGET_HAS_ctz_i64  1
>  #define TCG_TARGET_HAS_ctpop_i64have_zbb
>  #define TCG_TARGET_HAS_add2_i64 1
>  #define TCG_TARGET_HAS_sub2_i64 1
> diff --git a/tcg/riscv/tcg-target.c.inc b/tcg/riscv/tcg-target.c.inc
> index 1c57b64182..a1c92b0603 100644
> --- a/tcg/riscv/tcg-target.c.inc
> +++ b/tcg/riscv/tcg-target.c.inc
> @@ -1063,6 +1063,22 @@ static void tcg_out_movcond(TCGContext *s, TCGCond 
> cond, TCGReg ret,
>  }
>  }
>
> +static void tcg_out_cltz(TCGContext *s, TCGType type, RISCVInsn insn,
> + TCGReg ret, TCGReg src1, int src2, bool c_src2)
> +{
> +tcg_out_opc_imm(s, insn, ret, src1, 0);
> +
> +if (!c_src2 || src2 != (type == TCG_TYPE_I32 ? 32 : 64)) {
> +/*
> + * The requested zero result does not match the insn, so adjust.
> + * Note that constraints put 'ret' in a new register, so the
> + * computation above did not clobber either 'src1' or 'src2'.
> + */
> +tcg_out_movcond(s, TCG_COND_EQ, ret, src1, 0, true,
> +src2, c_src2, ret, false);
> +}
> +}
> +
>  static void tcg_out_call_int(TCGContext *s, const tcg_insn_unit *arg, bool 
> tail)
>  {
>  TCGReg link = tail ? TCG_REG_ZERO : TCG_REG_RA;
> @@ -1724,6 +1740,19 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
>  tcg_out_opc_imm(s, OPC_CPOP, a0, a1, 0);
>  break;
>
> +case INDEX_op_clz_i32:
> +tcg_out_cltz(s, TCG_TYPE_I32, OPC_CLZW, a0, a1, a2, c2);
> +break;
> +case INDEX_op_clz_i64:
> +tcg_out_cltz(s, TCG_TYPE_I64, OPC_CLZ, a0, a1, a2, c2);
> +break;
> +case INDEX_op_ctz_i32:
> +tcg_out_cltz(s, TCG_TYPE_I32, OPC_CTZW, a0, a1, a2, c2);
> +break;
> +case INDEX_op_ctz_i64:
> +tcg_out_cltz(s, TCG_TYPE_I64, OPC_CTZ, a0, a1, a2, c2);
> +break;
> +
>  case INDEX_op_add2_i32:
>  tcg_out_addsub2(s, a0, a1, a2, args[3], args[4], args[5],
>  const_args[4], const_args[5], false, true);
> @@ -1917,6 +1946,12 @@ static TCGConstraintSetIndex 
> tcg_target_op_def(TCGOpcode op)
>  case INDEX_op_rotr_i64:
>  return C_O1_I2(r, r, ri);
>
> +case INDEX_op_clz_i32:
> +case INDEX_op_clz_i64:
> +case INDEX_op_ctz_i32:
> +case INDEX_op_ctz_i64:
> +return C_N1_I2(r, r, rM);
> +
>  case INDEX_op_brcond_i32:
>  case INDEX_op_brcond_i64:
>  return C_O0_I2(rZ, rZ);
> --
> 2.34.1
>
>
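
Background for the fixup in tcg_out_cltz() (a behavioural note; see the Zbb
spec for the authoritative definition):

```c
/* Zbb clz/ctz return the operand width for a zero input:
 *     clz(0) == 64, clzw(0) == 32, and likewise for ctz/ctzw.
 * TCG's clz/ctz ops instead carry an explicit value (src2) to produce for
 * a zero input.  When src2 is the constant 32/64 the raw instruction
 * result is already what TCG expects; otherwise the result is patched as
 *     ret = (src1 == 0) ? src2 : ret;
 * which is exactly what the tcg_out_movcond() call above emits.
 */
```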



[PULL v2 00/74] tcg patch queue

2023-05-16 Thread Richard Henderson
v2: Drop a few patches, which showed regressions in CI
for jobs that are not run for forks.  :-/


r~


The following changes since commit f9d58e0ca53b3f470b84725a7b5e47fcf446a2ea:

  Merge tag 'pull-9p-20230516' of https://github.com/cschoenebeck/qemu into 
staging (2023-05-16 10:21:44 -0700)

are available in the Git repository at:

  https://gitlab.com/rth7680/qemu.git tags/pull-tcg-20230516-2

for you to fetch changes up to 44fe8f47fce3bdc8dcf49e3f001519a375ecc88a:

  tcg: Split out exec/user/guest-base.h (2023-05-16 16:31:05 -0700)


tcg/i386: Fix tcg_out_addi_ptr for win64
tcg: Implement atomicity for TCGv_i128
tcg: First quarter of cleanups for building tcg once


Richard Henderson (74):
  tcg/i386: Set P_REXW in tcg_out_addi_ptr
  include/exec/memop: Add MO_ATOM_*
  accel/tcg: Honor atomicity of loads
  accel/tcg: Honor atomicity of stores
  tcg: Unify helper_{be,le}_{ld,st}*
  accel/tcg: Implement helper_{ld,st}*_mmu for user-only
  tcg/tci: Use helper_{ld,st}*_mmu for user-only
  tcg: Add 128-bit guest memory primitives
  meson: Detect atomic128 support with optimization
  tcg/i386: Add have_atomic16
  tcg/aarch64: Detect have_lse, have_lse2 for linux
  tcg/aarch64: Detect have_lse, have_lse2 for darwin
  tcg/i386: Use full load/store helpers in user-only mode
  tcg/aarch64: Use full load/store helpers in user-only mode
  tcg/ppc: Use full load/store helpers in user-only mode
  tcg/loongarch64: Use full load/store helpers in user-only mode
  tcg/riscv: Use full load/store helpers in user-only mode
  tcg/arm: Adjust constraints on qemu_ld/st
  tcg/arm: Use full load/store helpers in user-only mode
  tcg/mips: Use full load/store helpers in user-only mode
  tcg/s390x: Use full load/store helpers in user-only mode
  tcg/sparc64: Allocate %g2 as a third temporary
  tcg/sparc64: Rename tcg_out_movi_imm13 to tcg_out_movi_s13
  target/sparc64: Remove tcg_out_movi_s13 case from tcg_out_movi_imm32
  tcg/sparc64: Rename tcg_out_movi_imm32 to tcg_out_movi_u32
  tcg/sparc64: Split out tcg_out_movi_s32
  tcg/sparc64: Use standard slow path for softmmu
  accel/tcg: Remove helper_unaligned_{ld,st}
  tcg/loongarch64: Check the host supports unaligned accesses
  tcg/loongarch64: Support softmmu unaligned accesses
  tcg/riscv: Support softmmu unaligned accesses
  tcg: Introduce tcg_target_has_memory_bswap
  tcg: Add INDEX_op_qemu_{ld,st}_i128
  tcg: Introduce tcg_out_movext3
  tcg: Merge tcg_out_helper_load_regs into caller
  tcg: Support TCG_TYPE_I128 in tcg_out_{ld,st}_helper_{args,ret}
  tcg: Introduce atom_and_align_for_opc
  tcg/i386: Use atom_and_align_for_opc
  tcg/aarch64: Use atom_and_align_for_opc
  tcg/arm: Use atom_and_align_for_opc
  tcg/loongarch64: Use atom_and_align_for_opc
  tcg/mips: Use atom_and_align_for_opc
  tcg/ppc: Use atom_and_align_for_opc
  tcg/riscv: Use atom_and_align_for_opc
  tcg/s390x: Use atom_and_align_for_opc
  tcg/sparc64: Use atom_and_align_for_opc
  tcg: Split out memory ops to tcg-op-ldst.c
  tcg: Widen gen_insn_data to uint64_t
  accel/tcg: Widen tcg-ldst.h addresses to uint64_t
  tcg: Widen helper_{ld,st}_i128 addresses to uint64_t
  tcg: Widen helper_atomic_* addresses to uint64_t
  tcg: Widen tcg_gen_code pc_start argument to uint64_t
  accel/tcg: Merge gen_mem_wrapped with plugin_gen_empty_mem_callback
  accel/tcg: Merge do_gen_mem_cb into caller
  tcg: Reduce copies for plugin_gen_mem_callbacks
  accel/tcg: Widen plugin_gen_empty_mem_callback to i64
  tcg: Add addr_type to TCGContext
  tcg: Remove TCGv from tcg_gen_qemu_{ld,st}_*
  tcg: Remove TCGv from tcg_gen_atomic_*
  tcg: Split INDEX_op_qemu_{ld,st}* for guest address size
  tcg/tci: Elimnate TARGET_LONG_BITS, target_ulong
  tcg/i386: Always enable TCG_TARGET_HAS_extr[lh]_i64_i32
  tcg/i386: Conditionalize tcg_out_extu_i32_i64
  tcg/i386: Adjust type of tlb_mask
  tcg/i386: Remove TARGET_LONG_BITS, TCG_TYPE_TL
  tcg/arm: Remove TARGET_LONG_BITS
  tcg/aarch64: Remove USE_GUEST_BASE
  tcg/aarch64: Remove TARGET_LONG_BITS, TCG_TYPE_TL
  tcg/loongarch64: Remove TARGET_LONG_BITS, TCG_TYPE_TL
  tcg/mips: Remove TARGET_LONG_BITS, TCG_TYPE_TL
  tcg: Remove TARGET_LONG_BITS, TCG_TYPE_TL
  tcg: Add page_bits and page_mask to TCGContext
  tcg: Add tlb_dyn_max_bits to TCGContext
  tcg: Split out exec/user/guest-base.h

 docs/devel/loads-stores.rst  |   36 +-
 docs/devel/tcg-ops.rst   |   11 +-
 meson.build  |   52 +-
 accel/tcg/tcg-runtime.h  |   49 +-
 include/exec/cpu-all.h   |5 +-
 include/exec/memop.h |   37 ++
 include/exec/plugin-gen.h

Re: [PATCH 10/11] tcg/riscv: Implement movcond

2023-05-16 Thread Alistair Francis
On Wed, May 3, 2023 at 6:59 PM Richard Henderson
 wrote:
>
> Implement with and without Zicond.  Without Zicond, we were letting
> the middle-end expand to a 5 insn sequence; better to use a branch
> over a single insn.
>
> Signed-off-by: Richard Henderson 

Acked-by: Alistair Francis 

Alistair

> ---
>  tcg/riscv/tcg-target-con-set.h |   1 +
>  tcg/riscv/tcg-target.h |   4 +-
>  tcg/riscv/tcg-target.c.inc | 139 -
>  3 files changed, 141 insertions(+), 3 deletions(-)
>
> diff --git a/tcg/riscv/tcg-target-con-set.h b/tcg/riscv/tcg-target-con-set.h
> index 1a33ece98f..a5cadd303f 100644
> --- a/tcg/riscv/tcg-target-con-set.h
> +++ b/tcg/riscv/tcg-target-con-set.h
> @@ -18,4 +18,5 @@ C_O1_I2(r, r, rI)
>  C_O1_I2(r, r, rJ)
>  C_O1_I2(r, rZ, rN)
>  C_O1_I2(r, rZ, rZ)
> +C_O1_I4(r, r, rI, rM, rM)
>  C_O2_I4(r, r, rZ, rZ, rM, rM)
> diff --git a/tcg/riscv/tcg-target.h b/tcg/riscv/tcg-target.h
> index e0b23006c4..e9e84be9a5 100644
> --- a/tcg/riscv/tcg-target.h
> +++ b/tcg/riscv/tcg-target.h
> @@ -97,7 +97,7 @@ extern bool have_zbb;
>  #endif
>
>  /* optional instructions */
> -#define TCG_TARGET_HAS_movcond_i32  0
> +#define TCG_TARGET_HAS_movcond_i32  1
>  #define TCG_TARGET_HAS_div_i32  1
>  #define TCG_TARGET_HAS_rem_i32  1
>  #define TCG_TARGET_HAS_div2_i32 0
> @@ -132,7 +132,7 @@ extern bool have_zbb;
>  #define TCG_TARGET_HAS_setcond2 1
>  #define TCG_TARGET_HAS_qemu_st8_i32 0
>
> -#define TCG_TARGET_HAS_movcond_i64  0
> +#define TCG_TARGET_HAS_movcond_i64  1
>  #define TCG_TARGET_HAS_div_i64  1
>  #define TCG_TARGET_HAS_rem_i64  1
>  #define TCG_TARGET_HAS_div2_i64 0
> diff --git a/tcg/riscv/tcg-target.c.inc b/tcg/riscv/tcg-target.c.inc
> index 84b646105c..1c57b64182 100644
> --- a/tcg/riscv/tcg-target.c.inc
> +++ b/tcg/riscv/tcg-target.c.inc
> @@ -169,7 +169,7 @@ static bool tcg_target_const_match(int64_t val, TCGType 
> type, int ct)
>  }
>  /*
>   * Sign extended from 12 bits, +/- matching: [-0x7ff, 0x7ff].
> - * Used by addsub2, which may need the negative operation,
> + * Used by addsub2 and movcond, which may need the negative value,
>   * and requires the modified constant to be representable.
>   */
>  if ((ct & TCG_CT_CONST_M12) && val >= -0x7ff && val <= 0x7ff) {
> @@ -936,6 +936,133 @@ static void tcg_out_setcond(TCGContext *s, TCGCond 
> cond, TCGReg ret,
>  }
>  }
>
> +static void tcg_out_movcond_zicond(TCGContext *s, TCGReg ret, TCGReg test_ne,
> +   int val1, bool c_val1,
> +   int val2, bool c_val2)
> +{
> +if (val1 == 0) {
> +if (c_val2) {
> +tcg_out_movi(s, TCG_TYPE_REG, TCG_REG_TMP1, val2);
> +val2 = TCG_REG_TMP1;
> +}
> +tcg_out_opc_reg(s, OPC_CZERO_NEZ, ret, val2, test_ne);
> +return;
> +}
> +
> +if (val2 == 0) {
> +if (c_val1) {
> +tcg_out_movi(s, TCG_TYPE_REG, TCG_REG_TMP1, val1);
> +val1 = TCG_REG_TMP1;
> +}
> +tcg_out_opc_reg(s, OPC_CZERO_EQZ, ret, val1, test_ne);
> +return;
> +}
> +
> +if (c_val2) {
> +if (c_val1) {
> +tcg_out_movi(s, TCG_TYPE_REG, TCG_REG_TMP1, val1 - val2);
> +} else {
> +tcg_out_opc_imm(s, OPC_ADDI, TCG_REG_TMP1, val1, -val2);
> +}
> +tcg_out_opc_reg(s, OPC_CZERO_EQZ, ret, TCG_REG_TMP1, test_ne);
> +tcg_out_opc_imm(s, OPC_ADDI, ret, ret, val2);
> +return;
> +}
> +
> +if (c_val1) {
> +tcg_out_opc_imm(s, OPC_ADDI, TCG_REG_TMP1, val2, -val1);
> +tcg_out_opc_reg(s, OPC_CZERO_NEZ, ret, TCG_REG_TMP1, test_ne);
> +tcg_out_opc_imm(s, OPC_ADDI, ret, ret, val1);
> +return;
> +}
> +
> +tcg_out_opc_reg(s, OPC_CZERO_NEZ, TCG_REG_TMP1, val2, test_ne);
> +tcg_out_opc_reg(s, OPC_CZERO_EQZ, TCG_REG_TMP0, val1, test_ne);
> +tcg_out_opc_reg(s, OPC_OR, ret, TCG_REG_TMP0, TCG_REG_TMP1);
> +}
> +
> +static void tcg_out_movcond_br1(TCGContext *s, TCGCond cond, TCGReg ret,
> +TCGReg cmp1, TCGReg cmp2,
> +int val, bool c_val)
> +{
> +RISCVInsn op;
> +int disp = 8;
> +
> +tcg_debug_assert((unsigned)cond < ARRAY_SIZE(tcg_brcond_to_riscv));
> +op = tcg_brcond_to_riscv[cond].op;
> +tcg_debug_assert(op != 0);
> +
> +if (tcg_brcond_to_riscv[cond].swap) {
> +tcg_out_opc_branch(s, op, cmp2, cmp1, disp);
> +} else {
> +tcg_out_opc_branch(s, op, cmp1, cmp2, disp);
> +}
> +if (c_val) {
> +tcg_out_opc_imm(s, OPC_ADDI, ret, TCG_REG_ZERO, val);
> +} else {
> +tcg_out_opc_imm(s, OPC_ADDI, ret, val, 0);
> +}
> +}
> +
> +static void tcg_out_movcond_br2(TCGContext *s, TCGCond cond, TCGReg ret,
> +TCGReg cmp1, TCGReg cmp2,
> +int 
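
A small C model of the three-instruction fallback at the end of
tcg_out_movcond_zicond() (semantics per my reading of the Zicond spec; names
are illustrative):

```c
#include <stdint.h>

static uint64_t czero_eqz(uint64_t rs, uint64_t rc) { return rc == 0 ? 0 : rs; }
static uint64_t czero_nez(uint64_t rs, uint64_t rc) { return rc != 0 ? 0 : rs; }

/* Branch-free select: returns test_ne ? val1 : val2. */
static uint64_t movcond_model(uint64_t test_ne, uint64_t val1, uint64_t val2)
{
    uint64_t t0 = czero_eqz(val1, test_ne);   /* val1 if cond true, else 0 */
    uint64_t t1 = czero_nez(val2, test_ne);   /* val2 if cond false, else 0 */

    return t0 | t1;
}
```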

Re: [PATCH 09/11] tcg/riscv: Improve setcond expansion

2023-05-16 Thread Alistair Francis
On Wed, May 3, 2023 at 6:59 PM Richard Henderson
 wrote:
>
> Split out a helper function, tcg_out_setcond_int, which does not
> always produce the complete boolean result, but returns a set of
> flags to do so.
>
> Based on 21af16198425, the same improvement for loongarch64.
>
> Signed-off-by: Richard Henderson 

Acked-by: Alistair Francis 

Alistair

> ---
>  tcg/riscv/tcg-target.c.inc | 164 +++--
>  1 file changed, 121 insertions(+), 43 deletions(-)
>
> diff --git a/tcg/riscv/tcg-target.c.inc b/tcg/riscv/tcg-target.c.inc
> index 044ddfb160..84b646105c 100644
> --- a/tcg/riscv/tcg-target.c.inc
> +++ b/tcg/riscv/tcg-target.c.inc
> @@ -812,50 +812,128 @@ static void tcg_out_brcond(TCGContext *s, TCGCond 
> cond, TCGReg arg1,
>  tcg_out_opc_branch(s, op, arg1, arg2, 0);
>  }
>
> -static void tcg_out_setcond(TCGContext *s, TCGCond cond, TCGReg ret,
> -TCGReg arg1, TCGReg arg2)
> +#define SETCOND_INVTCG_TARGET_NB_REGS
> +#define SETCOND_NEZ(SETCOND_INV << 1)
> +#define SETCOND_FLAGS  (SETCOND_INV | SETCOND_NEZ)
> +
> +static int tcg_out_setcond_int(TCGContext *s, TCGCond cond, TCGReg ret,
> +   TCGReg arg1, tcg_target_long arg2, bool c2)
>  {
> +int flags = 0;
> +
>  switch (cond) {
> -case TCG_COND_EQ:
> -tcg_out_opc_reg(s, OPC_SUB, ret, arg1, arg2);
> -tcg_out_opc_imm(s, OPC_SLTIU, ret, ret, 1);
> -break;
> -case TCG_COND_NE:
> -tcg_out_opc_reg(s, OPC_SUB, ret, arg1, arg2);
> -tcg_out_opc_reg(s, OPC_SLTU, ret, TCG_REG_ZERO, ret);
> -break;
> -case TCG_COND_LT:
> -tcg_out_opc_reg(s, OPC_SLT, ret, arg1, arg2);
> -break;
> -case TCG_COND_GE:
> -tcg_out_opc_reg(s, OPC_SLT, ret, arg1, arg2);
> -tcg_out_opc_imm(s, OPC_XORI, ret, ret, 1);
> -break;
> -case TCG_COND_LE:
> -tcg_out_opc_reg(s, OPC_SLT, ret, arg2, arg1);
> -tcg_out_opc_imm(s, OPC_XORI, ret, ret, 1);
> -break;
> -case TCG_COND_GT:
> -tcg_out_opc_reg(s, OPC_SLT, ret, arg2, arg1);
> -break;
> -case TCG_COND_LTU:
> -tcg_out_opc_reg(s, OPC_SLTU, ret, arg1, arg2);
> -break;
> -case TCG_COND_GEU:
> -tcg_out_opc_reg(s, OPC_SLTU, ret, arg1, arg2);
> -tcg_out_opc_imm(s, OPC_XORI, ret, ret, 1);
> -break;
> -case TCG_COND_LEU:
> -tcg_out_opc_reg(s, OPC_SLTU, ret, arg2, arg1);
> -tcg_out_opc_imm(s, OPC_XORI, ret, ret, 1);
> -break;
> -case TCG_COND_GTU:
> -tcg_out_opc_reg(s, OPC_SLTU, ret, arg2, arg1);
> +case TCG_COND_EQ:/* -> NE  */
> +case TCG_COND_GE:/* -> LT  */
> +case TCG_COND_GEU:   /* -> LTU */
> +case TCG_COND_GT:/* -> LE  */
> +case TCG_COND_GTU:   /* -> LEU */
> +cond = tcg_invert_cond(cond);
> +flags ^= SETCOND_INV;
>  break;
>  default:
> - g_assert_not_reached();
> - break;
> - }
> +break;
> +}
> +
> +switch (cond) {
> +case TCG_COND_LE:
> +case TCG_COND_LEU:
> +/*
> + * If we have a constant input, the most efficient way to implement
> + * LE is by adding 1 and using LT.  Watch out for wrap around for 
> LEU.
> + * We don't need to care for this for LE because the constant input
> + * is constrained to signed 12-bit, and 0x800 is representable in the
> + * temporary register.
> + */
> +if (c2) {
> +if (cond == TCG_COND_LEU) {
> +/* unsigned <= -1 is true */
> +if (arg2 == -1) {
> +tcg_out_movi(s, TCG_TYPE_REG, ret, !(flags & 
> SETCOND_INV));
> +return ret;
> +}
> +cond = TCG_COND_LTU;
> +} else {
> +cond = TCG_COND_LT;
> +}
> +tcg_debug_assert(arg2 <= 0x7ff);
> +if (++arg2 == 0x800) {
> +tcg_out_movi(s, TCG_TYPE_REG, TCG_REG_TMP0, arg2);
> +arg2 = TCG_REG_TMP0;
> +c2 = false;
> +}
> +} else {
> +TCGReg tmp = arg2;
> +arg2 = arg1;
> +arg1 = tmp;
> +cond = tcg_swap_cond(cond);/* LE -> GE */
> +cond = tcg_invert_cond(cond);  /* GE -> LT */
> +flags ^= SETCOND_INV;
> +}
> +break;
> +default:
> +break;
> +}
> +
> +switch (cond) {
> +case TCG_COND_NE:
> +flags |= SETCOND_NEZ;
> +if (!c2) {
> +tcg_out_opc_reg(s, OPC_XOR, ret, arg1, arg2);
> +} else if (arg2 == 0) {
> +ret = arg1;
> +} else {
> +tcg_out_opc_reg(s, OPC_XORI, ret, arg1, arg2);
> +}
> +break;
> +
> +case TCG_COND_LT:
> +if (c2) {
> +tcg_out_opc_imm(s, OPC_SLTI, ret, arg1, arg2);
> +} else {
> +
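
For context, the flags returned by tcg_out_setcond_int() are intended to be
consumed roughly like this (a sketch mirroring the loongarch64 change the
commit message cites; the exact emitted code may differ):

```c
/* tmpflags = tcg_out_setcond_int(s, cond, ret, arg1, arg2, c2);
 * tmp      = tmpflags & ~SETCOND_FLAGS;   -- register holding the raw result
 *
 * flags == 0:                        tmp already holds the boolean result
 * flags == SETCOND_NEZ:              apply "!= 0", e.g. sltu  ret, zero, tmp
 * flags == SETCOND_NEZ|SETCOND_INV:  apply "== 0", e.g. sltiu ret, tmp, 1
 * flags == SETCOND_INV:              boolean result, invert: xori ret, tmp, 1
 */
```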

Re: [PATCH v3 3/3] migration/doc: We broke backwards compatibility

2023-05-16 Thread Peter Xu
On Mon, May 15, 2023 at 10:32:01AM +0200, Juan Quintela wrote:
> When we detect that we have broken backwards compatibility in a
> released version, we can't do anything for that version.  But once we
> fix that bug in the next released version, we can "mitigate" that
> problem when migrating to new versions, to give that machine a way out
> until it does a hard reboot.
> 
> Signed-off-by: Juan Quintela 
> ---
>  docs/devel/migration.rst | 194 +++
>  1 file changed, 194 insertions(+)
> 
> diff --git a/docs/devel/migration.rst b/docs/devel/migration.rst
> index 95e797ee60..97b6f48474 100644
> --- a/docs/devel/migration.rst
> +++ b/docs/devel/migration.rst
> @@ -451,6 +451,200 @@ binary in both sides of the migration.  If we use 
> different QEMU
>  versions process, then we need to have into account all other
>  differences and the examples become even more complicated.
>  
> +How to mitigate when we have a backward compatibility error
> +---
> +
> +We broke migration for old machine types continously during

continuously

> +development.  But as soon as we find that there is a problem, we fix
> +it.  The problem is what happens when we detect after we have done a
> +release that something has gone wrong.
> +
> +Let's see how it worked with one example.
> +
> +After the release of qemu-8.0 we found a problem when doing migration
> +of the machine type pc-7.2.
> +
> +- $ qemu-7.2 -M pc-7.2  ->  qemu-7.2 -M pc-7.2
> +
> +  This migration works
> +
> +- $ qemu-8.0 -M pc-7.2  ->  qemu-8.0 -M pc-7.2
> +
> +  This migration works
> +
> +- $ qemu-8.0 -M pc-7.2  ->  qemu-7.2 -M pc-7.2
> +
> +  This migration fails
> +
> +- $ qemu-7.2 -M pc-7.2  ->  qemu-8.0 -M pc-7.2
> +
> +  This migration fails
> +
> +So clearly something fails when migrating between qemu-7.2 and
> +qemu-8.0 with machine type pc-7.2.  The error messages and git bisect
> +pointed to this commit.
> +
> +In qemu-8.0 we got this commit: ::
> +
> +commit 9a6ef182c03eaa138bae553f0fbb5a123bef9a53
> +Author: Jonathan Cameron 
> +Date:   Thu Mar 2 13:37:03 2023 +
> +
> +hw/pci/aer: Add missing routing for AER errors

Worst timing ever for him.. :(

The lesson is never break migration when the maintainer has any intention
to add some docs explaining backward compatibility.

> +
> +The relevant bits of the commit for our example are these:
> +
> +--- a/hw/pci/pcie_aer.c
> ++++ b/hw/pci/pcie_aer.c
> +@@ -112,6 +112,10 @@ int pcie_aer_init(PCIDevice *dev,
> +
> + pci_set_long(dev->w1cmask + offset + PCI_ERR_UNCOR_STATUS,
> +  PCI_ERR_UNC_SUPPORTED);
> ++pci_set_long(dev->config + offset + PCI_ERR_UNCOR_MASK,
> ++ PCI_ERR_UNC_MASK_DEFAULT);
> ++pci_set_long(dev->wmask + offset + PCI_ERR_UNCOR_MASK,
> ++ PCI_ERR_UNC_SUPPORTED);
> +
> + pci_set_long(dev->config + offset + PCI_ERR_UNCOR_SEVER,
> + PCI_ERR_UNC_SEVERITY_DEFAULT);
> +
> +The patch changes how we configure pci space for AER.  But qemu fails
> +when the pci space configuration is different between source and
> +destination.
> +
> +The following commit shows how this got fixed:
> +
> +
> +
> +The relevant parts of the fix are as follows:
> +
> +First, we create a new property for the device to be able to configure
> +the old behaviour or the new behaviour. ::
> +
> +diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> +index 8a87ccc8b0..5153ad63d6 100644
> +--- a/hw/pci/pci.c
> ++++ b/hw/pci/pci.c
> +@@ -79,6 +79,8 @@ static Property pci_props[] = {
> + DEFINE_PROP_STRING("failover_pair_id", PCIDevice,
> +failover_pair_id),
> + DEFINE_PROP_UINT32("acpi-index",  PCIDevice, acpi_index, 0),
> ++DEFINE_PROP_BIT("x-pcie-err-unc-mask", PCIDevice, cap_present,
> ++QEMU_PCIE_ERR_UNC_MASK_BITNR, true),
> + DEFINE_PROP_END_OF_LIST()
> + };
> +
> +Notice that we enable te feature for new machine types.

the

> +
> +Now we see how the fix is done.  This is going to depend on what kind
> +of breakage happens, but in this case it is quite simple. ::
> +
> +diff --git a/hw/pci/pcie_aer.c b/hw/pci/pcie_aer.c
> +index 103667c368..374d593ead 100644
> +--- a/hw/pci/pcie_aer.c
> ++++ b/hw/pci/pcie_aer.c
> +@@ -112,10 +112,13 @@ int pcie_aer_init(PCIDevice *dev, uint8_t cap_ver,
> +uint16_t offset,
> +
> + pci_set_long(dev->w1cmask + offset + PCI_ERR_UNCOR_STATUS,
> +  PCI_ERR_UNC_SUPPORTED);
> +-pci_set_long(dev->config + offset + PCI_ERR_UNCOR_MASK,
> +- PCI_ERR_UNC_MASK_DEFAULT);
> +-pci_set_long(dev->wmask + offset + PCI_ERR_UNCOR_MASK,
> +- PCI_ERR_UNC_SUPPORTED);
> ++
> ++if (dev->cap_present & QEMU_PCIE_ERR_UNC_MASK) {
> ++pci_set_long(dev->config + 
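
The other half of the fix is the compat glue that keeps the old behaviour on
old machine types; the usual pattern (a sketch, the exact entry in the real
fix may differ) is an addition to the existing hw_compat_7_2[] array:

```c
/* hw/core/machine.c (sketch): pc-7.2 and older keep the pre-8.0 behaviour
 * because the property defaults to "off" for those machine types. */
GlobalProperty hw_compat_7_2[] = {
    { "pci-device", "x-pcie-err-unc-mask", "off" },
};
```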

Re: [PATCH 08/11] tcg/riscv: Support CPOP from Zbb

2023-05-16 Thread Alistair Francis
On Wed, May 3, 2023 at 6:58 PM Richard Henderson
 wrote:
>
> Signed-off-by: Richard Henderson 

Acked-by: Alistair Francis 

Alistair

> ---
>  tcg/riscv/tcg-target.h | 4 ++--
>  tcg/riscv/tcg-target.c.inc | 9 +
>  2 files changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/tcg/riscv/tcg-target.h b/tcg/riscv/tcg-target.h
> index 8e327afc3a..e0b23006c4 100644
> --- a/tcg/riscv/tcg-target.h
> +++ b/tcg/riscv/tcg-target.h
> @@ -127,7 +127,7 @@ extern bool have_zbb;
>  #define TCG_TARGET_HAS_nor_i32  0
>  #define TCG_TARGET_HAS_clz_i32  0
>  #define TCG_TARGET_HAS_ctz_i32  0
> -#define TCG_TARGET_HAS_ctpop_i320
> +#define TCG_TARGET_HAS_ctpop_i32have_zbb
>  #define TCG_TARGET_HAS_brcond2  1
>  #define TCG_TARGET_HAS_setcond2 1
>  #define TCG_TARGET_HAS_qemu_st8_i32 0
> @@ -161,7 +161,7 @@ extern bool have_zbb;
>  #define TCG_TARGET_HAS_nor_i64  0
>  #define TCG_TARGET_HAS_clz_i64  0
>  #define TCG_TARGET_HAS_ctz_i64  0
> -#define TCG_TARGET_HAS_ctpop_i640
> +#define TCG_TARGET_HAS_ctpop_i64have_zbb
>  #define TCG_TARGET_HAS_add2_i64 1
>  #define TCG_TARGET_HAS_sub2_i64 1
>  #define TCG_TARGET_HAS_mulu2_i640
> diff --git a/tcg/riscv/tcg-target.c.inc b/tcg/riscv/tcg-target.c.inc
> index 9cbefb2833..044ddfb160 100644
> --- a/tcg/riscv/tcg-target.c.inc
> +++ b/tcg/riscv/tcg-target.c.inc
> @@ -1512,6 +1512,13 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
>  }
>  break;
>
> +case INDEX_op_ctpop_i32:
> +tcg_out_opc_imm(s, OPC_CPOPW, a0, a1, 0);
> +break;
> +case INDEX_op_ctpop_i64:
> +tcg_out_opc_imm(s, OPC_CPOP, a0, a1, 0);
> +break;
> +
>  case INDEX_op_add2_i32:
>  tcg_out_addsub2(s, a0, a1, a2, args[3], args[4], args[5],
>  const_args[4], const_args[5], false, true);
> @@ -1634,6 +1641,8 @@ static TCGConstraintSetIndex 
> tcg_target_op_def(TCGOpcode op)
>  case INDEX_op_bswap16_i64:
>  case INDEX_op_bswap32_i64:
>  case INDEX_op_bswap64_i64:
> +case INDEX_op_ctpop_i32:
> +case INDEX_op_ctpop_i64:
>  return C_O1_I1(r, r);
>
>  case INDEX_op_st8_i32:
> --
> 2.34.1
>
>



Re: [PATCH v3 2/3] migration/docs: How to migrate when hosts have different features

2023-05-16 Thread Peter Xu
On Mon, May 15, 2023 at 10:32:00AM +0200, Juan Quintela wrote:
> Sometimes devices have different features depending of things outside
> of qemu.  For instance the kernel.  Document how to handle that cases.
> 
> Signed-off-by: Juan Quintela 
> 
> ---
> 
> If you have some example to put here, I am all ears.  I guess that
> virtio-* with some features that are in qemu but not in all kernels
> would do the trick, but I am not a virtio guru myself.  Patches
> welcome.
> ---
>  docs/devel/migration.rst | 93 
>  1 file changed, 93 insertions(+)
> 
> diff --git a/docs/devel/migration.rst b/docs/devel/migration.rst
> index b4c4f3ec35..95e797ee60 100644
> --- a/docs/devel/migration.rst
> +++ b/docs/devel/migration.rst
> @@ -357,6 +357,99 @@ machine types to have the right value: ::
>   ...
>   };
>  
> +A device with different features on both sides
> +-
> +
> +Let's assume that we are using the same QEMU binary on both sides,
> +just to make things easier.  But we have a device that has
> +different features on both sides of the migration.  That can be
> +because the devices are different, or because the kernel drivers of the
> +devices have different features, whatever.
> +
> +How can we get this to work with migration?  The way to do that is
> +"theoretically" easy.  You get the features that the device
> +has on the source of the migration and the features that the device has
> +on the target of the migration; you take the intersection of the
> +features of both sides, and that is the way that you should launch
> +qemu.
> +
> +Notice that this is not completely related to qemu.  The most
> +important thing here is that this should be handled by the managing
> +application that launches qemu.  If qemu is configured correctly, the
> +migration will succeed.
> +
> +Once we have defined that, actually doing it is complicated.  Almost all
> +devices are bad at being able to be launched with only some features
> +enabled.  With one big exception: cpus.
> +
> +You can read the documentation for QEMU x86 cpu models here:
> +
> +https://qemu-project.gitlab.io/qemu/system/qemu-cpu-models.html
> +
> +See how, when they talk about migration, they recommend choosing the
> +newest cpu model that is supported by all cpus.
> +
> +Let's say that we have:
> +
> +Host A:
> +
> +Device X has the feature Y
> +
> +Host B:
> +
> +Device X does not have the feature Y
> +
> +If we try to migrate without any care from host A to host B, it will
> +fail, because when migration tries to load the feature Y on the
> +destination, it will find that the hardware is not there.
> +
> +Doing this would be the equivalent of doing with cpus:
> +
> +Host A:
> +
> +$ qemu-system-x86_64 -cpu host
> +
> +Host B:
> +
> +$ qemu-system-x86_64 -cpu host
> +
> +When both hosts have different cpu features this is guaranteed to fail,
> +especially if host B has fewer features than host A.  If host A has
> +fewer features than host B, it sometimes works.  The important word in the
> +last sentence is "sometimes".
> +
> +So, forgetting about cpu models and continuing with the -cpu host
> +example, let's say that the difference between the cpus is that hosts A and
> +B have the following features:
> +
> +Features:   'pcid'  'stibp' 'taa-no'
> +Host A:X   X
> +Host B:X
> +
> +If we want to migrate between them, the way to configure both qemu cpus
> +will be:
> +
> +Host A:
> +
> +$ qemu-system-x86_64 -cpu host,pcid=off,stibp=off
> +
> +Host B:
> +
> +$ qemu-system-x86_64 -cpu host,taa-no=off

Since we're using cpu as example, shall we at least mention at the end that
we don't suggest using -cpu host if migration is needed?

> +
> +And you would be able to migrate between them.  It is the responsibility
> +of the management application or of the user to make sure that the
> +configuration is correct.  QEMU doesn't know how to look at this kind of
> +feature in general.
> +
> +Other devices have worse control over individual features.  If they
> +want to be able to migrate between hosts that show different features,
> +the device needs a way to configure which ones it is going to use.
> +
> +In this section we have considered that we are using the same QEMU
> +binary on both sides of the migration.  If we use different QEMU
> +versions, then we need to take into account all the other
> +differences and the examples become even more complicated.

Mostly good to me.  What I worry is how much help this will bring to
developers - I'd assume developers working on these will be aware of this.
But I guess it's always good to have any documentation than nothing.

Acked-by: Peter Xu 

-- 
Peter Xu




Re: [PATCH 07/11] tcg/riscv: Support REV8 from Zbb

2023-05-16 Thread Alistair Francis
On Wed, May 3, 2023 at 6:59 PM Richard Henderson
 wrote:
>
> Signed-off-by: Richard Henderson 

Acked-by: Alistair Francis 

Alistair

> ---
>  tcg/riscv/tcg-target.h | 10 +-
>  tcg/riscv/tcg-target.c.inc | 29 +
>  2 files changed, 34 insertions(+), 5 deletions(-)
>
> diff --git a/tcg/riscv/tcg-target.h b/tcg/riscv/tcg-target.h
> index 317d385924..8e327afc3a 100644
> --- a/tcg/riscv/tcg-target.h
> +++ b/tcg/riscv/tcg-target.h
> @@ -116,8 +116,8 @@ extern bool have_zbb;
>  #define TCG_TARGET_HAS_ext16s_i32   1
>  #define TCG_TARGET_HAS_ext8u_i321
>  #define TCG_TARGET_HAS_ext16u_i32   1
> -#define TCG_TARGET_HAS_bswap16_i32  0
> -#define TCG_TARGET_HAS_bswap32_i32  0
> +#define TCG_TARGET_HAS_bswap16_i32  have_zbb
> +#define TCG_TARGET_HAS_bswap32_i32  have_zbb
>  #define TCG_TARGET_HAS_not_i32  1
>  #define TCG_TARGET_HAS_neg_i32  1
>  #define TCG_TARGET_HAS_andc_i32 have_zbb
> @@ -149,9 +149,9 @@ extern bool have_zbb;
>  #define TCG_TARGET_HAS_ext8u_i641
>  #define TCG_TARGET_HAS_ext16u_i64   1
>  #define TCG_TARGET_HAS_ext32u_i64   1
> -#define TCG_TARGET_HAS_bswap16_i64  0
> -#define TCG_TARGET_HAS_bswap32_i64  0
> -#define TCG_TARGET_HAS_bswap64_i64  0
> +#define TCG_TARGET_HAS_bswap16_i64  have_zbb
> +#define TCG_TARGET_HAS_bswap32_i64  have_zbb
> +#define TCG_TARGET_HAS_bswap64_i64  have_zbb
>  #define TCG_TARGET_HAS_not_i64  1
>  #define TCG_TARGET_HAS_neg_i64  1
>  #define TCG_TARGET_HAS_andc_i64 have_zbb
> diff --git a/tcg/riscv/tcg-target.c.inc b/tcg/riscv/tcg-target.c.inc
> index 58f969b4fe..9cbefb2833 100644
> --- a/tcg/riscv/tcg-target.c.inc
> +++ b/tcg/riscv/tcg-target.c.inc
> @@ -1488,6 +1488,30 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
>  }
>  break;
>
> +case INDEX_op_bswap64_i64:
> +tcg_out_opc_imm(s, OPC_REV8, a0, a1, 0);
> +break;
> +case INDEX_op_bswap32_i32:
> +a2 = 0;
> +/* fall through */
> +case INDEX_op_bswap32_i64:
> +tcg_out_opc_imm(s, OPC_REV8, a0, a1, 0);
> +if (a2 & TCG_BSWAP_OZ) {
> +tcg_out_opc_imm(s, OPC_SRLI, a0, a0, 32);
> +} else {
> +tcg_out_opc_imm(s, OPC_SRAI, a0, a0, 32);
> +}
> +break;
> +case INDEX_op_bswap16_i64:
> +case INDEX_op_bswap16_i32:
> +tcg_out_opc_imm(s, OPC_REV8, a0, a1, 0);
> +if (a2 & TCG_BSWAP_OZ) {
> +tcg_out_opc_imm(s, OPC_SRLI, a0, a0, 48);
> +} else {
> +tcg_out_opc_imm(s, OPC_SRAI, a0, a0, 48);
> +}
> +break;
> +
>  case INDEX_op_add2_i32:
>  tcg_out_addsub2(s, a0, a1, a2, args[3], args[4], args[5],
>  const_args[4], const_args[5], false, true);
> @@ -1605,6 +1629,11 @@ static TCGConstraintSetIndex 
> tcg_target_op_def(TCGOpcode op)
>  case INDEX_op_extrl_i64_i32:
>  case INDEX_op_extrh_i64_i32:
>  case INDEX_op_ext_i32_i64:
> +case INDEX_op_bswap16_i32:
> +case INDEX_op_bswap32_i32:
> +case INDEX_op_bswap16_i64:
> +case INDEX_op_bswap32_i64:
> +case INDEX_op_bswap64_i64:
>  return C_O1_I1(r, r);
>
>  case INDEX_op_st8_i32:
> --
> 2.34.1
>
>



Re: [PATCH 06/11] tcg/riscv: Support rotates from Zbb

2023-05-16 Thread Alistair Francis
On Wed, May 3, 2023 at 6:57 PM Richard Henderson
 wrote:
>
> Signed-off-by: Richard Henderson 

Acked-by: Alistair Francis 

Alistair

> ---
>  tcg/riscv/tcg-target.h |  4 ++--
>  tcg/riscv/tcg-target.c.inc | 34 ++
>  2 files changed, 36 insertions(+), 2 deletions(-)
>
> diff --git a/tcg/riscv/tcg-target.h b/tcg/riscv/tcg-target.h
> index 9f58d46208..317d385924 100644
> --- a/tcg/riscv/tcg-target.h
> +++ b/tcg/riscv/tcg-target.h
> @@ -101,7 +101,7 @@ extern bool have_zbb;
>  #define TCG_TARGET_HAS_div_i32  1
>  #define TCG_TARGET_HAS_rem_i32  1
>  #define TCG_TARGET_HAS_div2_i32 0
> -#define TCG_TARGET_HAS_rot_i32  0
> +#define TCG_TARGET_HAS_rot_i32  have_zbb
>  #define TCG_TARGET_HAS_deposit_i32  0
>  #define TCG_TARGET_HAS_extract_i32  0
>  #define TCG_TARGET_HAS_sextract_i32 0
> @@ -136,7 +136,7 @@ extern bool have_zbb;
>  #define TCG_TARGET_HAS_div_i64  1
>  #define TCG_TARGET_HAS_rem_i64  1
>  #define TCG_TARGET_HAS_div2_i64 0
> -#define TCG_TARGET_HAS_rot_i64  0
> +#define TCG_TARGET_HAS_rot_i64  have_zbb
>  #define TCG_TARGET_HAS_deposit_i64  0
>  #define TCG_TARGET_HAS_extract_i64  0
>  #define TCG_TARGET_HAS_sextract_i64 0
> diff --git a/tcg/riscv/tcg-target.c.inc b/tcg/riscv/tcg-target.c.inc
> index f64eaa8515..58f969b4fe 100644
> --- a/tcg/riscv/tcg-target.c.inc
> +++ b/tcg/riscv/tcg-target.c.inc
> @@ -1458,6 +1458,36 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
>  }
>  break;
>
> +case INDEX_op_rotl_i32:
> +if (c2) {
> +tcg_out_opc_imm(s, OPC_RORIW, a0, a1, -a2 & 0x1f);
> +} else {
> +tcg_out_opc_reg(s, OPC_ROLW, a0, a1, a2);
> +}
> +break;
> +case INDEX_op_rotl_i64:
> +if (c2) {
> +tcg_out_opc_imm(s, OPC_RORI, a0, a1, -a2 & 0x3f);
> +} else {
> +tcg_out_opc_reg(s, OPC_ROL, a0, a1, a2);
> +}
> +break;
> +
> +case INDEX_op_rotr_i32:
> +if (c2) {
> +tcg_out_opc_imm(s, OPC_RORIW, a0, a1, a2 & 0x1f);
> +} else {
> +tcg_out_opc_reg(s, OPC_RORW, a0, a1, a2);
> +}
> +break;
> +case INDEX_op_rotr_i64:
> +if (c2) {
> +tcg_out_opc_imm(s, OPC_RORI, a0, a1, a2 & 0x3f);
> +} else {
> +tcg_out_opc_reg(s, OPC_ROR, a0, a1, a2);
> +}
> +break;
> +
>  case INDEX_op_add2_i32:
>  tcg_out_addsub2(s, a0, a1, a2, args[3], args[4], args[5],
>  const_args[4], const_args[5], false, true);
> @@ -1629,9 +1659,13 @@ static TCGConstraintSetIndex 
> tcg_target_op_def(TCGOpcode op)
>  case INDEX_op_shl_i32:
>  case INDEX_op_shr_i32:
>  case INDEX_op_sar_i32:
> +case INDEX_op_rotl_i32:
> +case INDEX_op_rotr_i32:
>  case INDEX_op_shl_i64:
>  case INDEX_op_shr_i64:
>  case INDEX_op_sar_i64:
> +case INDEX_op_rotl_i64:
> +case INDEX_op_rotr_i64:
>  return C_O1_I2(r, r, ri);
>
>  case INDEX_op_brcond_i32:
> --
> 2.34.1
>
>



Re: [PATCH 05/11] tcg/riscv: Use ADD.UW for guest address generation

2023-05-16 Thread Alistair Francis
On Wed, May 3, 2023 at 6:58 PM Richard Henderson
 wrote:
>
> The instruction is a combined zero-extend and add.
> Use it for exactly that.
>
> Signed-off-by: Richard Henderson 

Acked-by: Alistair Francis 

Alistair

> ---
>  tcg/riscv/tcg-target.c.inc | 33 ++---
>  1 file changed, 22 insertions(+), 11 deletions(-)
>
> diff --git a/tcg/riscv/tcg-target.c.inc b/tcg/riscv/tcg-target.c.inc
> index 53a7f97b29..f64eaa8515 100644
> --- a/tcg/riscv/tcg-target.c.inc
> +++ b/tcg/riscv/tcg-target.c.inc
> @@ -1039,14 +1039,18 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext 
> *s, TCGReg *pbase,
>  tcg_out_opc_branch(s, OPC_BNE, TCG_REG_TMP0, TCG_REG_TMP1, 0);
>
>  /* TLB Hit - translate address using addend.  */
> -addr_adj = addr_reg;
> -if (TARGET_LONG_BITS == 32) {
> -addr_adj = TCG_REG_TMP0;
> -tcg_out_ext32u(s, addr_adj, addr_reg);
> +if (TARGET_LONG_BITS == 64) {
> +tcg_out_opc_reg(s, OPC_ADD, TCG_REG_TMP0, addr_reg, TCG_REG_TMP2);
> +} else if (have_zba) {
> +tcg_out_opc_reg(s, OPC_ADD_UW, TCG_REG_TMP0, addr_reg, TCG_REG_TMP2);
> +} else {
> +tcg_out_ext32u(s, TCG_REG_TMP0, addr_reg);
> +tcg_out_opc_reg(s, OPC_ADD, TCG_REG_TMP0, TCG_REG_TMP0, 
> TCG_REG_TMP2);
>  }
> -tcg_out_opc_reg(s, OPC_ADD, TCG_REG_TMP0, TCG_REG_TMP2, addr_adj);
>  *pbase = TCG_REG_TMP0;
>  #else
> +TCGReg base;
> +
>  if (a_mask) {
>  ldst = new_ldst_label(s);
>  ldst->is_ld = is_ld;
> @@ -1061,14 +1065,21 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext 
> *s, TCGReg *pbase,
>  tcg_out_opc_branch(s, OPC_BNE, TCG_REG_TMP1, TCG_REG_ZERO, 0);
>  }
>
> -TCGReg base = addr_reg;
> -if (TARGET_LONG_BITS == 32) {
> -tcg_out_ext32u(s, TCG_REG_TMP0, base);
> -base = TCG_REG_TMP0;
> -}
>  if (guest_base != 0) {
> -tcg_out_opc_reg(s, OPC_ADD, TCG_REG_TMP0, TCG_GUEST_BASE_REG, base);
>  base = TCG_REG_TMP0;
> +if (TARGET_LONG_BITS == 64) {
> +tcg_out_opc_reg(s, OPC_ADD, base, addr_reg, TCG_GUEST_BASE_REG);
> +} else if (have_zba) {
> +tcg_out_opc_reg(s, OPC_ADD_UW, base, addr_reg, 
> TCG_GUEST_BASE_REG);
> +} else {
> +tcg_out_ext32u(s, base, addr_reg);
> +tcg_out_opc_reg(s, OPC_ADD, base, base, TCG_GUEST_BASE_REG);
> +}
> +} else if (TARGET_LONG_BITS == 64) {
> +base = addr_reg;
> +} else {
> +base = TCG_REG_TMP0;
> +tcg_out_ext32u(s, base, addr_reg);
>  }
>  *pbase = base;
>  #endif
> --
> 2.34.1
>
>



Re: [PATCH 04/11] tcg/riscv: Support ADD.UW, SEXT.B, SEXT.H, ZEXT.H from Zba+Zbb

2023-05-16 Thread Alistair Francis
On Wed, May 3, 2023 at 6:59 PM Richard Henderson
 wrote:
>
> Signed-off-by: Richard Henderson 

Acked-by: Alistair Francis 

Alistair

> ---
>  tcg/riscv/tcg-target.c.inc | 32 
>  1 file changed, 24 insertions(+), 8 deletions(-)
>
> diff --git a/tcg/riscv/tcg-target.c.inc b/tcg/riscv/tcg-target.c.inc
> index c5b060023f..53a7f97b29 100644
> --- a/tcg/riscv/tcg-target.c.inc
> +++ b/tcg/riscv/tcg-target.c.inc
> @@ -593,26 +593,42 @@ static void tcg_out_ext8u(TCGContext *s, TCGReg ret, 
> TCGReg arg)
>
>  static void tcg_out_ext16u(TCGContext *s, TCGReg ret, TCGReg arg)
>  {
> -tcg_out_opc_imm(s, OPC_SLLIW, ret, arg, 16);
> -tcg_out_opc_imm(s, OPC_SRLIW, ret, ret, 16);
> +if (have_zbb) {
> +tcg_out_opc_reg(s, OPC_ZEXT_H, ret, arg, TCG_REG_ZERO);
> +} else {
> +tcg_out_opc_imm(s, OPC_SLLIW, ret, arg, 16);
> +tcg_out_opc_imm(s, OPC_SRLIW, ret, ret, 16);
> +}
>  }
>
>  static void tcg_out_ext32u(TCGContext *s, TCGReg ret, TCGReg arg)
>  {
> -tcg_out_opc_imm(s, OPC_SLLI, ret, arg, 32);
> -tcg_out_opc_imm(s, OPC_SRLI, ret, ret, 32);
> +if (have_zba) {
> +tcg_out_opc_reg(s, OPC_ADD_UW, ret, arg, TCG_REG_ZERO);
> +} else {
> +tcg_out_opc_imm(s, OPC_SLLI, ret, arg, 32);
> +tcg_out_opc_imm(s, OPC_SRLI, ret, ret, 32);
> +}
>  }
>
>  static void tcg_out_ext8s(TCGContext *s, TCGType type, TCGReg ret, TCGReg 
> arg)
>  {
> -tcg_out_opc_imm(s, OPC_SLLIW, ret, arg, 24);
> -tcg_out_opc_imm(s, OPC_SRAIW, ret, ret, 24);
> +if (have_zbb) {
> +tcg_out_opc_imm(s, OPC_SEXT_B, ret, arg, 0);
> +} else {
> +tcg_out_opc_imm(s, OPC_SLLIW, ret, arg, 24);
> +tcg_out_opc_imm(s, OPC_SRAIW, ret, ret, 24);
> +}
>  }
>
>  static void tcg_out_ext16s(TCGContext *s, TCGType type, TCGReg ret, TCGReg 
> arg)
>  {
> -tcg_out_opc_imm(s, OPC_SLLIW, ret, arg, 16);
> -tcg_out_opc_imm(s, OPC_SRAIW, ret, ret, 16);
> +if (have_zbb) {
> +tcg_out_opc_imm(s, OPC_SEXT_H, ret, arg, 0);
> +} else {
> +tcg_out_opc_imm(s, OPC_SLLIW, ret, arg, 16);
> +tcg_out_opc_imm(s, OPC_SRAIW, ret, ret, 16);
> +}
>  }
>
>  static void tcg_out_ext32s(TCGContext *s, TCGReg ret, TCGReg arg)
> --
> 2.34.1
>
>



Re: [PATCH v3 1/3] migration: Add documentation for backwards compatiblity

2023-05-16 Thread Peter Xu
On Mon, May 15, 2023 at 10:31:59AM +0200, Juan Quintela wrote:
> State what the requirements are to get migration working between qemu
> versions.  And once there, explain how one is supposed to implement a
> new feature/default value without breaking migration.
> 
> Reviewed-by: Vladimir Sementsov-Ogievskiy 
> Message-Id: <20230511082701.12828-1-quint...@redhat.com>
> Signed-off-by: Juan Quintela 
> ---
>  docs/devel/migration.rst | 216 +++
>  1 file changed, 216 insertions(+)
> 
> diff --git a/docs/devel/migration.rst b/docs/devel/migration.rst
> index 6f65c23b47..b4c4f3ec35 100644
> --- a/docs/devel/migration.rst
> +++ b/docs/devel/migration.rst
> @@ -142,6 +142,222 @@ General advice for device developers
>may be different on the destination.  This can result in the
>device state being loaded into the wrong device.
>  
> +How backwards compatibility works
> +-
> +
> +When we do migration, we have to QEMU process: the source and the

s/to/two/, s/process/processes/

> +target.  There are two cases, they are the same version or they are a
> +different version.

s/a different version/different versions/

> +The easy case is when they are the same version.
> +The difficult one is when they are different versions.
> +
> +There are two things that are different, but they have very similar
> +names and sometimes get confused:

(space)

> +- QEMU version
> +- machine version

It's normally called "machine type", so maybe use that?  Or just "machine
version / machine type"?

> +
> +Let's start with a practical example.  We start with:
> +
> +- qemu-system-x86_64 (v5.2), from now on qemu-5.2.
> +- qemu-system-x86_64 (v5.1), from now on qemu-5.1.
> +
> +Related to this are the "latest" machine types defined on each of
> +them:
> +
> +- pc-q35-5.2 (newer one in qemu-5.2) from now on pc-5.2
> +- pc-q35-5.1 (newer one in qemu-5.1) from now on pc-5.1
> +
> +First of all, migration is only supposed to work if you use the same
> +machine type in both source and destination. The QEMU hardware
> +configuration needs to be the same also on source and destination.
> +Most aspects of the backend configuration can be changed at will,
> +except for a few cases where the backend features influence frontend
> +device feature exposure.  But that is not relevant for this section.
> +
> +I am going to list the combinations that we can have.  Let's start
> +with the trivial ones, where QEMU is the same on source and
> +destination:
> +
> +1 - qemu-5.2 -M pc-5.2  -> migrates to -> qemu-5.2 -M pc-5.2
> +
> +  This is the latest QEMU with the latest machine type.
> +  This has to work, and if it doesn't work it is a bug.
> +
> +2 - qemu-5.1 -M pc-5.1  -> migrates to -> qemu-5.1 -M pc-5.1
> +
> +  Exactly the same case as the previous one, but for 5.1.
> +  Nothing to see here either.
> +
> +These are the easiest ones; we will not talk about them further in
> +this section.
> +
> +Now we start with the more interesting cases.  Consider the case where
> +we have the same QEMU version on both sides (qemu-5.2), but instead of
> +using the latest machine type for that version (pc-5.2) we are using
> +one from an older QEMU version, in this case pc-5.1.
> +
> +3 - qemu-5.2 -M pc-5.1  -> migrates to -> qemu-5.2 -M pc-5.1
> +
> +  It needs to use the definition of pc-5.1 and the devices as they
> +  were configured on 5.1, but this should be easy in the sense that
> +  both sides are the same QEMU and both sides have exactly the same
> +  idea of what the pc-5.1 machine is.
> +
> +4 - qemu-5.1 -M pc-5.2  -> migrates to -> qemu-5.1 -M pc-5.2
> +
> +  This combination is not possible as qemu-5.1 doesn't understand the
> +  pc-5.2 machine type.  So nothing to worry about here.
> +
> +Now come the interesting ones, when the two QEMU processes are
> +different versions.  Notice also that the machine type needs to be
> +pc-5.1, because we have the limitation that qemu-5.1 doesn't know about
> +pc-5.2.  So the possible cases are:
> +
> +5 - qemu-5.2 -M pc-5.1  -> migrates to -> qemu-5.1 -M pc-5.1
> +
> +  This migration is known as newer to older.  While developing 5.2 we
> +  need to take care not to break migration to qemu-5.1.  Notice that we
> +  can't make updates to qemu-5.1 to understand whatever qemu-5.2
> +  decides to change, so it is on the qemu-5.2 side to make the relevant
> +  changes.
> +
> +6 - qemu-5.1 -M pc-5.1  -> migrates to -> qemu-5.2 -M pc-5.1
> +
> +  This migration is known as older to newer.  We need to make sure
> +  that we are able to receive migrations from qemu-5.1.  The problem is
> +  similar to the previous one.
> +
> +If qemu-5.1 and qemu-5.2 were the same, there would not be any
> +compatibility problems.  But the reason that we create qemu-5.2 is to
> +get new features, devices, defaults, etc.
> +
> +If we get a device that has a new feature, or change a default value,
> +we have a problem when we try to migrate between different QEMU
> +versions.
> 
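
(For completeness: the standard tool for the "change a default value" case
is the machine-type compat property tables in hw/core/machine.c, which pin
the old default for older machine types.  The entry below is only an
illustration with a made-up device and property, not something from the
patch:)

    GlobalProperty hw_compat_5_1[] = {
        /* Illustrative entry: keep the 5.1 default for the new knob so
         * that "-M pc-5.1" behaves the same under qemu-5.1 and qemu-5.2. */
        { "my-device", "feature-y", "off" },
    };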

Re: [PATCH 03/11] tcg/riscv: Support ANDN, ORN, XNOR from Zbb

2023-05-16 Thread Alistair Francis
On Wed, May 3, 2023 at 6:58 PM Richard Henderson
 wrote:
>
> Signed-off-by: Richard Henderson 

Acked-by: Alistair Francis 

Alistair

> ---
>  tcg/riscv/tcg-target-con-set.h |  1 +
>  tcg/riscv/tcg-target-con-str.h |  1 +
>  tcg/riscv/tcg-target.h | 12 +-
>  tcg/riscv/tcg-target.c.inc | 41 ++
>  4 files changed, 49 insertions(+), 6 deletions(-)
>
> diff --git a/tcg/riscv/tcg-target-con-set.h b/tcg/riscv/tcg-target-con-set.h
> index d8d3ac..1a33ece98f 100644
> --- a/tcg/riscv/tcg-target-con-set.h
> +++ b/tcg/riscv/tcg-target-con-set.h
> @@ -15,6 +15,7 @@ C_O0_I2(rZ, rZ)
>  C_O1_I1(r, r)
>  C_O1_I2(r, r, ri)
>  C_O1_I2(r, r, rI)
> +C_O1_I2(r, r, rJ)
>  C_O1_I2(r, rZ, rN)
>  C_O1_I2(r, rZ, rZ)
>  C_O2_I4(r, r, rZ, rZ, rM, rM)
> diff --git a/tcg/riscv/tcg-target-con-str.h b/tcg/riscv/tcg-target-con-str.h
> index 6f1cfb976c..d5c419dff1 100644
> --- a/tcg/riscv/tcg-target-con-str.h
> +++ b/tcg/riscv/tcg-target-con-str.h
> @@ -15,6 +15,7 @@ REGS('r', ALL_GENERAL_REGS)
>   * CONST(letter, TCG_CT_CONST_* bit set)
>   */
>  CONST('I', TCG_CT_CONST_S12)
> +CONST('J', TCG_CT_CONST_J12)
>  CONST('N', TCG_CT_CONST_N12)
>  CONST('M', TCG_CT_CONST_M12)
>  CONST('Z', TCG_CT_CONST_ZERO)
> diff --git a/tcg/riscv/tcg-target.h b/tcg/riscv/tcg-target.h
> index 863ac8ba2f..9f58d46208 100644
> --- a/tcg/riscv/tcg-target.h
> +++ b/tcg/riscv/tcg-target.h
> @@ -120,9 +120,9 @@ extern bool have_zbb;
>  #define TCG_TARGET_HAS_bswap32_i32  0
>  #define TCG_TARGET_HAS_not_i32  1
>  #define TCG_TARGET_HAS_neg_i32  1
> -#define TCG_TARGET_HAS_andc_i32 0
> -#define TCG_TARGET_HAS_orc_i32  0
> -#define TCG_TARGET_HAS_eqv_i32  0
> +#define TCG_TARGET_HAS_andc_i32 have_zbb
> +#define TCG_TARGET_HAS_orc_i32  have_zbb
> +#define TCG_TARGET_HAS_eqv_i32  have_zbb
>  #define TCG_TARGET_HAS_nand_i32 0
>  #define TCG_TARGET_HAS_nor_i32  0
>  #define TCG_TARGET_HAS_clz_i32  0
> @@ -154,9 +154,9 @@ extern bool have_zbb;
>  #define TCG_TARGET_HAS_bswap64_i64  0
>  #define TCG_TARGET_HAS_not_i64  1
>  #define TCG_TARGET_HAS_neg_i64  1
> -#define TCG_TARGET_HAS_andc_i64 0
> -#define TCG_TARGET_HAS_orc_i64  0
> -#define TCG_TARGET_HAS_eqv_i64  0
> +#define TCG_TARGET_HAS_andc_i64 have_zbb
> +#define TCG_TARGET_HAS_orc_i64  have_zbb
> +#define TCG_TARGET_HAS_eqv_i64  have_zbb
>  #define TCG_TARGET_HAS_nand_i64 0
>  #define TCG_TARGET_HAS_nor_i64  0
>  #define TCG_TARGET_HAS_clz_i64  0
> diff --git a/tcg/riscv/tcg-target.c.inc b/tcg/riscv/tcg-target.c.inc
> index 49ff9c8b9d..c5b060023f 100644
> --- a/tcg/riscv/tcg-target.c.inc
> +++ b/tcg/riscv/tcg-target.c.inc
> @@ -138,6 +138,7 @@ static TCGReg tcg_target_call_oarg_reg(TCGCallReturnKind 
> kind, int slot)
>  #define TCG_CT_CONST_S12   0x200
>  #define TCG_CT_CONST_N12   0x400
>  #define TCG_CT_CONST_M12   0x800
> +#define TCG_CT_CONST_J12  0x1000
>
>  #define ALL_GENERAL_REGS   MAKE_64BIT_MASK(0, 32)
>
> @@ -174,6 +175,13 @@ static bool tcg_target_const_match(int64_t val, TCGType 
> type, int ct)
>  if ((ct & TCG_CT_CONST_M12) && val >= -0x7ff && val <= 0x7ff) {
>  return 1;
>  }
> +/*
> + * Inverse of sign extended from 12 bits: ~[-0x800, 0x7ff].
> + * Used to map ANDN back to ANDI, etc.
> + */
> +if ((ct & TCG_CT_CONST_J12) && ~val >= -0x800 && ~val <= 0x7ff) {
> +return 1;
> +}
>  return 0;
>  }
>
> @@ -1306,6 +1314,31 @@ static void tcg_out_op(TCGContext *s, TCGOpcode opc,
>  }
>  break;
>
> +case INDEX_op_andc_i32:
> +case INDEX_op_andc_i64:
> +if (c2) {
> +tcg_out_opc_imm(s, OPC_ANDI, a0, a1, ~a2);
> +} else {
> +tcg_out_opc_reg(s, OPC_ANDN, a0, a1, a2);
> +}
> +break;
> +case INDEX_op_orc_i32:
> +case INDEX_op_orc_i64:
> +if (c2) {
> +tcg_out_opc_imm(s, OPC_ORI, a0, a1, ~a2);
> +} else {
> +tcg_out_opc_reg(s, OPC_ORN, a0, a1, a2);
> +}
> +break;
> +case INDEX_op_eqv_i32:
> +case INDEX_op_eqv_i64:
> +if (c2) {
> +tcg_out_opc_imm(s, OPC_XORI, a0, a1, ~a2);
> +} else {
> +tcg_out_opc_reg(s, OPC_XNOR, a0, a1, a2);
> +}
> +break;
> +
>  case INDEX_op_not_i32:
>  case INDEX_op_not_i64:
>  tcg_out_opc_imm(s, OPC_XORI, a0, a1, -1);
> @@ -1536,6 +1569,14 @@ static TCGConstraintSetIndex 
> tcg_target_op_def(TCGOpcode op)
>  case INDEX_op_xor_i64:
>  return C_O1_I2(r, r, rI);
>
> +case INDEX_op_andc_i32:
> +case INDEX_op_andc_i64:
> +case INDEX_op_orc_i32:
> +case INDEX_op_orc_i64:
> +case INDEX_op_eqv_i32:
> +case INDEX_op_eqv_i64:
> +return C_O1_I2(r, r, rJ);
> +
>  case INDEX_op_sub_i32:
>  case INDEX_op_sub_i64:
>  return C_O1_I2(r, rZ, 

Re: [PATCH 02/11] tcg/riscv: Probe for Zba, Zbb, Zicond extensions

2023-05-16 Thread Alistair Francis
On Wed, May 3, 2023 at 6:59 PM Richard Henderson
 wrote:
>
> Define a useful subset of the extensions.  Probe for them
> via compiler pre-processor feature macros and SIGILL.
>
> Signed-off-by: Richard Henderson 

Acked-by: Alistair Francis 

Alistair

> ---
>  tcg/riscv/tcg-target.h |  6 +++
>  tcg/riscv/tcg-target.c.inc | 96 ++
>  2 files changed, 102 insertions(+)
>
> diff --git a/tcg/riscv/tcg-target.h b/tcg/riscv/tcg-target.h
> index 494c986b49..863ac8ba2f 100644
> --- a/tcg/riscv/tcg-target.h
> +++ b/tcg/riscv/tcg-target.h
> @@ -90,6 +90,12 @@ typedef enum {
>  #define TCG_TARGET_CALL_ARG_I128TCG_CALL_ARG_NORMAL
>  #define TCG_TARGET_CALL_RET_I128TCG_CALL_RET_NORMAL
>
> +#if defined(__riscv_arch_test) && defined(__riscv_zbb)
> +# define have_zbb true
> +#else
> +extern bool have_zbb;
> +#endif
> +
>  /* optional instructions */
>  #define TCG_TARGET_HAS_movcond_i32  0
>  #define TCG_TARGET_HAS_div_i32  1
> diff --git a/tcg/riscv/tcg-target.c.inc b/tcg/riscv/tcg-target.c.inc
> index 4dd33c73e8..49ff9c8b9d 100644
> --- a/tcg/riscv/tcg-target.c.inc
> +++ b/tcg/riscv/tcg-target.c.inc
> @@ -113,6 +113,20 @@ static const int tcg_target_call_iarg_regs[] = {
>  TCG_REG_A7,
>  };
>
> +#ifndef have_zbb
> +bool have_zbb;
> +#endif
> +#if defined(__riscv_arch_test) && defined(__riscv_zba)
> +# define have_zba true
> +#else
> +static bool have_zba;
> +#endif
> +#if defined(__riscv_arch_test) && defined(__riscv_zicond)
> +# define have_zicond true
> +#else
> +static bool have_zicond;
> +#endif
> +
>  static TCGReg tcg_target_call_oarg_reg(TCGCallReturnKind kind, int slot)
>  {
>  tcg_debug_assert(kind == TCG_CALL_RET_NORMAL);
> @@ -234,6 +248,34 @@ typedef enum {
>
>  OPC_FENCE = 0x000f,
>  OPC_NOP   = OPC_ADDI,   /* nop = addi r0,r0,0 */
> +
> +/* Zba: Bit manipulation extension, address generation */
> +OPC_ADD_UW = 0x083b,
> +
> +/* Zbb: Bit manipulation extension, basic bit manipulaton */
> +OPC_ANDN   = 0x40007033,
> +OPC_CLZ= 0x60001013,
> +OPC_CLZW   = 0x6000101b,
> +OPC_CPOP   = 0x60201013,
> +OPC_CPOPW  = 0x6020101b,
> +OPC_CTZ= 0x60101013,
> +OPC_CTZW   = 0x6010101b,
> +OPC_ORN= 0x40006033,
> +OPC_REV8   = 0x6b805013,
> +OPC_ROL= 0x60001033,
> +OPC_ROLW   = 0x6000103b,
> +OPC_ROR= 0x60005033,
> +OPC_RORW   = 0x6000503b,
> +OPC_RORI   = 0x60005013,
> +OPC_RORIW  = 0x6000501b,
> +OPC_SEXT_B = 0x60401013,
> +OPC_SEXT_H = 0x60501013,
> +OPC_XNOR   = 0x40004033,
> +OPC_ZEXT_H = 0x0800403b,
> +
> +/* Zicond: integer conditional operations */
> +OPC_CZERO_EQZ = 0x0e005033,
> +OPC_CZERO_NEZ = 0x0e007033,
>  } RISCVInsn;
>
>  /*
> @@ -1612,8 +1654,62 @@ static void tcg_target_qemu_prologue(TCGContext *s)
>  tcg_out_opc_imm(s, OPC_JALR, TCG_REG_ZERO, TCG_REG_RA, 0);
>  }
>
> +static volatile sig_atomic_t got_sigill;
> +
> +static void sigill_handler(int signo, siginfo_t *si, void *data)
> +{
> +/* Skip the faulty instruction */
> +ucontext_t *uc = (ucontext_t *)data;
> +uc->uc_mcontext.__gregs[REG_PC] += 4;
> +
> +got_sigill = 1;
> +}
> +
> +static void tcg_target_detect_isa(void)
> +{
> +#if !defined(have_zba) || !defined(have_zbb) || !defined(have_zicond)
> +/*
> + * TODO: It is expected that this will be determinable via
> + * linux riscv_hwprobe syscall, not yet merged.
> + * In the meantime, test via sigill.
> + */
> +
> +struct sigaction sa_old, sa_new;
> +
> +memset(&sa_new, 0, sizeof(sa_new));
> +sa_new.sa_flags = SA_SIGINFO;
> +sa_new.sa_sigaction = sigill_handler;
> +sigaction(SIGILL, &sa_new, &sa_old);
> +
> +#ifndef have_zba
> +/* Probe for Zba: add.uw zero,zero,zero. */
> +got_sigill = 0;
> +asm volatile(".insn %0" : : "i"(OPC_ADD_UW) : "memory");
> +have_zba = !got_sigill;
> +#endif
> +
> +#ifndef have_zbb
> +/* Probe for Zbb: andn zero,zero,zero. */
> +got_sigill = 0;
> +asm volatile(".insn %0" : : "i"(OPC_ANDN) : "memory");
> +have_zbb = !got_sigill;
> +#endif
> +
> +#ifndef have_zicond
> +/* Probe for Zicond: czero.eqz zero,zero,zero. */
> +got_sigill = 0;
> +asm volatile(".insn %0" : : "i"(OPC_CZERO_EQZ) : "memory");
> +have_zicond = !got_sigill;
> +#endif
> +
> +sigaction(SIGILL, &sa_old, NULL);
> +#endif
> +}
> +
>  static void tcg_target_init(TCGContext *s)
>  {
> +tcg_target_detect_isa();
> +
>  tcg_target_available_regs[TCG_TYPE_I32] = 0xffffffff;
>  tcg_target_available_regs[TCG_TYPE_I64] = 0xffffffff;
>
> --
> 2.34.1
>
>



Re: [PATCH 01/11] disas/riscv: Decode czero.{eqz,nez}

2023-05-16 Thread Alistair Francis
On Wed, May 3, 2023 at 6:59 PM Richard Henderson
 wrote:
>
> Signed-off-by: Richard Henderson 

Acked-by: Alistair Francis 

Alistair

> ---
>  disas/riscv.c | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/disas/riscv.c b/disas/riscv.c
> index d6b0fbe5e8..c0a8b1006a 100644
> --- a/disas/riscv.c
> +++ b/disas/riscv.c
> @@ -935,6 +935,8 @@ typedef enum {
>  rv_op_vsetvli = 766,
>  rv_op_vsetivli = 767,
>  rv_op_vsetvl = 768,
> +rv_op_czero_eqz = 769,
> +rv_op_czero_nez = 770,
>  } rv_op;
>
>  /* structures */
> @@ -2066,7 +2068,9 @@ const rv_opcode_data opcode_data[] = {
>  { "vsext.vf8", rv_codec_v_r, rv_fmt_vd_vs2_vm, NULL, rv_op_vsext_vf8, 
> rv_op_vsext_vf8, 0 },
>  { "vsetvli", rv_codec_vsetvli, rv_fmt_vsetvli, NULL, rv_op_vsetvli, 
> rv_op_vsetvli, 0 },
>  { "vsetivli", rv_codec_vsetivli, rv_fmt_vsetivli, NULL, rv_op_vsetivli, 
> rv_op_vsetivli, 0 },
> -{ "vsetvl", rv_codec_r, rv_fmt_rd_rs1_rs2, NULL, rv_op_vsetvl, 
> rv_op_vsetvl, 0 }
> +{ "vsetvl", rv_codec_r, rv_fmt_rd_rs1_rs2, NULL, rv_op_vsetvl, 
> rv_op_vsetvl, 0 },
> +{ "czero.eqz", rv_codec_r, rv_fmt_rd_rs1_rs2, NULL, 0, 0, 0 },
> +{ "czero.nez", rv_codec_r, rv_fmt_rd_rs1_rs2, NULL, 0, 0, 0 },
>  };
>
>  /* CSR names */
> @@ -2792,6 +2796,8 @@ static void decode_inst_opcode(rv_decode *dec, rv_isa 
> isa)
>  case 45: op = rv_op_minu; break;
>  case 46: op = rv_op_max; break;
>  case 47: op = rv_op_maxu; break;
> +case 075: op = rv_op_czero_eqz; break;
> +case 077: op = rv_op_czero_nez; break;
>  case 130: op = rv_op_sh1add; break;
>  case 132: op = rv_op_sh2add; break;
>  case 134: op = rv_op_sh3add; break;
> --
> 2.34.1
>
>



Re: [PATCH] target/arm: allow DC CVA[D]P in user mode emulation

2023-05-16 Thread Zhuojia Shen
On 05/16/2023 01:08 PM -0700, Richard Henderson wrote:
> On 5/15/23 20:59, Zhuojia Shen wrote:
> > DC CVAP and DC CVADP instructions can be executed in EL0 on Linux,
> > either directly when SCTLR_EL1.UCI == 1 or emulated by the kernel (see
> > user_cache_maint_handler() in arch/arm64/kernel/traps.c).  The Arm ARM
> > documents the semantics of the two instructions that they behave as
> > DC CVAC if the address pointed to by their register operand is not
> > persistent memory.
> > 
> > This patch enables execution of the two instructions in user mode
> > emulation as NOP while preserving their original emulation in full
> > system virtualization.
> > 
> > Signed-off-by: Zhuojia Shen 
> > ---
> >   target/arm/helper.c   | 26 +-
> >   tests/tcg/aarch64/Makefile.target | 11 
> >   tests/tcg/aarch64/dcpodp.c| 45 +++
> >   tests/tcg/aarch64/dcpop.c | 45 +++
> >   4 files changed, 120 insertions(+), 7 deletions(-)
> >   create mode 100644 tests/tcg/aarch64/dcpodp.c
> >   create mode 100644 tests/tcg/aarch64/dcpop.c
> > 
> > diff --git a/target/arm/helper.c b/target/arm/helper.c
> > index 0b7fd2e7e6..eeba5e7978 100644
> > --- a/target/arm/helper.c
> > +++ b/target/arm/helper.c
> > @@ -7432,23 +7432,37 @@ static void dccvap_writefn(CPUARMState *env, const 
> > ARMCPRegInfo *opaque,
> >   }
> >   }
> >   }
> > +#endif /*CONFIG_USER_ONLY*/
> >   static const ARMCPRegInfo dcpop_reg[] = {
> >   { .name = "DC_CVAP", .state = ARM_CP_STATE_AA64,
> > .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 12, .opc2 = 1,
> > -  .access = PL0_W, .type = ARM_CP_NO_RAW | ARM_CP_SUPPRESS_TB_END,
> > +  .access = PL0_W,
> > .fgt = FGT_DCCVAP,
> > -  .accessfn = aa64_cacheop_poc_access, .writefn = dccvap_writefn },
> > +  .accessfn = aa64_cacheop_poc_access,
> > +#ifdef CONFIG_USER_ONLY
> > +  .type = ARM_CP_NOP,
> > +#else
> > +  .type = ARM_CP_NO_RAW | ARM_CP_SUPPRESS_TB_END,
> > +  .writefn = dccvap_writefn,
> > +#endif
> > +},
> >   };
> 
> Not quite correct, as CVAP to an unmapped address should SIGSEGV.  That'll
> be done by the probe_read within dccvap_writefn.
> 
> Need to make dccvap_writefn always present, ifdef out only the
> memory_region_from_host + memory_region_writeback from there.  Need to set
> SCTLR_EL1.UCI in arm_cpu_reset_hold in the CONFIG_USER_ONLY block.

Thanks for the reviews; I'll update in v2.
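
For reference, the shape being suggested is roughly the following.  This is
only a sketch pieced together from the existing helper (the memory-region
writeback part is simply compiled out for user mode); the real v2 may look
different:

    static void dccvap_writefn(CPUARMState *env, const ARMCPRegInfo *opaque,
                               uint64_t value)
    {
        ARMCPU *cpu = env_archcpu(env);
        /* CTR_EL0 DminLine, bits [19:16], gives the data cache line size. */
        uint64_t dline_size = 4 << ((cpu->ctr >> 16) & 0xF);
        uint64_t vaddr = value & ~(dline_size - 1);
        int mem_idx = cpu_mmu_index(env, false);
        void *haddr;

        /* Probe unconditionally so an unmapped address still faults
         * (SIGSEGV in user-mode emulation) instead of being ignored. */
        haddr = probe_read(env, vaddr, dline_size, mem_idx, GETPC());
        if (haddr) {
    #ifndef CONFIG_USER_ONLY
            ram_addr_t offset;
            MemoryRegion *mr;

            /* Only system emulation has host RAM backed by MemoryRegions
             * that may need an explicit writeback to persistent memory. */
            mr = memory_region_from_host(haddr, &offset);
            if (mr) {
                memory_region_writeback(mr, offset, dline_size);
            }
    #endif
        }
    }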

> 
> 
> r~
> 
> >   static const ARMCPRegInfo dcpodp_reg[] = {
> >   { .name = "DC_CVADP", .state = ARM_CP_STATE_AA64,
> > .opc0 = 1, .opc1 = 3, .crn = 7, .crm = 13, .opc2 = 1,
> > -  .access = PL0_W, .type = ARM_CP_NO_RAW | ARM_CP_SUPPRESS_TB_END,
> > +  .access = PL0_W,
> > .fgt = FGT_DCCVADP,
> > -  .accessfn = aa64_cacheop_poc_access, .writefn = dccvap_writefn },
> > +  .accessfn = aa64_cacheop_poc_access,
> > +#ifdef CONFIG_USER_ONLY
> > +  .type = ARM_CP_NOP,
> > +#else
> > +  .type = ARM_CP_NO_RAW | ARM_CP_SUPPRESS_TB_END,
> > +  .writefn = dccvap_writefn,
> > +#endif
> > +},
> >   };
> > -#endif /*CONFIG_USER_ONLY*/
> >   static CPAccessResult access_aa64_tid5(CPUARMState *env, const 
> > ARMCPRegInfo *ri,
> >  bool isread)
> > @@ -9092,7 +9106,6 @@ void register_cp_regs_for_features(ARMCPU *cpu)
> >   if (cpu_isar_feature(aa64_tlbios, cpu)) {
> >   define_arm_cp_regs(cpu, tlbios_reginfo);
> >   }
> > -#ifndef CONFIG_USER_ONLY
> >   /* Data Cache clean instructions up to PoP */
> >   if (cpu_isar_feature(aa64_dcpop, cpu)) {
> >   define_one_arm_cp_reg(cpu, dcpop_reg);
> > @@ -9101,7 +9114,6 @@ void register_cp_regs_for_features(ARMCPU *cpu)
> >   define_one_arm_cp_reg(cpu, dcpodp_reg);
> >   }
> >   }
> > -#endif /*CONFIG_USER_ONLY*/
> >   /*
> >* If full MTE is enabled, add all of the system registers.
> > diff --git a/tests/tcg/aarch64/Makefile.target 
> > b/tests/tcg/aarch64/Makefile.target
> > index 0315795487..3430fd3cd8 100644
> > --- a/tests/tcg/aarch64/Makefile.target
> > +++ b/tests/tcg/aarch64/Makefile.target
> > @@ -21,12 +21,23 @@ config-cc.mak: Makefile
> > $(quiet-@)( \
> > $(call cc-option,-march=armv8.1-a+sve,  CROSS_CC_HAS_SVE); \
> > $(call cc-option,-march=armv8.1-a+sve2, CROSS_CC_HAS_SVE2); 
> > \
> > +   $(call cc-option,-march=armv8.2-a,  
> > CROSS_CC_HAS_ARMV8_2); \
> > $(call cc-option,-march=armv8.3-a,  
> > CROSS_CC_HAS_ARMV8_3); \
> > +   $(call cc-option,-march=armv8.5-a,  
> > CROSS_CC_HAS_ARMV8_5); \
> > $(call cc-option,-mbranch-protection=standard,  
> > CROSS_CC_HAS_ARMV8_BTI); \
> > $(call cc-option,-march=armv8.5-a+memtag,   
> > CROSS_CC_HAS_ARMV8_MTE); \
> > $(call cc-option,-march=armv9-a+sme,
> > CROSS_CC_HAS_ARMV9_SME)) 3> 

Re: [PATCH qemu] docs/interop/qcow2.txt: fix description about "zlib" clusters

2023-05-16 Thread Eric Blake


On Tue, May 16, 2023 at 11:32:27PM +0900, ~akihirosuda wrote:
> 
> From: Akihiro Suda 
> 
> "zlib" clusters are actually raw deflate (RFC1951) clusters without
> zlib headers.
> 
> Signed-off-by: Akihiro Suda 
> ---
>  docs/interop/qcow2.txt | 10 +++---
>  1 file changed, 7 insertions(+), 3 deletions(-)

Seems like a useful clarification to me (there's a difference between
the encoding name and the program used to create/parse that encoding).
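
To make the distinction concrete, here is a minimal sketch (not from the
patch, and not how the QEMU code itself is structured) of inflating such a
cluster with zlib in raw-deflate mode; the negative windowBits is what
"no zlib headers" translates to at the API level:

    #include <string.h>
    #include <zlib.h>

    /* Returns 0 on success, -1 on error.  "in" holds the compressed cluster
     * payload, "out" must be large enough for the uncompressed cluster. */
    static int inflate_raw_deflate(const void *in, size_t in_len,
                                   void *out, size_t out_len)
    {
        z_stream strm;
        int ret;

        memset(&strm, 0, sizeof(strm));
        strm.next_in = (Bytef *)in;
        strm.avail_in = in_len;
        strm.next_out = out;
        strm.avail_out = out_len;

        /* Negative windowBits selects raw deflate (RFC 1951): no zlib
         * (RFC 1950) header or Adler-32 trailer is expected. */
        ret = inflateInit2(&strm, -15);
        if (ret != Z_OK) {
            return -1;
        }
        ret = inflate(&strm, Z_FINISH);
        inflateEnd(&strm);
        return (ret == Z_STREAM_END) ? 0 : -1;
    }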

Reviewed-by: Eric Blake 

> 
> diff --git a/docs/interop/qcow2.txt b/docs/interop/qcow2.txt
> index f7dc304ff6..e7f036c286 100644
> --- a/docs/interop/qcow2.txt
> +++ b/docs/interop/qcow2.txt
> @@ -214,14 +214,18 @@ version 2.
>  type.
>  
>  If the incompatible bit "Compression type" is set: the 
> field
> -must be present and non-zero (which means non-zlib
> +must be present and non-zero (which means non-deflate
>  compression type). Otherwise, this field must not be 
> present
> -or must be zero (which means zlib).
> +or must be zero (which means deflate).
>  
>  Available compression type values:
> -0: zlib 
> +0: deflate 
>  1: zstd 
>  
> +The deflate compression type is called "zlib"
> + in QEMU. However, clusters with 
> the
> +deflate compression type do not have zlib headers.
> +
>  
>  === Header padding ===
>  
> -- 
> 2.38.4
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




Re: [PATCH v2 6/6] ebpf: Updated eBPF program and skeleton.

2023-05-16 Thread Eric Blake


On Fri, May 12, 2023 at 03:29:02PM +0300, Andrew Melnychenko wrote:
> 
> Updated the section name, so libbpf should init/guess the proper
> program type without explicit specification during open/load.
> 
> Signed-off-by: Andrew Melnychenko 
> ---
>  ebpf/rss.bpf.skeleton.h | 1469 ---
>  tools/ebpf/rss.bpf.c|2 +-
>  2 files changed, 741 insertions(+), 730 deletions(-)
> 
> diff --git a/ebpf/rss.bpf.skeleton.h b/ebpf/rss.bpf.skeleton.h
> index 18eb2adb12c..41b84aea44c 100644
> --- a/ebpf/rss.bpf.skeleton.h
> +++ b/ebpf/rss.bpf.skeleton.h
> @@ -176,162 +176,162 @@ err:
>  
>  static inline const void *rss_bpf__elf_bytes(size_t *sz)
>  {
> - *sz = 20440;
> + *sz = 20720;
>   return (const void *)"\
>  
> \x7f\x45\x4c\x46\x02\x01\x01\0\0\0\0\0\0\0\0\0\x01\0\xf7\0\x01\0\0\0\0\0\0\0\0\
> -\0\0\0\0\0\0\0\0\0\0\0\x98\x4c\0\0\0\0\0\0\0\0\0\0\x40\0\0\0\0\0\x40\0\x0d\0\

Appears to be pre-existing, and looking at the broader context, I see
this comment earlier in the file:

/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */

/* THIS FILE IS AUTOGENERATED BY BPFTOOL! */

but a suggestion to improve things: tweak the comment produced by the
generator to also output the name of the source file (not just the
bpftool generator).  That would make it easier to map this binary blob of
data back to the human-readable source, and to find instructions on how
to regenerate it should I want to edit those sources.
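
Something as small as one extra generated line would already help, e.g.
(wording purely illustrative):

    /* THIS FILE IS AUTOGENERATED BY BPFTOOL! */
    /* Generated from tools/ebpf/rss.bpf.c; edit that file and regenerate
     * this header instead of editing the hex blob below. */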

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




Re: [PATCH 2/3] ACPI: i386: bump to MADT to revision 3

2023-05-16 Thread Michael S. Tsirkin
On Tue, May 16, 2023 at 04:22:58PM -0500, Eric DeVolder wrote:
> 
> 
> On 5/16/23 07:51, Ani Sinha wrote:
> > On Tue, May 16, 2023 at 6:01 PM Igor Mammedov  wrote:
> > > 
> > > On Mon, 15 May 2023 16:33:10 -0400
> > > Eric DeVolder  wrote:
> > > 
> > > > Currently i386 QEMU generates MADT revision 3, and reports
> > > > MADT revision 1. Set .revision to 3 to match reality.
> > > > 
> > > > Link: 
> > > > https://lore.kernel.org/linux-acpi/20230327191026.3454-1-eric.devolder@ora
> > > > cle.com/T/#t
> > > > Signed-off-by: Eric DeVolder 
> > > > ---
> > > >   hw/i386/acpi-common.c | 2 +-
> > > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > > > 
> > > > diff --git a/hw/i386/acpi-common.c b/hw/i386/acpi-common.c
> > > > index 52e5c1439a..8a0932fe84 100644
> > > > --- a/hw/i386/acpi-common.c
> > > > +++ b/hw/i386/acpi-common.c
> > > > @@ -102,7 +102,7 @@ void acpi_build_madt(GArray *table_data, BIOSLinker 
> > > > *linker,
> > > >   MachineClass *mc = MACHINE_GET_CLASS(x86ms);
> > > >   const CPUArchIdList *apic_ids = 
> > > > mc->possible_cpu_arch_ids(MACHINE(x86ms));
> > > >   AcpiDeviceIfClass *adevc = ACPI_DEVICE_IF_GET_CLASS(adev);
> > > > -AcpiTable table = { .sig = "APIC", .rev = 1, .oem_id = oem_id,
> > > > +AcpiTable table = { .sig = "APIC", .rev = 3, .oem_id = oem_id,
> > > >   .oem_table_id = oem_table_id };
> > > > 
> > > >   acpi_table_begin(, table_data);
> > > 
> > > make check fails for me at this point
> > > (my guess is that not all APIC tables are whitelisted)
> > 
> > I think the patchset needs to be rebased and the blobs regenerated.
> 
> So I've been trying to overcome this today and not having much luck.
> 
> When I run "make check V=2", I see at the end:
> 
> Summary of Failures:
> 
>  45/786 qemu:qtest+qtest-i386 / qtest-i386/bios-tables-test
>  68/786 qemu:qtest+qtest-x86_64 / qtest-x86_64/bios-tables-test
> 
> If I go look at 45/786, for example, I see:
> 
> Looking for expected file 'tests/data/acpi/pc/FACP'
> Using expected file 'tests/data/acpi/pc/FACP'
> Looking for expected file 'tests/data/acpi/pc/APIC'
> Using expected file 'tests/data/acpi/pc/APIC'
> Looking for expected file 'tests/data/acpi/pc/HPET'
> Using expected file 'tests/data/acpi/pc/HPET'
> Looking for expected file 'tests/data/acpi/pc/WAET'
> Using expected file 'tests/data/acpi/pc/WAET'
> Looking for expected file 'tests/data/acpi/pc/FACS'
> Using expected file 'tests/data/acpi/pc/FACS'
> Looking for expected file 'tests/data/acpi/pc/DSDT'
> Using expected file 'tests/data/acpi/pc/DSDT'
> acpi-test: Warning! APIC binary file mismatch. Actual [aml:/tmp/aml-R4D741],
> Expected [aml:tests/data/acpi/pc/APIC].
> See source file tests/qtest/bios-tables-test.c for instructions on how to 
> update expected files.
> acpi-test: Warning! APIC mismatch. Actual [asl:/tmp/asl-GVD741.dsl,
> aml:/tmp/aml-R4D741], Expected [asl:/tmp/asl-1F9641.dsl,
> aml:tests/data/acpi/pc/APIC].
> --- /tmp/asl-1F9641.dsl 2023-05-16 15:18:31.292579156 -0400
> +++ /tmp/asl-GVD741.dsl 2023-05-16 15:18:31.291579149 -0400
> @@ -1,32 +1,32 @@
>  /*
>   * Intel ACPI Component Architecture
>   * AML/ASL+ Disassembler version 20230331 (64-bit version)
>   * Copyright (c) 2000 - 2023 Intel Corporation
>   *
> - * Disassembly of tests/data/acpi/pc/APIC, Tue May 16 15:18:31 2023
> + * Disassembly of /tmp/aml-R4D741, Tue May 16 15:18:31 2023
>   *
>   * ACPI Data Table [APIC]
>   *
>   * Format: [HexOffset DecimalOffset ByteLength]  FieldName : FieldValue (in 
> hex)
>   */
> 
>  [000h  004h]   Signature : "APIC"[Multiple APIC 
> Description Table (MADT)]
>  [004h 0004 004h]Table Length : 0078
> -[008h 0008 001h]Revision : 01
> -[009h 0009 001h]Checksum : 8A
> +[008h 0008 001h]Revision : 03
> +[009h 0009 001h]Checksum : 88
>  [00Ah 0010 006h]  Oem ID : "BOCHS "
>  [010h 0016 008h]Oem Table ID : "BXPC"
>  [018h 0024 004h]Oem Revision : 0001
>  [01Ch 0028 004h] Asl Compiler ID : "BXPC"
>  [020h 0032 004h]   Asl Compiler Revision : 0001
> [...]
> 
> And the q35 looks very very similar.
> 
> It suggests that I need to list tests/data/acpi/pc/APIC, which I have done
> in bios-tables-test-allowed-diff.h:
> 
> /* List of comma-separated changed AML files to ignore */
> "tests/data/acpi/pc/APIC",
> "tests/data/acpi/q35/APIC",
> "tests/data/acpi/microvm/APIC",
> "tests/data/acpi/virt/APIC",
> 
> But as I looked closer at the files that changed in the last step
> of the previous post, there are a bunch of them:
> 
>  tests/data/acpi/microvm/APIC  | Bin 70 -> 70 bytes
>  tests/data/acpi/microvm/APIC.ioapic2  | Bin 82 -> 82 bytes
>  tests/data/acpi/microvm/APIC.pcie | Bin 110 -> 110 bytes
>  tests/data/acpi/pc/APIC   | Bin 120 -> 120 bytes
>  

Re: [PATCH 2/3] ACPI: i386: bump to MADT to revision 3

2023-05-16 Thread Eric DeVolder




On 5/16/23 07:51, Ani Sinha wrote:

On Tue, May 16, 2023 at 6:01 PM Igor Mammedov  wrote:


On Mon, 15 May 2023 16:33:10 -0400
Eric DeVolder  wrote:


Currently i386 QEMU generates MADT revision 3, and reports
MADT revision 1. Set .revision to 3 to match reality.

Link: https://lore.kernel.org/linux-acpi/20230327191026.3454-1-eric.devolder@ora
cle.com/T/#t
Signed-off-by: Eric DeVolder 
---
  hw/i386/acpi-common.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/i386/acpi-common.c b/hw/i386/acpi-common.c
index 52e5c1439a..8a0932fe84 100644
--- a/hw/i386/acpi-common.c
+++ b/hw/i386/acpi-common.c
@@ -102,7 +102,7 @@ void acpi_build_madt(GArray *table_data, BIOSLinker *linker,
  MachineClass *mc = MACHINE_GET_CLASS(x86ms);
  const CPUArchIdList *apic_ids = mc->possible_cpu_arch_ids(MACHINE(x86ms));
  AcpiDeviceIfClass *adevc = ACPI_DEVICE_IF_GET_CLASS(adev);
-AcpiTable table = { .sig = "APIC", .rev = 1, .oem_id = oem_id,
+AcpiTable table = { .sig = "APIC", .rev = 3, .oem_id = oem_id,
  .oem_table_id = oem_table_id };

  acpi_table_begin(&table, table_data);


make check fails for me at this point
(my guess is that not all APIC tables are whitelisted)


I think the patchset needs to be rebased and the blobs regenerated.


So I've been trying to overcome this today and not having much luck.

When I run "make check V=2", I see at the end:

Summary of Failures:

 45/786 qemu:qtest+qtest-i386 / qtest-i386/bios-tables-test
 68/786 qemu:qtest+qtest-x86_64 / qtest-x86_64/bios-tables-test

If I go look at 45/786, for example, I see:

Looking for expected file 'tests/data/acpi/pc/FACP'
Using expected file 'tests/data/acpi/pc/FACP'
Looking for expected file 'tests/data/acpi/pc/APIC'
Using expected file 'tests/data/acpi/pc/APIC'
Looking for expected file 'tests/data/acpi/pc/HPET'
Using expected file 'tests/data/acpi/pc/HPET'
Looking for expected file 'tests/data/acpi/pc/WAET'
Using expected file 'tests/data/acpi/pc/WAET'
Looking for expected file 'tests/data/acpi/pc/FACS'
Using expected file 'tests/data/acpi/pc/FACS'
Looking for expected file 'tests/data/acpi/pc/DSDT'
Using expected file 'tests/data/acpi/pc/DSDT'
acpi-test: Warning! APIC binary file mismatch. Actual [aml:/tmp/aml-R4D741], Expected 
[aml:tests/data/acpi/pc/APIC].

See source file tests/qtest/bios-tables-test.c for instructions on how to 
update expected files.
acpi-test: Warning! APIC mismatch. Actual [asl:/tmp/asl-GVD741.dsl, aml:/tmp/aml-R4D741], Expected 
[asl:/tmp/asl-1F9641.dsl, aml:tests/data/acpi/pc/APIC].

--- /tmp/asl-1F9641.dsl 2023-05-16 15:18:31.292579156 -0400
+++ /tmp/asl-GVD741.dsl 2023-05-16 15:18:31.291579149 -0400
@@ -1,32 +1,32 @@
 /*
  * Intel ACPI Component Architecture
  * AML/ASL+ Disassembler version 20230331 (64-bit version)
  * Copyright (c) 2000 - 2023 Intel Corporation
  *
- * Disassembly of tests/data/acpi/pc/APIC, Tue May 16 15:18:31 2023
+ * Disassembly of /tmp/aml-R4D741, Tue May 16 15:18:31 2023
  *
  * ACPI Data Table [APIC]
  *
  * Format: [HexOffset DecimalOffset ByteLength]  FieldName : FieldValue (in 
hex)
  */

 [000h  004h]   Signature : "APIC"[Multiple APIC 
Description Table (MADT)]
 [004h 0004 004h]Table Length : 0078
-[008h 0008 001h]Revision : 01
-[009h 0009 001h]Checksum : 8A
+[008h 0008 001h]Revision : 03
+[009h 0009 001h]Checksum : 88
 [00Ah 0010 006h]  Oem ID : "BOCHS "
 [010h 0016 008h]Oem Table ID : "BXPC"
 [018h 0024 004h]Oem Revision : 0001
 [01Ch 0028 004h] Asl Compiler ID : "BXPC"
 [020h 0032 004h]   Asl Compiler Revision : 0001
[...]

And the q35 looks very very similar.

It suggests that I need to list tests/data/acpi/pc/APIC, which I have done
in bios-tables-test-allowed-diff.h:

/* List of comma-separated changed AML files to ignore */
"tests/data/acpi/pc/APIC",
"tests/data/acpi/q35/APIC",
"tests/data/acpi/microvm/APIC",
"tests/data/acpi/virt/APIC",

But as I looked closer at the files that changed in the last step
of the previous post, there are a bunch of them:

 tests/data/acpi/microvm/APIC  | Bin 70 -> 70 bytes
 tests/data/acpi/microvm/APIC.ioapic2  | Bin 82 -> 82 bytes
 tests/data/acpi/microvm/APIC.pcie | Bin 110 -> 110 bytes
 tests/data/acpi/pc/APIC   | Bin 120 -> 120 bytes
 tests/data/acpi/pc/APIC.acpihmat  | Bin 128 -> 128 bytes
 tests/data/acpi/pc/APIC.cphp  | Bin 160 -> 160 bytes
 tests/data/acpi/pc/APIC.dimmpxm   | Bin 144 -> 144 bytes
 tests/data/acpi/q35/APIC  | Bin 120 -> 120 bytes
 tests/data/acpi/q35/APIC.acpihmat | Bin 128 -> 128 bytes
 tests/data/acpi/q35/APIC.acpihmat-noinitiator | Bin 144 -> 144 bytes
 tests/data/acpi/q35/APIC.core-count2  | Bin 2478 -> 2478 bytes
 

Re: [PATCH v2 3/6] virtio-net: Added property to load eBPF RSS with fds.

2023-05-16 Thread Eric Blake


On Mon, May 15, 2023 at 10:38:44AM +0100, Daniel P. Berrangé wrote:
> 
> > -static bool virtio_net_load_ebpf(VirtIONet *n)
> > +static bool virtio_net_load_ebpf_fds(VirtIONet *n, Error **errp)
> >  {
> > -if (!virtio_net_attach_ebpf_to_backend(n->nic, -1)) {
> > -/* backend does't support steering ebpf */
> > -return false;
> > +int fds[EBPF_RSS_MAX_FDS] = { [0 ... EBPF_RSS_MAX_FDS - 1] = -1};
> 
> Interesting, I didn't realize this initialization syntax was possible !

It's not standard, but if both gcc and clang support it, it would not
be the first time we've relied on useful compiler extensions.
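
For anyone unfamiliar with it, that is the GNU C "range designator", which
fills a whole span of elements in one initializer.  A tiny self-contained
example, nothing QEMU-specific:

    #include <stdio.h>

    #define N_FDS 4

    int main(void)
    {
        /* gcc/clang extension, not ISO C: every element starts at -1. */
        int fds[N_FDS] = { [0 ... N_FDS - 1] = -1 };

        for (int i = 0; i < N_FDS; i++) {
            printf("fds[%d] = %d\n", i, fds[i]);
        }
        return 0;
    }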

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org




Re: [PULL 3/4] 9pfs/xen: Fix segfault on shutdown

2023-05-16 Thread Michael Tokarev

16.05.2023 18:21, Christian Schoenebeck wrote:

From: Jason Andryuk 

xen_9pfs_free can't use gnttabdev since it is already closed and NULL-ed
out when free is called.  Do the teardown in _disconnect().  This
matches the setup done in _connect().

trace-events are also added for the XenDevOps functions.


This, while somewhat big, still smells like stable-8.0 (and stable-7.2)
material.

/mjt



[PATCH v4 06/10] hw/arm/smmuv3: Make TLB lookup work for stage-2

2023-05-16 Thread Mostafa Saleh
Right now, either stage-1 or stage-2 is supported; this simplifies
how we deal with TLBs.
This patch makes TLB lookup work if stage-2 is enabled instead of
stage-1.
TLB lookup is done before a PTW; if a valid entry is found we won't
do the PTW.
To be able to do the TLB lookup, we need the correct tagging info,
such as granularity and input size, so we get these based on the
supported translation stage. The TLB entries are added correctly from
each stage's PTW.

When nested translation is supported, this would need to change, for
example if we go with a combined TLB implementation, we would need to
use the min of the granularities in TLB.

As stage-2 shouldn't be tagged by ASID, it will be set to -1 if S1P
is not enabled.

Signed-off-by: Mostafa Saleh 
Reviewed-by: Eric Auger 
---
Changes in v3:
- Rename temp to tt_combined and move to top.
- Collected Reviewed-by tag.
Changes in v2:
- check if S1 is enabled(not supported) when reading S1 TT.
---
 hw/arm/smmuv3.c | 44 +---
 1 file changed, 33 insertions(+), 11 deletions(-)

diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 27840f2d66..a6714e0420 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -720,6 +720,9 @@ static int smmuv3_decode_config(IOMMUMemoryRegion *mr, 
SMMUTransCfg *cfg,
 STE ste;
 CD cd;
 
+/* ASID defaults to -1 (if s1 is not supported). */
+cfg->asid = -1;
+
 ret = smmu_find_ste(s, sid, &ste, event);
 if (ret) {
 return ret;
@@ -817,6 +820,11 @@ static IOMMUTLBEntry smmuv3_translate(IOMMUMemoryRegion 
*mr, hwaddr addr,
 .addr_mask = ~(hwaddr)0,
 .perm = IOMMU_NONE,
 };
+/*
+ * Combined attributes used for TLB lookup, as only one stage is supported,
+ * it will hold attributes based on the enabled stage.
+ */
+SMMUTransTableInfo tt_combined;
 
 qemu_mutex_lock(&s->mutex);
 
@@ -845,21 +853,35 @@ static IOMMUTLBEntry smmuv3_translate(IOMMUMemoryRegion 
*mr, hwaddr addr,
 goto epilogue;
 }
 
-tt = select_tt(cfg, addr);
-if (!tt) {
-if (cfg->record_faults) {
-event.type = SMMU_EVT_F_TRANSLATION;
-event.u.f_translation.addr = addr;
-event.u.f_translation.rnw = flag & 0x1;
+if (cfg->stage == 1) {
+/* Select stage1 translation table. */
+tt = select_tt(cfg, addr);
+if (!tt) {
+if (cfg->record_faults) {
+event.type = SMMU_EVT_F_TRANSLATION;
+event.u.f_translation.addr = addr;
+event.u.f_translation.rnw = flag & 0x1;
+}
+status = SMMU_TRANS_ERROR;
+goto epilogue;
 }
-status = SMMU_TRANS_ERROR;
-goto epilogue;
-}
+tt_combined.granule_sz = tt->granule_sz;
+tt_combined.tsz = tt->tsz;
 
-page_mask = (1ULL << (tt->granule_sz)) - 1;
+} else {
+/* Stage2. */
+tt_combined.granule_sz = cfg->s2cfg.granule_sz;
+tt_combined.tsz = cfg->s2cfg.tsz;
+}
+/*
+ * TLB lookup looks for granule and input size for a translation stage,
+ * as only one stage is supported right now, choose the right values
+ * from the configuration.
+ */
+page_mask = (1ULL << tt_combined.granule_sz) - 1;
 aligned_addr = addr & ~page_mask;
 
-cached_entry = smmu_iotlb_lookup(bs, cfg, tt, aligned_addr);
+cached_entry = smmu_iotlb_lookup(bs, cfg, &tt_combined, aligned_addr);
 if (cached_entry) {
 if ((flag & IOMMU_WO) && !(cached_entry->entry.perm & IOMMU_WO)) {
 status = SMMU_TRANS_ERROR;
-- 
2.40.1.606.ga4b1b128d6-goog




[PATCH v4 08/10] hw/arm/smmuv3: Add CMDs related to stage-2

2023-05-16 Thread Mostafa Saleh
CMD_TLBI_S2_IPA: As S1+S2 is not enabled, for now this can be the
same as CMD_TLBI_NH_VAA.

CMD_TLBI_S12_VMALL: Added new function to invalidate TLB by VMID.

For stage-1 only commands, add a check to throw CERROR_ILL if used
when stage-1 is not supported.

Reviewed-by: Eric Auger 
Signed-off-by: Mostafa Saleh 
---
Changes im v4:
- Collected Reviewed-by tag
- Add SMMU_CMD_TLBI_S12_VMALL in a block
Changes in v3:
- Log guest error for all illegal commands.
Changes in v2:
- Add checks for stage-1 only commands
- Rename smmuv3_s1_range_inval to smmuv3_range_inval
---
 hw/arm/smmu-common.c | 16 +++
 hw/arm/smmuv3.c  | 55 ++--
 hw/arm/trace-events  |  4 ++-
 include/hw/arm/smmu-common.h |  1 +
 4 files changed, 67 insertions(+), 9 deletions(-)

diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
index 6109beaa70..5ab9d45d58 100644
--- a/hw/arm/smmu-common.c
+++ b/hw/arm/smmu-common.c
@@ -135,6 +135,16 @@ static gboolean smmu_hash_remove_by_asid(gpointer key, 
gpointer value,
 
 return SMMU_IOTLB_ASID(*iotlb_key) == asid;
 }
+
+static gboolean smmu_hash_remove_by_vmid(gpointer key, gpointer value,
+ gpointer user_data)
+{
+uint16_t vmid = *(uint16_t *)user_data;
+SMMUIOTLBKey *iotlb_key = (SMMUIOTLBKey *)key;
+
+return SMMU_IOTLB_VMID(*iotlb_key) == vmid;
+}
+
 static gboolean smmu_hash_remove_by_asid_vmid_iova(gpointer key, gpointer 
value,
   gpointer user_data)
 {
@@ -187,6 +197,12 @@ void smmu_iotlb_inv_asid(SMMUState *s, uint16_t asid)
 g_hash_table_foreach_remove(s->iotlb, smmu_hash_remove_by_asid, &asid);
 }
 
+inline void smmu_iotlb_inv_vmid(SMMUState *s, uint16_t vmid)
+{
+trace_smmu_iotlb_inv_vmid(vmid);
+g_hash_table_foreach_remove(s->iotlb, smmu_hash_remove_by_vmid, &vmid);
+}
+
 /* VMSAv8-64 Translation */
 
 /**
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 64284395c2..3643befc9e 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -1062,7 +1062,7 @@ static void smmuv3_inv_notifiers_iova(SMMUState *s, int 
asid, dma_addr_t iova,
 }
 }
 
-static void smmuv3_s1_range_inval(SMMUState *s, Cmd *cmd)
+static void smmuv3_range_inval(SMMUState *s, Cmd *cmd)
 {
 dma_addr_t end, addr = CMD_ADDR(cmd);
 uint8_t type = CMD_TYPE(cmd);
@@ -1087,7 +1087,7 @@ static void smmuv3_s1_range_inval(SMMUState *s, Cmd *cmd)
 }
 
 if (!tg) {
-trace_smmuv3_s1_range_inval(vmid, asid, addr, tg, 1, ttl, leaf);
+trace_smmuv3_range_inval(vmid, asid, addr, tg, 1, ttl, leaf);
 smmuv3_inv_notifiers_iova(s, asid, addr, tg, 1);
 smmu_iotlb_inv_iova(s, asid, vmid, addr, tg, 1, ttl);
 return;
@@ -1105,7 +1105,7 @@ static void smmuv3_s1_range_inval(SMMUState *s, Cmd *cmd)
 uint64_t mask = dma_aligned_pow2_mask(addr, end, 64);
 
 num_pages = (mask + 1) >> granule;
-trace_smmuv3_s1_range_inval(vmid, asid, addr, tg, num_pages, ttl, 
leaf);
+trace_smmuv3_range_inval(vmid, asid, addr, tg, num_pages, ttl, leaf);
 smmuv3_inv_notifiers_iova(s, asid, addr, tg, num_pages);
 smmu_iotlb_inv_iova(s, asid, vmid, addr, tg, num_pages, ttl);
 addr += mask + 1;
@@ -1239,12 +1239,22 @@ static int smmuv3_cmdq_consume(SMMUv3State *s)
 {
 uint16_t asid = CMD_ASID(&cmd);
 
+if (!STAGE1_SUPPORTED(s)) {
+cmd_error = SMMU_CERROR_ILL;
+break;
+}
+
 trace_smmuv3_cmdq_tlbi_nh_asid(asid);
 smmu_inv_notifiers_all(>smmu_state);
 smmu_iotlb_inv_asid(bs, asid);
 break;
 }
 case SMMU_CMD_TLBI_NH_ALL:
+if (!STAGE1_SUPPORTED(s)) {
+cmd_error = SMMU_CERROR_ILL;
+break;
+}
+QEMU_FALLTHROUGH;
 case SMMU_CMD_TLBI_NSNH_ALL:
 trace_smmuv3_cmdq_tlbi_nh();
 smmu_inv_notifiers_all(>smmu_state);
@@ -1252,7 +1262,36 @@ static int smmuv3_cmdq_consume(SMMUv3State *s)
 break;
 case SMMU_CMD_TLBI_NH_VAA:
 case SMMU_CMD_TLBI_NH_VA:
-smmuv3_s1_range_inval(bs, &cmd);
+if (!STAGE1_SUPPORTED(s)) {
+cmd_error = SMMU_CERROR_ILL;
+break;
+}
+smmuv3_range_inval(bs, &cmd);
+break;
+case SMMU_CMD_TLBI_S12_VMALL:
+{
+uint16_t vmid = CMD_VMID(&cmd);
+
+if (!STAGE2_SUPPORTED(s)) {
+cmd_error = SMMU_CERROR_ILL;
+break;
+}
+
+trace_smmuv3_cmdq_tlbi_s12_vmid(vmid);
+smmu_inv_notifiers_all(>smmu_state);
+smmu_iotlb_inv_vmid(bs, vmid);
+break;
+}
+case SMMU_CMD_TLBI_S2_IPA:
+if (!STAGE2_SUPPORTED(s)) {
+cmd_error = SMMU_CERROR_ILL;
+break;
+}
+/*
+ * 

[PATCH v4 03/10] hw/arm/smmuv3: Refactor stage-1 PTW

2023-05-16 Thread Mostafa Saleh
In preparation for adding stage-2 support, rename smmu_ptw_64 to
smmu_ptw_64_s1 and refactor some of the code so it can be reused in
stage-2 page table walk.

Remove AA64 check from PTW as decode_cd already ensures that AA64 is
used, otherwise it faults with C_BAD_CD.

A stage member is added to SMMUPTWEventInfo to differentiate
between stage-1 and stage-2 ptw faults.

Add a stage argument to trace_smmu_ptw_level to be consistent with
other trace events.

Signed-off-by: Mostafa Saleh 
Reviewed-by: Eric Auger 
---
Changes in v3:
- Collected Reviewed-by tag
- Rename translation consts and macros from SMMU_* to VMSA_*
Changes in v2:
- Refactor common functions to be use in stage-2.
- Add stage to SMMUPTWEventInfo.
- Remove AA64 check.
---
 hw/arm/smmu-common.c | 27 ++-
 hw/arm/smmuv3.c  |  2 ++
 hw/arm/trace-events  |  2 +-
 include/hw/arm/smmu-common.h | 16 +---
 4 files changed, 26 insertions(+), 21 deletions(-)

diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
index e7f1c1f219..50391a8c94 100644
--- a/hw/arm/smmu-common.c
+++ b/hw/arm/smmu-common.c
@@ -264,7 +264,7 @@ SMMUTransTableInfo *select_tt(SMMUTransCfg *cfg, dma_addr_t 
iova)
 }
 
 /**
- * smmu_ptw_64 - VMSAv8-64 Walk of the page tables for a given IOVA
+ * smmu_ptw_64_s1 - VMSAv8-64 Walk of the page tables for a given IOVA
  * @cfg: translation config
  * @iova: iova to translate
  * @perm: access type
@@ -276,9 +276,9 @@ SMMUTransTableInfo *select_tt(SMMUTransCfg *cfg, dma_addr_t 
iova)
  * Upon success, @tlbe is filled with translated_addr and entry
  * permission rights.
  */
-static int smmu_ptw_64(SMMUTransCfg *cfg,
-   dma_addr_t iova, IOMMUAccessFlags perm,
-   SMMUTLBEntry *tlbe, SMMUPTWEventInfo *info)
+static int smmu_ptw_64_s1(SMMUTransCfg *cfg,
+  dma_addr_t iova, IOMMUAccessFlags perm,
+  SMMUTLBEntry *tlbe, SMMUPTWEventInfo *info)
 {
 dma_addr_t baseaddr, indexmask;
 int stage = cfg->stage;
@@ -291,14 +291,14 @@ static int smmu_ptw_64(SMMUTransCfg *cfg,
 }
 
 granule_sz = tt->granule_sz;
-stride = granule_sz - 3;
+stride = VMSA_STRIDE(granule_sz);
 inputsize = 64 - tt->tsz;
 level = 4 - (inputsize - 4) / stride;
-indexmask = (1ULL << (inputsize - (stride * (4 - level - 1;
+indexmask = VMSA_IDXMSK(inputsize, stride, level);
 baseaddr = extract64(tt->ttb, 0, 48);
 baseaddr &= ~indexmask;
 
-while (level <= 3) {
+while (level < VMSA_LEVELS) {
 uint64_t subpage_size = 1ULL << level_shift(level, granule_sz);
 uint64_t mask = subpage_size - 1;
 uint32_t offset = iova_level_offset(iova, inputsize, level, 
granule_sz);
@@ -309,7 +309,7 @@ static int smmu_ptw_64(SMMUTransCfg *cfg,
 if (get_pte(baseaddr, offset, , info)) {
 goto error;
 }
-trace_smmu_ptw_level(level, iova, subpage_size,
+trace_smmu_ptw_level(stage, level, iova, subpage_size,
  baseaddr, offset, pte);
 
 if (is_invalid_pte(pte) || is_reserved_pte(pte, level)) {
@@ -358,6 +358,7 @@ static int smmu_ptw_64(SMMUTransCfg *cfg,
 info->type = SMMU_PTW_ERR_TRANSLATION;
 
 error:
+info->stage = 1;
 tlbe->entry.perm = IOMMU_NONE;
 return -EINVAL;
 }
@@ -376,15 +377,7 @@ error:
 int smmu_ptw(SMMUTransCfg *cfg, dma_addr_t iova, IOMMUAccessFlags perm,
  SMMUTLBEntry *tlbe, SMMUPTWEventInfo *info)
 {
-if (!cfg->aa64) {
-/*
- * This code path is not entered as we check this while decoding
- * the configuration data in the derived SMMU model.
- */
-g_assert_not_reached();
-}
-
-return smmu_ptw_64(cfg, iova, perm, tlbe, info);
+return smmu_ptw_64_s1(cfg, iova, perm, tlbe, info);
 }
 
 /**
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 270c80b665..4e90343996 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -716,6 +716,8 @@ static IOMMUTLBEntry smmuv3_translate(IOMMUMemoryRegion 
*mr, hwaddr addr,
 cached_entry = g_new0(SMMUTLBEntry, 1);
 
 if (smmu_ptw(cfg, aligned_addr, flag, cached_entry, &ptw_info)) {
+/* All faults from PTW has S2 field. */
+event.u.f_walk_eabt.s2 = (ptw_info.stage == 2);
 g_free(cached_entry);
 switch (ptw_info.type) {
 case SMMU_PTW_ERR_WALK_EABT:
diff --git a/hw/arm/trace-events b/hw/arm/trace-events
index 2dee296c8f..205ac04573 100644
--- a/hw/arm/trace-events
+++ b/hw/arm/trace-events
@@ -5,7 +5,7 @@ virt_acpi_setup(void) "No fw cfg or ACPI disabled. Bailing out."
 
 # smmu-common.c
 smmu_add_mr(const char *name) "%s"
-smmu_ptw_level(int level, uint64_t iova, size_t subpage_size, uint64_t 
baseaddr, uint32_t offset, uint64_t pte) "level=%d iova=0x%"PRIx64" 
subpage_sz=0x%zx baseaddr=0x%"PRIx64" offset=%d => pte=0x%"PRIx64
+smmu_ptw_level(int stage, int level, uint64_t iova, size_t subpage_size, 
uint64_t baseaddr, 

[PATCH v4 09/10] hw/arm/smmuv3: Add stage-2 support in iova notifier

2023-05-16 Thread Mostafa Saleh
In smmuv3_notify_iova, read the granule based on the translation stage
and use the VMID if a valid value is sent.

Signed-off-by: Mostafa Saleh 
Reviewed-by: Eric Auger 
---
Changes in v3:
- Collected Reviewed-by tag.
---
 hw/arm/smmuv3.c | 39 ++-
 hw/arm/trace-events |  2 +-
 2 files changed, 27 insertions(+), 14 deletions(-)

diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 3643befc9e..17e1359be4 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -999,18 +999,21 @@ epilogue:
  * @mr: IOMMU mr region handle
  * @n: notifier to be called
  * @asid: address space ID or negative value if we don't care
+ * @vmid: virtual machine ID or negative value if we don't care
  * @iova: iova
  * @tg: translation granule (if communicated through range invalidation)
  * @num_pages: number of @granule sized pages (if tg != 0), otherwise 1
  */
 static void smmuv3_notify_iova(IOMMUMemoryRegion *mr,
IOMMUNotifier *n,
-   int asid, dma_addr_t iova,
-   uint8_t tg, uint64_t num_pages)
+   int asid, int vmid,
+   dma_addr_t iova, uint8_t tg,
+   uint64_t num_pages)
 {
 SMMUDevice *sdev = container_of(mr, SMMUDevice, iommu);
 IOMMUTLBEvent event;
 uint8_t granule;
+SMMUv3State *s = sdev->smmu;
 
 if (!tg) {
 SMMUEventInfo event = {.inval_ste_allowed = true};
@@ -1025,11 +1028,20 @@ static void smmuv3_notify_iova(IOMMUMemoryRegion *mr,
 return;
 }
 
-tt = select_tt(cfg, iova);
-if (!tt) {
+if (vmid >= 0 && cfg->s2cfg.vmid != vmid) {
 return;
 }
-granule = tt->granule_sz;
+
+if (STAGE1_SUPPORTED(s)) {
+tt = select_tt(cfg, iova);
+if (!tt) {
+return;
+}
+granule = tt->granule_sz;
+} else {
+granule = cfg->s2cfg.granule_sz;
+}
+
 } else {
 granule = tg * 2 + 10;
 }
@@ -1043,9 +1055,10 @@ static void smmuv3_notify_iova(IOMMUMemoryRegion *mr,
 memory_region_notify_iommu_one(n, );
 }
 
-/* invalidate an asid/iova range tuple in all mr's */
-static void smmuv3_inv_notifiers_iova(SMMUState *s, int asid, dma_addr_t iova,
-  uint8_t tg, uint64_t num_pages)
+/* invalidate an asid/vmid/iova range tuple in all mr's */
+static void smmuv3_inv_notifiers_iova(SMMUState *s, int asid, int vmid,
+  dma_addr_t iova, uint8_t tg,
+  uint64_t num_pages)
 {
 SMMUDevice *sdev;
 
@@ -1053,11 +1066,11 @@ static void smmuv3_inv_notifiers_iova(SMMUState *s, int 
asid, dma_addr_t iova,
 IOMMUMemoryRegion *mr = >iommu;
 IOMMUNotifier *n;
 
-trace_smmuv3_inv_notifiers_iova(mr->parent_obj.name, asid, iova,
-tg, num_pages);
+trace_smmuv3_inv_notifiers_iova(mr->parent_obj.name, asid, vmid,
+iova, tg, num_pages);
 
 IOMMU_NOTIFIER_FOREACH(n, mr) {
-smmuv3_notify_iova(mr, n, asid, iova, tg, num_pages);
+smmuv3_notify_iova(mr, n, asid, vmid, iova, tg, num_pages);
 }
 }
 }
@@ -1088,7 +1101,7 @@ static void smmuv3_range_inval(SMMUState *s, Cmd *cmd)
 
 if (!tg) {
 trace_smmuv3_range_inval(vmid, asid, addr, tg, 1, ttl, leaf);
-smmuv3_inv_notifiers_iova(s, asid, addr, tg, 1);
+smmuv3_inv_notifiers_iova(s, asid, vmid, addr, tg, 1);
 smmu_iotlb_inv_iova(s, asid, vmid, addr, tg, 1, ttl);
 return;
 }
@@ -1106,7 +1119,7 @@ static void smmuv3_range_inval(SMMUState *s, Cmd *cmd)
 
 num_pages = (mask + 1) >> granule;
 trace_smmuv3_range_inval(vmid, asid, addr, tg, num_pages, ttl, leaf);
-smmuv3_inv_notifiers_iova(s, asid, addr, tg, num_pages);
+smmuv3_inv_notifiers_iova(s, asid, vmid, addr, tg, num_pages);
 smmu_iotlb_inv_iova(s, asid, vmid, addr, tg, num_pages, ttl);
 addr += mask + 1;
 }
diff --git a/hw/arm/trace-events b/hw/arm/trace-events
index f8fdf1ca9f..cdc1ea06a8 100644
--- a/hw/arm/trace-events
+++ b/hw/arm/trace-events
@@ -53,5 +53,5 @@ smmuv3_cmdq_tlbi_s12_vmid(uint16_t vmid) "vmid=%d"
 smmuv3_config_cache_inv(uint32_t sid) "Config cache INV for sid=0x%x"
 smmuv3_notify_flag_add(const char *iommu) "ADD SMMUNotifier node for iommu 
mr=%s"
 smmuv3_notify_flag_del(const char *iommu) "DEL SMMUNotifier node for iommu 
mr=%s"
-smmuv3_inv_notifiers_iova(const char *name, uint16_t asid, uint64_t iova, 
uint8_t tg, uint64_t num_pages) "iommu mr=%s asid=%d iova=0x%"PRIx64" tg=%d 
num_pages=0x%"PRIx64
+smmuv3_inv_notifiers_iova(const char *name, uint16_t asid, uint16_t vmid, 
uint64_t iova, uint8_t tg, uint64_t num_pages) "iommu mr=%s asid=%d vmid=%d 
iova=0x%"PRIx64" tg=%d num_pages=0x%"PRIx64

[PATCH v4 07/10] hw/arm/smmuv3: Add VMID to TLB tagging

2023-05-16 Thread Mostafa Saleh
Allow the TLB to be tagged with a VMID.

If only stage-1 is supported, the VMID is set to -1 and ignored in the
STE and the CMD_TLBI_NH* commands.

Update the smmu_iotlb_insert trace event to include the VMID.

Signed-off-by: Mostafa Saleh 
Reviewed-by: Eric Auger 
---
Changes in v3:
- Collected Reviewed-by tag.
Changes in v2:
-Fix TLB aliasing issue from missing check in smmu_iotlb_key_equal.
-Add vmid to traces smmu_iotlb_insert and smmu_iotlb_lookup_hit/miss.
-Add vmid to hash function.
---
 hw/arm/smmu-common.c | 36 ++--
 hw/arm/smmu-internal.h   |  2 ++
 hw/arm/smmuv3.c  | 12 +---
 hw/arm/trace-events  |  6 +++---
 include/hw/arm/smmu-common.h |  5 +++--
 5 files changed, 39 insertions(+), 22 deletions(-)

diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
index 3e82eab741..6109beaa70 100644
--- a/hw/arm/smmu-common.c
+++ b/hw/arm/smmu-common.c
@@ -38,7 +38,7 @@ static guint smmu_iotlb_key_hash(gconstpointer v)
 
 /* Jenkins hash */
 a = b = c = JHASH_INITVAL + sizeof(*key);
-a += key->asid + key->level + key->tg;
+a += key->asid + key->vmid + key->level + key->tg;
 b += extract64(key->iova, 0, 32);
 c += extract64(key->iova, 32, 32);
 
@@ -53,13 +53,15 @@ static gboolean smmu_iotlb_key_equal(gconstpointer v1, 
gconstpointer v2)
 SMMUIOTLBKey *k1 = (SMMUIOTLBKey *)v1, *k2 = (SMMUIOTLBKey *)v2;
 
 return (k1->asid == k2->asid) && (k1->iova == k2->iova) &&
-   (k1->level == k2->level) && (k1->tg == k2->tg);
+   (k1->level == k2->level) && (k1->tg == k2->tg) &&
+   (k1->vmid == k2->vmid);
 }
 
-SMMUIOTLBKey smmu_get_iotlb_key(uint16_t asid, uint64_t iova,
+SMMUIOTLBKey smmu_get_iotlb_key(uint16_t asid, uint16_t vmid, uint64_t iova,
 uint8_t tg, uint8_t level)
 {
-SMMUIOTLBKey key = {.asid = asid, .iova = iova, .tg = tg, .level = level};
+SMMUIOTLBKey key = {.asid = asid, .vmid = vmid, .iova = iova,
+.tg = tg, .level = level};
 
 return key;
 }
@@ -78,7 +80,8 @@ SMMUTLBEntry *smmu_iotlb_lookup(SMMUState *bs, SMMUTransCfg 
*cfg,
 uint64_t mask = subpage_size - 1;
 SMMUIOTLBKey key;
 
-key = smmu_get_iotlb_key(cfg->asid, iova & ~mask, tg, level);
+key = smmu_get_iotlb_key(cfg->asid, cfg->s2cfg.vmid,
+ iova & ~mask, tg, level);
 entry = g_hash_table_lookup(bs->iotlb, );
 if (entry) {
 break;
@@ -88,13 +91,13 @@ SMMUTLBEntry *smmu_iotlb_lookup(SMMUState *bs, SMMUTransCfg 
*cfg,
 
 if (entry) {
 cfg->iotlb_hits++;
-trace_smmu_iotlb_lookup_hit(cfg->asid, iova,
+trace_smmu_iotlb_lookup_hit(cfg->asid, cfg->s2cfg.vmid, iova,
 cfg->iotlb_hits, cfg->iotlb_misses,
 100 * cfg->iotlb_hits /
 (cfg->iotlb_hits + cfg->iotlb_misses));
 } else {
 cfg->iotlb_misses++;
-trace_smmu_iotlb_lookup_miss(cfg->asid, iova,
+trace_smmu_iotlb_lookup_miss(cfg->asid, cfg->s2cfg.vmid, iova,
  cfg->iotlb_hits, cfg->iotlb_misses,
  100 * cfg->iotlb_hits /
  (cfg->iotlb_hits + cfg->iotlb_misses));
@@ -111,8 +114,10 @@ void smmu_iotlb_insert(SMMUState *bs, SMMUTransCfg *cfg, 
SMMUTLBEntry *new)
 smmu_iotlb_inv_all(bs);
 }
 
-*key = smmu_get_iotlb_key(cfg->asid, new->entry.iova, tg, new->level);
-trace_smmu_iotlb_insert(cfg->asid, new->entry.iova, tg, new->level);
+*key = smmu_get_iotlb_key(cfg->asid, cfg->s2cfg.vmid, new->entry.iova,
+  tg, new->level);
+trace_smmu_iotlb_insert(cfg->asid, cfg->s2cfg.vmid, new->entry.iova,
+tg, new->level);
 g_hash_table_insert(bs->iotlb, key, new);
 }
 
@@ -130,8 +135,7 @@ static gboolean smmu_hash_remove_by_asid(gpointer key, 
gpointer value,
 
 return SMMU_IOTLB_ASID(*iotlb_key) == asid;
 }
-
-static gboolean smmu_hash_remove_by_asid_iova(gpointer key, gpointer value,
+static gboolean smmu_hash_remove_by_asid_vmid_iova(gpointer key, gpointer 
value,
   gpointer user_data)
 {
 SMMUTLBEntry *iter = (SMMUTLBEntry *)value;
@@ -142,18 +146,21 @@ static gboolean smmu_hash_remove_by_asid_iova(gpointer 
key, gpointer value,
 if (info->asid >= 0 && info->asid != SMMU_IOTLB_ASID(iotlb_key)) {
 return false;
 }
+if (info->vmid >= 0 && info->vmid != SMMU_IOTLB_VMID(iotlb_key)) {
+return false;
+}
 return ((info->iova & ~entry->addr_mask) == entry->iova) ||
((entry->iova & ~info->mask) == info->iova);
 }
 
-void smmu_iotlb_inv_iova(SMMUState *s, int asid, dma_addr_t iova,
+void smmu_iotlb_inv_iova(SMMUState *s, int asid, int vmid, dma_addr_t iova,
  uint8_t tg, uint64_t num_pages, 

[PATCH v4 04/10] hw/arm/smmuv3: Add page table walk for stage-2

2023-05-16 Thread Mostafa Saleh
In preparation for adding stage-2 support, add the stage-2 PTW code.
Only the AArch64 format is supported, as with stage-1.

Nesting stage-1 and stage-2 is not supported right now.

HTTU is not supported; SW is expected to maintain the Access flag.
This is described in the SMMUv3 manual (IHI 0070.E.a)
"5.2. Stream Table Entry" in "[181] S2AFFD".
This flag determines the behavior on access of a stage-2 page whose
descriptor has AF == 0:
- 0b0: An Access flag fault occurs (stall not supported).
- 0b1: An Access flag fault never occurs.
An Access fault takes priority over a Permission fault.
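
A minimal standalone sketch of the S2AFFD behaviour described above
(illustration only, not part of the patch; the helper name is made up):

    #include <stdbool.h>
    #include <stdio.h>

    /* Decide whether a stage-2 Access flag fault is raised, given the
     * descriptor's AF bit and the STE's S2AFFD field. */
    static bool s2_access_fault(bool pte_af, bool s2affd)
    {
        return !pte_af && !s2affd;
    }

    int main(void)
    {
        printf("%d\n", s2_access_fault(false, false)); /* 1: AF == 0, fault */
        printf("%d\n", s2_access_fault(false, true));  /* 0: S2AFFD disables it */
        printf("%d\n", s2_access_fault(true, false));  /* 0: AF already set */
        return 0;
    }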

There are 3 address size checks for stage-2, according to
(IHI 0070.E.a) "3.4. Address sizes":
- As nesting is not supported, the input address is passed directly to
stage-2 and is checked against the IAS.
We use cfg->oas to hold the OAS when stage-1 is not used; this is set
in the next patch.
This check is done outside of smmu_ptw_64_s2 as it is not part of
stage-2 (it raises a stage-1 fault), and the stage-2 function shouldn't
change its behavior when nesting is supported.
When nesting is supported and we figure out how to combine the TLBs for
stage-1 and stage-2, we can move this check into the stage-1 function,
as described in the ARM DDI0487I.a pseudocode
aarch64/translation/vmsa_translation/AArch64.S1Translate
aarch64/translation/vmsa_translation/AArch64.S1DisabledOutput

- The input to stage-2 is checked against s2t0sz, and a stage-2
translation fault is raised if it exceeds it (see the sketch after
this list).

- Output of stage-2 is checked against effective PA output range.
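
A minimal standalone sketch of the s2t0sz input check referenced above,
assuming the same 64 - S2T0SZ input-size computation used in the patch
(illustration only; the helper name is made up):

    #include <stdint.h>
    #include <stdio.h>

    /* An IPA is in range if it fits in the 64 - S2T0SZ bit input region.
     * Valid S2T0SZ values keep inputsize well below 64, so the shift is safe. */
    static int s2_ipa_in_range(uint64_t ipa, uint8_t s2t0sz)
    {
        int inputsize = 64 - s2t0sz;
        return ipa < (1ULL << inputsize);
    }

    int main(void)
    {
        /* S2T0SZ = 24 gives a 40-bit (1 TiB) IPA space. */
        printf("%d\n", s2_ipa_in_range(0xffffffffffULL, 24)); /* 1: inside  */
        printf("%d\n", s2_ipa_in_range(1ULL << 40, 24));      /* 0: outside */
        return 0;
    }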

Reviewed-by: Eric Auger 
Signed-off-by: Mostafa Saleh 
---
Changes in v4:
- Collected Reviewed-by tag
- s/IHI 0070.E/IHI 0070.E.a
Changes in v3:
- Fix IPA address size check.
- s2cfg.oas renamed to s2cfg.eff_ps.
- s/iova/ipa
- s/ap/s2ap
- s/gran/granule_sz
- use level_shift instead of inline code.
- Add missing brackets in is_permission_fault_s2.
- Use new VMSA_* macros and functions instead of SMMU_*
- Rename pgd_idx to pgd_concat_idx.
- Move SMMU_MAX_S2_CONCAT to STE patch as it is not used here.
Changes in v2:
- Squash S2AFF PTW code.
- Use common functions between stage-1 and stage-2.
- Add checks for IPA and out PA.
---
 hw/arm/smmu-common.c   | 142 -
 hw/arm/smmu-internal.h |  35 ++
 2 files changed, 176 insertions(+), 1 deletion(-)

diff --git a/hw/arm/smmu-common.c b/hw/arm/smmu-common.c
index 50391a8c94..3e82eab741 100644
--- a/hw/arm/smmu-common.c
+++ b/hw/arm/smmu-common.c
@@ -363,6 +363,127 @@ error:
 return -EINVAL;
 }
 
+/**
+ * smmu_ptw_64_s2 - VMSAv8-64 Walk of the page tables for a given ipa
+ * for stage-2.
+ * @cfg: translation config
+ * @ipa: ipa to translate
+ * @perm: access type
+ * @tlbe: SMMUTLBEntry (out)
+ * @info: handle to an error info
+ *
+ * Return 0 on success, < 0 on error. In case of error, @info is filled
+ * and tlbe->perm is set to IOMMU_NONE.
+ * Upon success, @tlbe is filled with translated_addr and entry
+ * permission rights.
+ */
+static int smmu_ptw_64_s2(SMMUTransCfg *cfg,
+  dma_addr_t ipa, IOMMUAccessFlags perm,
+  SMMUTLBEntry *tlbe, SMMUPTWEventInfo *info)
+{
+const int stage = 2;
+int granule_sz = cfg->s2cfg.granule_sz;
+/* ARM DDI0487I.a: Table D8-7. */
+int inputsize = 64 - cfg->s2cfg.tsz;
+int level = get_start_level(cfg->s2cfg.sl0, granule_sz);
+int stride = VMSA_STRIDE(granule_sz);
+int idx = pgd_concat_idx(level, granule_sz, ipa);
+/*
+ * Get the ttb from concatenated structure.
+ * The offset is the idx * size of each ttb(number of ptes * (sizeof(pte))
+ */
+uint64_t baseaddr = extract64(cfg->s2cfg.vttb, 0, 48) + (1 << stride) *
+  idx * sizeof(uint64_t);
+dma_addr_t indexmask = VMSA_IDXMSK(inputsize, stride, level);
+
+baseaddr &= ~indexmask;
+
+/*
+ * On input, a stage 2 Translation fault occurs if the IPA is outside the
+ * range configured by the relevant S2T0SZ field of the STE.
+ */
+if (ipa >= (1ULL << inputsize)) {
+info->type = SMMU_PTW_ERR_TRANSLATION;
+goto error;
+}
+
+while (level < VMSA_LEVELS) {
+uint64_t subpage_size = 1ULL << level_shift(level, granule_sz);
+uint64_t mask = subpage_size - 1;
+uint32_t offset = iova_level_offset(ipa, inputsize, level, granule_sz);
+uint64_t pte, gpa;
+dma_addr_t pte_addr = baseaddr + offset * sizeof(pte);
+uint8_t s2ap;
+
+if (get_pte(baseaddr, offset, , info)) {
+goto error;
+}
+trace_smmu_ptw_level(stage, level, ipa, subpage_size,
+ baseaddr, offset, pte);
+if (is_invalid_pte(pte) || is_reserved_pte(pte, level)) {
+trace_smmu_ptw_invalid_pte(stage, level, baseaddr,
+   pte_addr, offset, pte);
+break;
+}
+
+if (is_table_pte(pte, level)) {
+baseaddr = 

[PATCH v4 02/10] hw/arm/smmuv3: Update translation config to hold stage-2

2023-05-16 Thread Mostafa Saleh
In preparation for adding stage-2 support, add an S2 config
struct (SMMUS2Cfg), composed of the following fields and embedded in
the main SMMUTransCfg:
 -tsz: Size of IPA input region (S2T0SZ)
 -sl0: Start level of translation (S2SL0)
 -affd: AF Fault Disable (S2AFFD)
 -record_faults: Record fault events (S2R)
 -granule_sz: Granule page shift (based on S2TG)
 -vmid: Virtual Machine ID (S2VMID)
 -vttb: Address of translation table base (S2TTB)
 -eff_ps: Effective PA output range (based on S2PS)

They will be used in the next patches in stage-2 address translation.

The fields in SMMUTransCfg are reordered so that the shared and stage-1
fields sit next to each other; this reordering didn't change the struct
size (104 bytes before and after).

Stage-1 only fields: aa64, asid, tt, ttb, tbi, record_faults, oas.
oas is the stage-1 output address size; however, it is used to check
the input address when stage-1 is unimplemented or bypassed, according
to the SMMUv3 manual IHI 0070.E "3.4. Address sizes".

Shared fields: stage, disabled, bypassed, aborted, iotlb_*.

No functional change intended.

Reviewed-by: Eric Auger 
Signed-off-by: Mostafa Saleh 
---
Changes in v4:
-Collected Reviewed-by tag
Changes in v3:
-Add record_faults for stage-2
-Reorder and document fields in SMMUTransCfg based on stage
-Rename oas in SMMUS2Cfg to eff_ps
-Improve comments in SMMUS2Cfg
Changes in v2:
-Add oas
---
 include/hw/arm/smmu-common.h | 22 +++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/include/hw/arm/smmu-common.h b/include/hw/arm/smmu-common.h
index 9fcff26357..9cf3f37929 100644
--- a/include/hw/arm/smmu-common.h
+++ b/include/hw/arm/smmu-common.h
@@ -58,25 +58,41 @@ typedef struct SMMUTLBEntry {
 uint8_t granule;
 } SMMUTLBEntry;
 
+/* Stage-2 configuration. */
+typedef struct SMMUS2Cfg {
+uint8_t tsz;/* Size of IPA input region (S2T0SZ) */
+uint8_t sl0;/* Start level of translation (S2SL0) */
+bool affd;  /* AF Fault Disable (S2AFFD) */
+bool record_faults; /* Record fault events (S2R) */
+uint8_t granule_sz; /* Granule page shift (based on S2TG) */
+uint8_t eff_ps; /* Effective PA output range (based on S2PS) */
+uint16_t vmid;  /* Virtual Machine ID (S2VMID) */
+uint64_t vttb;  /* Address of translation table base (S2TTB) */
+} SMMUS2Cfg;
+
 /*
  * Generic structure populated by derived SMMU devices
  * after decoding the configuration information and used as
  * input to the page table walk
  */
 typedef struct SMMUTransCfg {
+/* Shared fields between stage-1 and stage-2. */
 int stage; /* translation stage */
-bool aa64; /* arch64 or aarch32 translation table */
 bool disabled; /* smmu is disabled */
 bool bypassed; /* translation is bypassed */
 bool aborted;  /* translation is aborted */
+uint32_t iotlb_hits;   /* counts IOTLB hits */
+uint32_t iotlb_misses; /* counts IOTLB misses*/
+/* Used by stage-1 only. */
+bool aa64; /* arch64 or aarch32 translation table */
 bool record_faults;/* record fault events */
 uint64_t ttb;  /* TT base address */
 uint8_t oas;   /* output address width */
 uint8_t tbi;   /* Top Byte Ignore */
 uint16_t asid;
 SMMUTransTableInfo tt[2];
-uint32_t iotlb_hits;   /* counts IOTLB hits for this asid */
-uint32_t iotlb_misses; /* counts IOTLB misses for this asid */
+/* Used by stage-2 only. */
+struct SMMUS2Cfg s2cfg;
 } SMMUTransCfg;
 
 typedef struct SMMUDevice {
-- 
2.40.1.606.ga4b1b128d6-goog




[PATCH v4 01/10] hw/arm/smmuv3: Add missing fields for IDR0

2023-05-16 Thread Mostafa Saleh
In preparation for adding stage-2 support, add the IDR0 fields
related to stage-2.

VMID16: 16-bit VMID supported.
S2P: Stage-2 translation supported.

They are described in 6.3.1 SMMU_IDR0.

No functional change intended.

Reviewed-by: Richard Henderson 
Reviewed-by: Eric Auger 
Signed-off-by: Mostafa Saleh 

---
Changes in V2:
- Collected Reviewed-by tags.
---
 hw/arm/smmuv3-internal.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
index e8f0ebf25e..183d5ac8dc 100644
--- a/hw/arm/smmuv3-internal.h
+++ b/hw/arm/smmuv3-internal.h
@@ -34,10 +34,12 @@ typedef enum SMMUTranslationStatus {
 /* MMIO Registers */
 
 REG32(IDR0,0x0)
+FIELD(IDR0, S2P, 0 , 1)
 FIELD(IDR0, S1P, 1 , 1)
 FIELD(IDR0, TTF, 2 , 2)
 FIELD(IDR0, COHACC,  4 , 1)
 FIELD(IDR0, ASID16,  12, 1)
+FIELD(IDR0, VMID16,  18, 1)
 FIELD(IDR0, TTENDIAN,21, 2)
 FIELD(IDR0, STALL_MODEL, 24, 2)
 FIELD(IDR0, TERM_MODEL,  26, 1)
-- 
2.40.1.606.ga4b1b128d6-goog




[PATCH v4 05/10] hw/arm/smmuv3: Parse STE config for stage-2

2023-05-16 Thread Mostafa Saleh
Parse the stage-2 configuration from the STE and populate it in SMMUS2Cfg.
The validity of field values is checked where possible.

Only AA64 tables are supported and Small Translation Tables (STT) are
not supported.

According to the SMMUv3 UM (IHI 0070.E) "5.2 Stream Table Entry": all fields
with an S2 prefix (with the exception of S2VMID) are IGNORED when
stage-2 bypasses translation (Config[1] == 0).

This means the VMID can be used (for TLB tagging) even if stage-2 is
bypassed, so we parse it unconditionally when S2P exists. Otherwise
it is set to -1 (stage-1 only, S1P).

As stall is not supported, the translation aborts if S2S is set.
For S2R, we reuse the same code used for stage-1 via the record_faults
flag. However, when nested translation is supported we will need to
separate stage-1 and stage-2 faults.

Fix wrong shift in STE_S2HD, STE_S2HA, STE_S2S.

Signed-off-by: Mostafa Saleh 
---
Changes in V4:
- Rename and simplify PTW_FAULT_ALLOWED
- Fix comment indent
Changes in V3:
- Separate fault handling.
- Fix shift in STE_S2HD, STE_S2HA, STE_S2S, STE_S2R.
- Rename t0sz_valid to s2t0sz_valid.
- separate stage-2 STE parsing in decode_ste_s2_cfg.
- Add a log for invalid S2ENDI and S2TTB.
- Set default value for stage-1 OAS.
- Move and rename SMMU_MAX_S2_CONCAT to VMSA_MAX_S2_CONCAT.
Changes in V2:
- Parse S2PS and S2ENDI
- Squash with S2VMID parsing patch
- Squash with S2AFF parsing
- Squash with fault reporting patch
- Add check for S2T0SZ
- Renaming and refactoring code
---
 hw/arm/smmuv3-internal.h |  10 +-
 hw/arm/smmuv3.c  | 181 +--
 include/hw/arm/smmu-common.h |   1 +
 include/hw/arm/smmuv3.h  |   3 +
 4 files changed, 185 insertions(+), 10 deletions(-)

diff --git a/hw/arm/smmuv3-internal.h b/hw/arm/smmuv3-internal.h
index 183d5ac8dc..6d1c1edab7 100644
--- a/hw/arm/smmuv3-internal.h
+++ b/hw/arm/smmuv3-internal.h
@@ -526,9 +526,13 @@ typedef struct CD {
 #define STE_S2TG(x)extract32((x)->word[5], 14, 2)
 #define STE_S2PS(x)extract32((x)->word[5], 16, 3)
 #define STE_S2AA64(x)  extract32((x)->word[5], 19, 1)
-#define STE_S2HD(x)extract32((x)->word[5], 24, 1)
-#define STE_S2HA(x)extract32((x)->word[5], 25, 1)
-#define STE_S2S(x) extract32((x)->word[5], 26, 1)
+#define STE_S2ENDI(x)  extract32((x)->word[5], 20, 1)
+#define STE_S2AFFD(x)  extract32((x)->word[5], 21, 1)
+#define STE_S2HD(x)extract32((x)->word[5], 23, 1)
+#define STE_S2HA(x)extract32((x)->word[5], 24, 1)
+#define STE_S2S(x) extract32((x)->word[5], 25, 1)
+#define STE_S2R(x) extract32((x)->word[5], 26, 1)
+
 #define STE_CTXPTR(x)   \
 ({  \
 unsigned long addr; \
diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 4e90343996..27840f2d66 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -33,6 +33,9 @@
 #include "smmuv3-internal.h"
 #include "smmu-internal.h"
 
+#define PTW_RECORD_FAULT(cfg)   (((cfg)->stage == 1) ? (cfg)->record_faults : \
+ (cfg)->s2cfg.record_faults)
+
 /**
  * smmuv3_trigger_irq - pulse @irq if enabled and update
  * GERROR register in case of GERROR interrupt
@@ -329,11 +332,141 @@ static int smmu_get_cd(SMMUv3State *s, STE *ste, 
uint32_t ssid,
 return 0;
 }
 
+/*
+ * Max valid value is 39 when SMMU_IDR3.STT == 0.
+ * In architectures after SMMUv3.0:
+ * - If STE.S2TG selects a 4KB or 16KB granule, the minimum valid value for 
this
+ *   field is MAX(16, 64-IAS)
+ * - If STE.S2TG selects a 64KB granule, the minimum valid value for this field
+ *   is (64-IAS).
+ * As we only support AA64, IAS = OAS.
+ */
+static bool s2t0sz_valid(SMMUTransCfg *cfg)
+{
+if (cfg->s2cfg.tsz > 39) {
+return false;
+}
+
+if (cfg->s2cfg.granule_sz == 16) {
+return (cfg->s2cfg.tsz >= 64 - oas2bits(SMMU_IDR5_OAS));
+}
+
+return (cfg->s2cfg.tsz >= MAX(64 - oas2bits(SMMU_IDR5_OAS), 16));
+}
+
+/*
+ * Return true if s2 page table config is valid.
+ * This checks with the configured start level, ias_bits and granularity we can
+ * have a valid page table as described in ARM ARM D8.2 Translation process.
+ * The idea here is to see for the highest possible number of IPA bits, how
+ * many concatenated tables we would need, if it is more than 16, then this is
+ * not possible.
+ */
+static bool s2_pgtable_config_valid(uint8_t sl0, uint8_t t0sz, uint8_t gran)
+{
+int level = get_start_level(sl0, gran);
+uint64_t ipa_bits = 64 - t0sz;
+uint64_t max_ipa = (1ULL << ipa_bits) - 1;
+int nr_concat = pgd_concat_idx(level, gran, max_ipa) + 1;
+
+return nr_concat <= VMSA_MAX_S2_CONCAT;
+}
+
+static int decode_ste_s2_cfg(SMMUTransCfg *cfg, STE *ste)
+{
+cfg->stage = 2;
+
+if (STE_S2AA64(ste) == 0x0) {
+qemu_log_mask(LOG_UNIMP,
+  "SMMUv3 AArch32 tables not supported\n");
+  

[PATCH v4 10/10] hw/arm/smmuv3: Add knob to choose translation stage and enable stage-2

2023-05-16 Thread Mostafa Saleh
As everything is in place, we can use a new system property to
advertise which stage is supported and remove the bad_ste exit from
the STE stage-2 config parsing.

The added property arm-smmuv3.stage can take the following values:
- "1": Stage-1 only is advertised.
- "2": Stage-2 only is advertised.

If not passed or an unsupported value is passed, it will default to
stage-1.

Advertise VMID16.

Don't try to decode CD, if stage-2 is configured.
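
For example, a stage-2-only SMMU on the virt machine could then be requested
with something like the following (assuming the usual -global syntax for
setting device properties; the exact invocation is illustrative):

    qemu-system-aarch64 -M virt,iommu=smmuv3 -global arm-smmuv3.stage=2 ...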

Reviewed-by: Eric Auger 
Signed-off-by: Mostafa Saleh 
---
Changes in v4:
- Collected Reviewed-by tag
- Remove references to "all" stages.
Changes in v2:
- Squash knob patch with stage-2 enable patch.
- Don't try to decode CD, if stage-2 is configured.
---
 hw/arm/smmuv3.c | 32 ++--
 include/hw/arm/smmuv3.h |  1 +
 2 files changed, 23 insertions(+), 10 deletions(-)

diff --git a/hw/arm/smmuv3.c b/hw/arm/smmuv3.c
index 17e1359be4..5c598c8428 100644
--- a/hw/arm/smmuv3.c
+++ b/hw/arm/smmuv3.c
@@ -21,6 +21,7 @@
 #include "hw/irq.h"
 #include "hw/sysbus.h"
 #include "migration/vmstate.h"
+#include "hw/qdev-properties.h"
 #include "hw/qdev-core.h"
 #include "hw/pci/pci.h"
 #include "cpu.h"
@@ -241,14 +242,17 @@ void smmuv3_record_event(SMMUv3State *s, SMMUEventInfo 
*info)
 
 static void smmuv3_init_regs(SMMUv3State *s)
 {
-/**
- * IDR0: stage1 only, AArch64 only, coherent access, 16b ASID,
- *   multi-level stream table
- */
-s->idr[0] = FIELD_DP32(s->idr[0], IDR0, S1P, 1); /* stage 1 supported */
+/* Based on sys property, the stages supported in smmu will be 
advertised.*/
+if (s->stage && !strcmp("2", s->stage)) {
+s->idr[0] = FIELD_DP32(s->idr[0], IDR0, S2P, 1);
+} else {
+s->idr[0] = FIELD_DP32(s->idr[0], IDR0, S1P, 1);
+}
+
 s->idr[0] = FIELD_DP32(s->idr[0], IDR0, TTF, 2); /* AArch64 PTW only */
 s->idr[0] = FIELD_DP32(s->idr[0], IDR0, COHACC, 1); /* IO coherent */
 s->idr[0] = FIELD_DP32(s->idr[0], IDR0, ASID16, 1); /* 16-bit ASID */
+s->idr[0] = FIELD_DP32(s->idr[0], IDR0, VMID16, 1); /* 16-bit VMID */
 s->idr[0] = FIELD_DP32(s->idr[0], IDR0, TTENDIAN, 2); /* little endian */
 s->idr[0] = FIELD_DP32(s->idr[0], IDR0, STALL_MODEL, 1); /* No stall */
 /* terminated transaction will always be aborted/error returned */
@@ -451,10 +455,6 @@ static int decode_ste_s2_cfg(SMMUTransCfg *cfg, STE *ste)
 goto bad_ste;
 }
 
-/* This is still here as stage 2 has not been fully enabled yet. */
-qemu_log_mask(LOG_UNIMP, "SMMUv3 does not support stage 2 yet\n");
-goto bad_ste;
-
 return 0;
 
 bad_ste:
@@ -733,7 +733,7 @@ static int smmuv3_decode_config(IOMMUMemoryRegion *mr, 
SMMUTransCfg *cfg,
 return ret;
 }
 
-if (cfg->aborted || cfg->bypassed) {
+if (cfg->aborted || cfg->bypassed || (cfg->stage == 2)) {
 return 0;
 }
 
@@ -1804,6 +1804,17 @@ static const VMStateDescription vmstate_smmuv3 = {
 }
 };
 
+static Property smmuv3_properties[] = {
+/*
+ * Stages of translation advertised.
+ * "1": Stage 1
+ * "2": Stage 2
+ * Defaults to stage 1
+ */
+DEFINE_PROP_STRING("stage", SMMUv3State, stage),
+DEFINE_PROP_END_OF_LIST()
+};
+
 static void smmuv3_instance_init(Object *obj)
 {
 /* Nothing much to do here as of now */
@@ -1820,6 +1831,7 @@ static void smmuv3_class_init(ObjectClass *klass, void 
*data)
>parent_phases);
 c->parent_realize = dc->realize;
 dc->realize = smmu_realize;
+device_class_set_props(dc, smmuv3_properties);
 }
 
 static int smmuv3_notify_flag_changed(IOMMUMemoryRegion *iommu,
diff --git a/include/hw/arm/smmuv3.h b/include/hw/arm/smmuv3.h
index 6031d7d325..d183a62766 100644
--- a/include/hw/arm/smmuv3.h
+++ b/include/hw/arm/smmuv3.h
@@ -62,6 +62,7 @@ struct SMMUv3State {
 
 qemu_irq irq[4];
 QemuMutex mutex;
+char *stage;
 };
 
 typedef enum {
-- 
2.40.1.606.ga4b1b128d6-goog




[PATCH v4 00/10] Add stage-2 translation for SMMUv3

2023-05-16 Thread Mostafa Saleh
This patch series adds stage-2 translation support for SMMUv3. It is
controlled by a new system property “arm-smmuv3.stage”.
- When set to “1”: Stage-1 only would be advertised and supported (default
behaviour)
- When set to “2”: Stage-2 only would be advertised and supported.

The features implemented for stage-2 mostly mirror stage-1:
- VMID16.
- Only AArch64 translation tables are supported.
- Only little endian translation table supported.
- Stall is not supported.
- HTTU is not supported, SW is expected to maintain the Access flag.

To make it easy to support nesting, a new structure (SMMUS2Cfg) is
embedded within SMMUTransCfg to hold the stage-2 configuration.

The TLBs were updated to support VMID: when stage-2 is used, the ASID is
set to -1 and ignored, and when stage-1 is used, the VMID is set to -1
and ignored.
As only one stage is supported at a time for now, the TLB represents an
IPA=>PA translation with the proper attributes (granularity and t0sz)
parsed from the STEs for stage-2, and a VA=>PA translation with the
proper attributes parsed from the CDs for stage-1.

New commands were added that are used with stage-2:
- CMD_TLBI_S12_VMALL: Invalidate all translations for a VMID.
- CMD_TLBI_S2_IPA: Invalidate stage-2 by VMID and IPA.
Commands that are illegal to use with stage-2 were modified to
return CERROR_ILL.

This patch series can be used to run the Linux pKVM SMMUv3 patches
(currently on the list), which control stage-2 (from EL2) while providing
a paravirtualized interface to the host (EL1):
https://lore.kernel.org/kvmarm/20230201125328.2186498-1-jean-phili...@linaro.org/

Looking forward, nesting is the next feature to go for; here are some
thoughts about it:

- The TLB would require big changes for this; we could go for either a
combined implementation or a per-stage one. This would affect the values
returned from the PTW and the invalidation commands.

- Stage-1 data structures should be translated by stage-2 if enabled
(such as context descriptors and ttb0/ttb1).

- Translated addresses from stage-1 should be translated by stage-2 if
enabled.

- Some existing commands (such as CMD_TLBI_S2_IPA, CMD_TLBI_NH_ASID, …) would
be modified, and some of those changes would depend on the design of the TLBs.

- Currently, VMID is ignored when stage-1 is used, as stage-1 can't be
used together with stage-2. However, when nesting is advertised, VMID
shouldn't be ignored even if stage-2 is bypassed.

Changes in v4:
- Collected Reviewed-by tags
- Add SMMU_CMD_TLBI_S12_VMALL in a block to fix compilation issue
- Simplify record fault macro
- Remove references to "all" stage

Changes in v3:
- Collected Reviewed-by tags
- Separate stage-2 record faults from stage-1
- Fix input address check in stage-2 walk
- Fix shift in STE_S2HD, STE_S2HA, STE_S2S, STE_S2R
- Add more logs for illegal configs and commands.
- Rename SMMU translation macros to VMSA as they are not part of SMMU spec
- Rename stage-2 variables and functions (iova=>ipa, ap=>s2ap, ...)
- Rename oas in SMMUS2Cfg to eff_ps
- Improve comments (mention user manuals versions, field names)

Changes in v2:
-Collected Reviewed-by tags
-Add oas to SMMUS2Cfg, and use it in PTW
-Add stage member to SMMUPTWEventInfo to differentiate stage-1 and
 stage-2 PTW faults
-Move stage-2 knob to the last patch
-Add all STE parsing in one patch
-Parse and use S2PS and S2ENDI
-Split S2AFF patch over PTW and STE patches.
-Fix TLB aliasing issue
-Renaming and refactoring and rewording commits.
-Populate OAS based on PARANGE
-Add checks for stage-1 only commands
-Update trace events to hold translation stage, vmid when possible
-Remove feature flags for supported stages as they were redundant with IDR0


Mostafa Saleh (10):
  hw/arm/smmuv3: Add missing fields for IDR0
  hw/arm/smmuv3: Update translation config to hold stage-2
  hw/arm/smmuv3: Refactor stage-1 PTW
  hw/arm/smmuv3: Add page table walk for stage-2
  hw/arm/smmuv3: Parse STE config for stage-2
  hw/arm/smmuv3: Make TLB lookup work for stage-2
  hw/arm/smmuv3: Add VMID to TLB tagging
  hw/arm/smmuv3: Add CMDs related to stage-2
  hw/arm/smmuv3: Add stage-2 support in iova notifier
  hw/arm/smmuv3: Add knob to choose translation stage and enable stage-2

 hw/arm/smmu-common.c | 209 +---
 hw/arm/smmu-internal.h   |  37 
 hw/arm/smmuv3-internal.h |  12 +-
 hw/arm/smmuv3.c  | 357 ++-
 hw/arm/trace-events  |  14 +-
 include/hw/arm/smmu-common.h |  45 -
 include/hw/arm/smmuv3.h  |   4 +
 7 files changed, 587 insertions(+), 91 deletions(-)

-- 
2.40.1.606.ga4b1b128d6-goog




Re: [PATCH v7 1/1] arm/kvm: add support for MTE

2023-05-16 Thread Richard Henderson

On 4/28/23 02:55, Cornelia Huck wrote:

Extend the 'mte' property for the virt machine to cover KVM as
well. For KVM, we don't allocate tag memory, but instead enable the
capability.

If MTE has been enabled, we need to disable migration, as we do not
yet have a way to migrate the tags as well. Therefore, MTE will stay
off with KVM unless requested explicitly.

Signed-off-by: Cornelia Huck
---
  hw/arm/virt.c| 69 +---
  target/arm/cpu.c |  9 +++---
  target/arm/cpu.h |  4 +++
  target/arm/kvm.c | 35 ++
  target/arm/kvm64.c   |  5 
  target/arm/kvm_arm.h | 19 
  6 files changed, 107 insertions(+), 34 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH] tcg/i386: Set P_REXW in tcg_out_addi_ptr

2023-05-16 Thread Richard Henderson

On 5/16/23 13:11, Michael Tokarev wrote:

12.05.2023 20:17, Richard Henderson wrote:

The REXW bit must be set to produce a 64-bit pointer result; the
bit is disabled in 32-bit mode, so we can do this unconditionally.

Fixes: 7d9e1ee424b0 ("tcg/i386: Adjust assert in tcg_out_addi_ptr")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1592
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1642


This looks like stable-8.0 material.


Yes indeed, please.

r~




Re: [RFC PATCH] target/arm: add RAZ/WI handling for DBGDTR[TX|RX]

2023-05-16 Thread Richard Henderson

On 5/16/23 03:44, Alex Bennée wrote:

The commit b3aa2f2128 (target/arm: provide stubs for more external
debug registers) was added to handle HyperV's unconditional usage of
Debug Communications Channel. It turns out that Linux will similarly
break if you enable CONFIG_HVC_DCC "ARM JTAG DCC console".

Extend the set of registers we RAZ/WI to avoid this.

Cc: Anders Roxell
Cc: Evgeny Iakovlev
Signed-off-by: Alex Bennée
---
  target/arm/debug_helper.c | 11 +--
  1 file changed, 9 insertions(+), 2 deletions(-)


Reviewed-by: Richard Henderson 

r~



Re: [PATCH] tcg/i386: Set P_REXW in tcg_out_addi_ptr

2023-05-16 Thread Michael Tokarev

12.05.2023 20:17, Richard Henderson wrote:

The REXW bit must be set to produce a 64-bit pointer result; the
bit is disabled in 32-bit mode, so we can do this unconditionally.

Fixes: 7d9e1ee424b0 ("tcg/i386: Adjust assert in tcg_out_addi_ptr")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1592
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1642


This looks like stable-8.0 material.





Re: [PATCH] target/arm: allow DC CVA[D]P in user mode emulation

2023-05-16 Thread Richard Henderson

On 5/15/23 20:59, Zhuojia Shen wrote:

DC CVAP and DC CVADP instructions can be executed in EL0 on Linux,
either directly when SCTLR_EL1.UCI == 1 or emulated by the kernel (see
user_cache_maint_handler() in arch/arm64/kernel/traps.c).  The Arm ARM
documents the semantics of the two instructions that they behave as
DC CVAC if the address pointed to by their register operand is not
persistent memory.

This patch enables execution of the two instructions in user mode
emulation as NOP while preserving their original emulation in full
system virtualization.

Signed-off-by: Zhuojia Shen 
---
  target/arm/helper.c   | 26 +-
  tests/tcg/aarch64/Makefile.target | 11 
  tests/tcg/aarch64/dcpodp.c| 45 +++
  tests/tcg/aarch64/dcpop.c | 45 +++
  4 files changed, 120 insertions(+), 7 deletions(-)
  create mode 100644 tests/tcg/aarch64/dcpodp.c
  create mode 100644 tests/tcg/aarch64/dcpop.c

diff --git a/target/arm/helper.c b/target/arm/helper.c
index 0b7fd2e7e6..eeba5e7978 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -7432,23 +7432,37 @@ static void dccvap_writefn(CPUARMState *env, const 
ARMCPRegInfo *opaque,
  }
  }
  }
+#endif /*CONFIG_USER_ONLY*/
  
  static const ARMCPRegInfo dcpop_reg[] = {

  { .name = "DC_CVAP", .state = ARM_CP_STATE_AA64,
.opc0 = 1, .opc1 = 3, .crn = 7, .crm = 12, .opc2 = 1,
-  .access = PL0_W, .type = ARM_CP_NO_RAW | ARM_CP_SUPPRESS_TB_END,
+  .access = PL0_W,
.fgt = FGT_DCCVAP,
-  .accessfn = aa64_cacheop_poc_access, .writefn = dccvap_writefn },
+  .accessfn = aa64_cacheop_poc_access,
+#ifdef CONFIG_USER_ONLY
+  .type = ARM_CP_NOP,
+#else
+  .type = ARM_CP_NO_RAW | ARM_CP_SUPPRESS_TB_END,
+  .writefn = dccvap_writefn,
+#endif
+},
  };


Not quite correct, as CVAP to an unmapped address should SIGSEGV.  That'll be done by the 
probe_read within dccvap_writefn.


Need to make dccvap_writefn always present, ifdef out only the memory_region_from_host + 
memory_region_writeback from there.  Need to set SCTLR_EL1.UCI in arm_cpu_reset_hold in 
the CONFIG_USER_ONLY block.



r~

  
  static const ARMCPRegInfo dcpodp_reg[] = {

  { .name = "DC_CVADP", .state = ARM_CP_STATE_AA64,
.opc0 = 1, .opc1 = 3, .crn = 7, .crm = 13, .opc2 = 1,
-  .access = PL0_W, .type = ARM_CP_NO_RAW | ARM_CP_SUPPRESS_TB_END,
+  .access = PL0_W,
.fgt = FGT_DCCVADP,
-  .accessfn = aa64_cacheop_poc_access, .writefn = dccvap_writefn },
+  .accessfn = aa64_cacheop_poc_access,
+#ifdef CONFIG_USER_ONLY
+  .type = ARM_CP_NOP,
+#else
+  .type = ARM_CP_NO_RAW | ARM_CP_SUPPRESS_TB_END,
+  .writefn = dccvap_writefn,
+#endif
+},
  };
-#endif /*CONFIG_USER_ONLY*/
  
  static CPAccessResult access_aa64_tid5(CPUARMState *env, const ARMCPRegInfo *ri,

 bool isread)
@@ -9092,7 +9106,6 @@ void register_cp_regs_for_features(ARMCPU *cpu)
  if (cpu_isar_feature(aa64_tlbios, cpu)) {
  define_arm_cp_regs(cpu, tlbios_reginfo);
  }
-#ifndef CONFIG_USER_ONLY
  /* Data Cache clean instructions up to PoP */
  if (cpu_isar_feature(aa64_dcpop, cpu)) {
  define_one_arm_cp_reg(cpu, dcpop_reg);
@@ -9101,7 +9114,6 @@ void register_cp_regs_for_features(ARMCPU *cpu)
  define_one_arm_cp_reg(cpu, dcpodp_reg);
  }
  }
-#endif /*CONFIG_USER_ONLY*/
  
  /*

   * If full MTE is enabled, add all of the system registers.
diff --git a/tests/tcg/aarch64/Makefile.target 
b/tests/tcg/aarch64/Makefile.target
index 0315795487..3430fd3cd8 100644
--- a/tests/tcg/aarch64/Makefile.target
+++ b/tests/tcg/aarch64/Makefile.target
@@ -21,12 +21,23 @@ config-cc.mak: Makefile
$(quiet-@)( \
$(call cc-option,-march=armv8.1-a+sve,  CROSS_CC_HAS_SVE); \
$(call cc-option,-march=armv8.1-a+sve2, CROSS_CC_HAS_SVE2); 
\
+   $(call cc-option,-march=armv8.2-a,  
CROSS_CC_HAS_ARMV8_2); \
$(call cc-option,-march=armv8.3-a,  
CROSS_CC_HAS_ARMV8_3); \
+   $(call cc-option,-march=armv8.5-a,  
CROSS_CC_HAS_ARMV8_5); \
$(call cc-option,-mbranch-protection=standard,  
CROSS_CC_HAS_ARMV8_BTI); \
$(call cc-option,-march=armv8.5-a+memtag,   
CROSS_CC_HAS_ARMV8_MTE); \
$(call cc-option,-march=armv9-a+sme,
CROSS_CC_HAS_ARMV9_SME)) 3> config-cc.mak
  -include config-cc.mak
  
+ifneq ($(CROSS_CC_HAS_ARMV8_2),)

+AARCH64_TESTS += dcpop
+dcpop: CFLAGS += -march=armv8.2-a
+endif
+ifneq ($(CROSS_CC_HAS_ARMV8_5),)
+AARCH64_TESTS += dcpodp
+dcpodp: CFLAGS += -march=armv8.5-a
+endif
+
  # Pauth Tests
  ifneq ($(CROSS_CC_HAS_ARMV8_3),)
  AARCH64_TESTS += pauth-1 pauth-2 pauth-4 pauth-5
diff --git a/tests/tcg/aarch64/dcpodp.c b/tests/tcg/aarch64/dcpodp.c
new file mode 100644
index 00..dad61ce78c
--- /dev/null
+++ 

[PULL 26/80] tcg/sparc64: Split out tcg_out_movi_s32

2023-05-16 Thread Richard Henderson
Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 tcg/sparc64/tcg-target.c.inc | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/tcg/sparc64/tcg-target.c.inc b/tcg/sparc64/tcg-target.c.inc
index e244209890..4375a06377 100644
--- a/tcg/sparc64/tcg-target.c.inc
+++ b/tcg/sparc64/tcg-target.c.inc
@@ -405,6 +405,13 @@ static void tcg_out_movi_s13(TCGContext *s, TCGReg ret, 
int32_t arg)
 tcg_out_arithi(s, ret, TCG_REG_G0, arg, ARITH_OR);
 }
 
+/* A 32-bit constant sign-extended to 64 bits.  */
+static void tcg_out_movi_s32(TCGContext *s, TCGReg ret, int32_t arg)
+{
+tcg_out_sethi(s, ret, ~arg);
+tcg_out_arithi(s, ret, ret, (arg & 0x3ff) | -0x400, ARITH_XOR);
+}
+
 /* A 32-bit constant zero-extended to 64 bits.  */
 static void tcg_out_movi_u32(TCGContext *s, TCGReg ret, uint32_t arg)
 {
@@ -444,8 +451,7 @@ static void tcg_out_movi_int(TCGContext *s, TCGType type, 
TCGReg ret,
 
 /* A 32-bit constant sign-extended to 64-bits.  */
 if (arg == lo) {
-tcg_out_sethi(s, ret, ~arg);
-tcg_out_arithi(s, ret, ret, (arg & 0x3ff) | -0x400, ARITH_XOR);
+tcg_out_movi_s32(s, ret, arg);
 return;
 }
 
-- 
2.34.1




[PULL 23/80] tcg/sparc64: Rename tcg_out_movi_imm13 to tcg_out_movi_s13

2023-05-16 Thread Richard Henderson
Emphasize that the constant is signed.

Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 tcg/sparc64/tcg-target.c.inc | 21 +++--
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/tcg/sparc64/tcg-target.c.inc b/tcg/sparc64/tcg-target.c.inc
index 64464ab363..15d6a9fd73 100644
--- a/tcg/sparc64/tcg-target.c.inc
+++ b/tcg/sparc64/tcg-target.c.inc
@@ -399,7 +399,8 @@ static void tcg_out_sethi(TCGContext *s, TCGReg ret, 
uint32_t arg)
 tcg_out32(s, SETHI | INSN_RD(ret) | ((arg & 0xfc00) >> 10));
 }
 
-static void tcg_out_movi_imm13(TCGContext *s, TCGReg ret, int32_t arg)
+/* A 13-bit constant sign-extended to 64 bits.  */
+static void tcg_out_movi_s13(TCGContext *s, TCGReg ret, int32_t arg)
 {
 tcg_out_arithi(s, ret, TCG_REG_G0, arg, ARITH_OR);
 }
@@ -408,7 +409,7 @@ static void tcg_out_movi_imm32(TCGContext *s, TCGReg ret, 
int32_t arg)
 {
 if (check_fit_i32(arg, 13)) {
 /* A 13-bit constant sign-extended to 64-bits.  */
-tcg_out_movi_imm13(s, ret, arg);
+tcg_out_movi_s13(s, ret, arg);
 } else {
 /* A 32-bit constant zero-extended to 64 bits.  */
 tcg_out_sethi(s, ret, arg);
@@ -433,7 +434,7 @@ static void tcg_out_movi_int(TCGContext *s, TCGType type, 
TCGReg ret,
 
 /* A 13-bit constant sign-extended to 64-bits.  */
 if (check_fit_tl(arg, 13)) {
-tcg_out_movi_imm13(s, ret, arg);
+tcg_out_movi_s13(s, ret, arg);
 return;
 }
 
@@ -767,7 +768,7 @@ static void tcg_out_setcond_i32(TCGContext *s, TCGCond 
cond, TCGReg ret,
 
 default:
 tcg_out_cmp(s, c1, c2, c2const);
-tcg_out_movi_imm13(s, ret, 0);
+tcg_out_movi_s13(s, ret, 0);
 tcg_out_movcc(s, cond, MOVCC_ICC, ret, 1, 1);
 return;
 }
@@ -803,11 +804,11 @@ static void tcg_out_setcond_i64(TCGContext *s, TCGCond 
cond, TCGReg ret,
 /* For 64-bit signed comparisons vs zero, we can avoid the compare
if the input does not overlap the output.  */
 if (c2 == 0 && !is_unsigned_cond(cond) && c1 != ret) {
-tcg_out_movi_imm13(s, ret, 0);
+tcg_out_movi_s13(s, ret, 0);
 tcg_out_movr(s, cond, ret, c1, 1, 1);
 } else {
 tcg_out_cmp(s, c1, c2, c2const);
-tcg_out_movi_imm13(s, ret, 0);
+tcg_out_movi_s13(s, ret, 0);
 tcg_out_movcc(s, cond, MOVCC_XCC, ret, 1, 1);
 }
 }
@@ -844,7 +845,7 @@ static void tcg_out_addsub2_i64(TCGContext *s, TCGReg rl, 
TCGReg rh,
 if (use_vis3_instructions && !is_sub) {
 /* Note that ADDXC doesn't accept immediates.  */
 if (bhconst && bh != 0) {
-   tcg_out_movi_imm13(s, TCG_REG_T2, bh);
+   tcg_out_movi_s13(s, TCG_REG_T2, bh);
bh = TCG_REG_T2;
 }
 tcg_out_arith(s, rh, ah, bh, ARITH_ADDXC);
@@ -866,7 +867,7 @@ static void tcg_out_addsub2_i64(TCGContext *s, TCGReg rl, 
TCGReg rh,
  * so the adjustment fits 12 bits.
  */
 if (bhconst) {
-tcg_out_movi_imm13(s, TCG_REG_T2, bh + (is_sub ? -1 : 1));
+tcg_out_movi_s13(s, TCG_REG_T2, bh + (is_sub ? -1 : 1));
 } else {
 tcg_out_arithi(s, TCG_REG_T2, bh, 1,
is_sub ? ARITH_SUB : ARITH_ADD);
@@ -1036,7 +1037,7 @@ static void tcg_target_qemu_prologue(TCGContext *s)
 tcg_code_gen_epilogue = tcg_splitwx_to_rx(s->code_ptr);
 tcg_out_arithi(s, TCG_REG_G0, TCG_REG_I7, 8, RETURN);
 /* delay slot */
-tcg_out_movi_imm13(s, TCG_REG_O0, 0);
+tcg_out_movi_s13(s, TCG_REG_O0, 0);
 
 build_trampolines(s);
 }
@@ -1430,7 +1431,7 @@ static void tcg_out_exit_tb(TCGContext *s, uintptr_t a0)
 {
 if (check_fit_ptr(a0, 13)) {
 tcg_out_arithi(s, TCG_REG_G0, TCG_REG_I7, 8, RETURN);
-tcg_out_movi_imm13(s, TCG_REG_O0, a0);
+tcg_out_movi_s13(s, TCG_REG_O0, a0);
 return;
 } else {
 intptr_t tb_diff = tcg_tbrel_diff(s, (void *)a0);
-- 
2.34.1




[PULL 52/80] tcg/s390x: Support 128-bit load/store

2023-05-16 Thread Richard Henderson
Use LPQ/STPQ when 16-byte atomicity is required.
Note that these instructions require 16-byte alignment.

Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 tcg/s390x/tcg-target-con-set.h |   2 +
 tcg/s390x/tcg-target.h |   2 +-
 tcg/s390x/tcg-target.c.inc | 103 -
 3 files changed, 103 insertions(+), 4 deletions(-)

diff --git a/tcg/s390x/tcg-target-con-set.h b/tcg/s390x/tcg-target-con-set.h
index ecc079bb6d..cbad91b2b5 100644
--- a/tcg/s390x/tcg-target-con-set.h
+++ b/tcg/s390x/tcg-target-con-set.h
@@ -14,6 +14,7 @@ C_O0_I2(r, r)
 C_O0_I2(r, ri)
 C_O0_I2(r, rA)
 C_O0_I2(v, r)
+C_O0_I3(o, m, r)
 C_O1_I1(r, r)
 C_O1_I1(v, r)
 C_O1_I1(v, v)
@@ -36,6 +37,7 @@ C_O1_I2(v, v, v)
 C_O1_I3(v, v, v, v)
 C_O1_I4(r, r, ri, rI, r)
 C_O1_I4(r, r, rA, rI, r)
+C_O2_I1(o, m, r)
 C_O2_I2(o, m, 0, r)
 C_O2_I2(o, m, r, r)
 C_O2_I3(o, m, 0, 1, r)
diff --git a/tcg/s390x/tcg-target.h b/tcg/s390x/tcg-target.h
index 170007bea5..ec96952172 100644
--- a/tcg/s390x/tcg-target.h
+++ b/tcg/s390x/tcg-target.h
@@ -140,7 +140,7 @@ extern uint64_t s390_facilities[3];
 #define TCG_TARGET_HAS_muluh_i64  0
 #define TCG_TARGET_HAS_mulsh_i64  0
 
-#define TCG_TARGET_HAS_qemu_ldst_i128 0
+#define TCG_TARGET_HAS_qemu_ldst_i128 1
 
 #define TCG_TARGET_HAS_v64HAVE_FACILITY(VECTOR)
 #define TCG_TARGET_HAS_v128   HAVE_FACILITY(VECTOR)
diff --git a/tcg/s390x/tcg-target.c.inc b/tcg/s390x/tcg-target.c.inc
index 8e34b214fc..835daa51fa 100644
--- a/tcg/s390x/tcg-target.c.inc
+++ b/tcg/s390x/tcg-target.c.inc
@@ -243,6 +243,7 @@ typedef enum S390Opcode {
 RXY_LLGF= 0xe316,
 RXY_LLGH= 0xe391,
 RXY_LMG = 0xeb04,
+RXY_LPQ = 0xe38f,
 RXY_LRV = 0xe31e,
 RXY_LRVG= 0xe30f,
 RXY_LRVH= 0xe31f,
@@ -253,6 +254,7 @@ typedef enum S390Opcode {
 RXY_STG = 0xe324,
 RXY_STHY= 0xe370,
 RXY_STMG= 0xeb24,
+RXY_STPQ= 0xe38e,
 RXY_STRV= 0xe33e,
 RXY_STRVG   = 0xe32f,
 RXY_STRVH   = 0xe33f,
@@ -1577,7 +1579,18 @@ typedef struct {
 
 bool tcg_target_has_memory_bswap(MemOp memop)
 {
-return true;
+TCGAtomAlign aa;
+
+if ((memop & MO_SIZE) <= MO_64) {
+return true;
+}
+
+/*
+ * Reject 16-byte memop with 16-byte atomicity,
+ * but do allow a pair of 64-bit operations.
+ */
+aa = atom_and_align_for_opc(tcg_ctx, memop, MO_ATOM_IFALIGN, true);
+return aa.atom <= MO_64;
 }
 
 static void tcg_out_qemu_ld_direct(TCGContext *s, MemOp opc, TCGReg data,
@@ -1734,13 +1747,13 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext 
*s, HostAddress *h,
 {
 TCGLabelQemuLdst *ldst = NULL;
 MemOp opc = get_memop(oi);
+MemOp s_bits = opc & MO_SIZE;
 unsigned a_mask;
 
-h->aa = atom_and_align_for_opc(s, opc, MO_ATOM_IFALIGN, false);
+h->aa = atom_and_align_for_opc(s, opc, MO_ATOM_IFALIGN, s_bits == MO_128);
 a_mask = (1 << h->aa.align) - 1;
 
 #ifdef CONFIG_SOFTMMU
-unsigned s_bits = opc & MO_SIZE;
 unsigned s_mask = (1 << s_bits) - 1;
 int mem_index = get_mmuidx(oi);
 int fast_off = TLB_MASK_TABLE_OFS(mem_index);
@@ -1865,6 +1878,80 @@ static void tcg_out_qemu_st(TCGContext* s, TCGReg 
data_reg, TCGReg addr_reg,
 }
 }
 
+static void tcg_out_qemu_ldst_i128(TCGContext *s, TCGReg datalo, TCGReg datahi,
+   TCGReg addr_reg, MemOpIdx oi, bool is_ld)
+{
+TCGLabel *l1 = NULL, *l2 = NULL;
+TCGLabelQemuLdst *ldst;
+HostAddress h;
+bool need_bswap;
+bool use_pair;
+S390Opcode insn;
+
+ldst = prepare_host_addr(s, , addr_reg, oi, is_ld);
+
+use_pair = h.aa.atom < MO_128;
+need_bswap = get_memop(oi) & MO_BSWAP;
+
+if (!use_pair) {
+/*
+ * Atomicity requires we use LPQ.  If we've already checked for
+ * 16-byte alignment, that's all we need.  If we arrive with
+ * lesser alignment, we have determined that less than 16-byte
+ * alignment can be satisfied with two 8-byte loads.
+ */
+if (h.aa.align < MO_128) {
+use_pair = true;
+l1 = gen_new_label();
+l2 = gen_new_label();
+
+tcg_out_insn(s, RI, TMLL, addr_reg, 15);
+tgen_branch(s, 7, l1); /* CC in {1,2,3} */
+}
+
+tcg_debug_assert(!need_bswap);
+tcg_debug_assert(datalo & 1);
+tcg_debug_assert(datahi == datalo - 1);
+insn = is_ld ? RXY_LPQ : RXY_STPQ;
+tcg_out_insn_RXY(s, insn, datahi, h.base, h.index, h.disp);
+
+if (use_pair) {
+tgen_branch(s, S390_CC_ALWAYS, l2);
+tcg_out_label(s, l1);
+}
+}
+if (use_pair) {
+TCGReg d1, d2;
+
+if (need_bswap) {
+d1 = datalo, d2 = datahi;
+insn = is_ld ? RXY_LRVG : RXY_STRVG;
+} else {
+d1 = datahi, d2 = datalo;
+insn = is_ld ? RXY_LG : RXY_STG;
+}
+
+if (h.base == d1 || h.index 

[PULL 47/80] tcg/i386: Honor 64-bit atomicity in 32-bit mode

2023-05-16 Thread Richard Henderson
Use the fpu to perform 64-bit loads and stores.

Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 tcg/i386/tcg-target.c.inc | 44 +--
 1 file changed, 38 insertions(+), 6 deletions(-)

diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index 3b8528e332..0415ca2a4c 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -468,6 +468,10 @@ static bool tcg_target_const_match(int64_t val, TCGType 
type, int ct)
 #define OPC_GRP5(0xff)
 #define OPC_GRP14   (0x73 | P_EXT | P_DATA16)
 
+#define OPC_ESCDF   (0xdf)
+#define ESCDF_FILD_m64  5
+#define ESCDF_FISTP_m64 7
+
 /* Group 1 opcode extensions for 0x80-0x83.
These are also used as modifiers for OPC_ARITH.  */
 #define ARITH_ADD 0
@@ -2086,7 +2090,20 @@ static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg 
datalo, TCGReg datahi,
 datalo = datahi;
 datahi = t;
 }
-if (h.base == datalo || h.index == datalo) {
+if (h.aa.atom == MO_64) {
+/*
+ * Atomicity requires that we use use a single 8-byte load.
+ * For simplicity and code size, always use the FPU for this.
+ * Similar insns using SSE/AVX are merely larger.
+ * Load from memory in one go, then store back to the stack,
+ * from whence we can load into the correct integer regs.
+ */
+tcg_out_modrm_sib_offset(s, OPC_ESCDF + h.seg, ESCDF_FILD_m64,
+ h.base, h.index, 0, h.ofs);
+tcg_out_modrm_offset(s, OPC_ESCDF, ESCDF_FISTP_m64, TCG_REG_ESP, 
0);
+tcg_out_modrm_offset(s, movop, datalo, TCG_REG_ESP, 0);
+tcg_out_modrm_offset(s, movop, datahi, TCG_REG_ESP, 4);
+} else if (h.base == datalo || h.index == datalo) {
 tcg_out_modrm_sib_offset(s, OPC_LEA, datahi,
  h.base, h.index, 0, h.ofs);
 tcg_out_modrm_offset(s, movop + h.seg, datalo, datahi, 0);
@@ -2156,12 +2173,27 @@ static void tcg_out_qemu_st_direct(TCGContext *s, 
TCGReg datalo, TCGReg datahi,
 if (TCG_TARGET_REG_BITS == 64) {
 tcg_out_modrm_sib_offset(s, movop + P_REXW + h.seg, datalo,
  h.base, h.index, 0, h.ofs);
+break;
+}
+if (use_movbe) {
+TCGReg t = datalo;
+datalo = datahi;
+datahi = t;
+}
+if (h.aa.atom == MO_64) {
+/*
+ * Atomicity requires that we use use one 8-byte store.
+ * For simplicity, and code size, always use the FPU for this.
+ * Similar insns using SSE/AVX are merely larger.
+ * Assemble the 8-byte quantity in required endianness
+ * on the stack, load to coproc unit, and store.
+ */
+tcg_out_modrm_offset(s, movop, datalo, TCG_REG_ESP, 0);
+tcg_out_modrm_offset(s, movop, datahi, TCG_REG_ESP, 4);
+tcg_out_modrm_offset(s, OPC_ESCDF, ESCDF_FILD_m64, TCG_REG_ESP, 0);
+tcg_out_modrm_sib_offset(s, OPC_ESCDF + h.seg, ESCDF_FISTP_m64,
+ h.base, h.index, 0, h.ofs);
 } else {
-if (use_movbe) {
-TCGReg t = datalo;
-datalo = datahi;
-datahi = t;
-}
 tcg_out_modrm_sib_offset(s, movop + h.seg, datalo,
  h.base, h.index, 0, h.ofs);
 tcg_out_modrm_sib_offset(s, movop + h.seg, datahi,
-- 
2.34.1




[PULL 51/80] tcg/ppc: Support 128-bit load/store

2023-05-16 Thread Richard Henderson
Use LQ/STQ with ISA v2.07, when 16-byte atomicity is required.
Note that these instructions do not require 16-byte alignment.

Reviewed-by: Daniel Henrique Barboza 
Signed-off-by: Richard Henderson 
---
 tcg/ppc/tcg-target-con-set.h |   2 +
 tcg/ppc/tcg-target-con-str.h |   1 +
 tcg/ppc/tcg-target.h |   3 +-
 tcg/ppc/tcg-target.c.inc | 115 +++
 4 files changed, 108 insertions(+), 13 deletions(-)

diff --git a/tcg/ppc/tcg-target-con-set.h b/tcg/ppc/tcg-target-con-set.h
index f206b29205..bbd7b21247 100644
--- a/tcg/ppc/tcg-target-con-set.h
+++ b/tcg/ppc/tcg-target-con-set.h
@@ -14,6 +14,7 @@ C_O0_I2(r, r)
 C_O0_I2(r, ri)
 C_O0_I2(v, r)
 C_O0_I3(r, r, r)
+C_O0_I3(o, m, r)
 C_O0_I4(r, r, ri, ri)
 C_O0_I4(r, r, r, r)
 C_O1_I1(r, r)
@@ -34,6 +35,7 @@ C_O1_I3(v, v, v, v)
 C_O1_I4(r, r, ri, rZ, rZ)
 C_O1_I4(r, r, r, ri, ri)
 C_O2_I1(r, r, r)
+C_O2_I1(o, m, r)
 C_O2_I2(r, r, r, r)
 C_O2_I4(r, r, rI, rZM, r, r)
 C_O2_I4(r, r, r, r, rI, rZM)
diff --git a/tcg/ppc/tcg-target-con-str.h b/tcg/ppc/tcg-target-con-str.h
index 094613cbcb..20846901de 100644
--- a/tcg/ppc/tcg-target-con-str.h
+++ b/tcg/ppc/tcg-target-con-str.h
@@ -9,6 +9,7 @@
  * REGS(letter, register_mask)
  */
 REGS('r', ALL_GENERAL_REGS)
+REGS('o', ALL_GENERAL_REGS & 0xu)  /* odd registers */
 REGS('v', ALL_VECTOR_REGS)
 
 /*
diff --git a/tcg/ppc/tcg-target.h b/tcg/ppc/tcg-target.h
index 0914380bd7..204b70f86a 100644
--- a/tcg/ppc/tcg-target.h
+++ b/tcg/ppc/tcg-target.h
@@ -149,7 +149,8 @@ extern bool have_vsx;
 #define TCG_TARGET_HAS_mulsh_i641
 #endif
 
-#define TCG_TARGET_HAS_qemu_ldst_i128   0
+#define TCG_TARGET_HAS_qemu_ldst_i128   \
+(TCG_TARGET_REG_BITS == 64 && have_isa_2_07)
 
 /*
  * While technically Altivec could support V64, it has no 64-bit store
diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
index b5c49895f3..c3a1527856 100644
--- a/tcg/ppc/tcg-target.c.inc
+++ b/tcg/ppc/tcg-target.c.inc
@@ -295,25 +295,27 @@ static bool tcg_target_const_match(int64_t val, TCGType 
type, int ct)
 
 #define B  OPCD( 18)
 #define BC OPCD( 16)
+
 #define LBZOPCD( 34)
 #define LHZOPCD( 40)
 #define LHAOPCD( 42)
 #define LWZOPCD( 32)
 #define LWZUX  XO31( 55)
-#define STBOPCD( 38)
-#define STHOPCD( 44)
-#define STWOPCD( 36)
-
-#define STDXO62(  0)
-#define STDU   XO62(  1)
-#define STDX   XO31(149)
-
 #define LD XO58(  0)
 #define LDXXO31( 21)
 #define LDUXO58(  1)
 #define LDUX   XO31( 53)
 #define LWAXO58(  2)
 #define LWAX   XO31(341)
+#define LQ OPCD( 56)
+
+#define STBOPCD( 38)
+#define STHOPCD( 44)
+#define STWOPCD( 36)
+#define STDXO62(  0)
+#define STDU   XO62(  1)
+#define STDX   XO31(149)
+#define STQXO62(  2)
 
 #define ADDIC  OPCD( 12)
 #define ADDI   OPCD( 14)
@@ -2020,7 +2022,18 @@ typedef struct {
 
 bool tcg_target_has_memory_bswap(MemOp memop)
 {
-return true;
+TCGAtomAlign aa;
+
+if ((memop & MO_SIZE) <= MO_64) {
+return true;
+}
+
+/*
+ * Reject 16-byte memop with 16-byte atomicity,
+ * but do allow a pair of 64-bit operations.
+ */
+aa = atom_and_align_for_opc(tcg_ctx, memop, MO_ATOM_IFALIGN, true);
+return aa.atom <= MO_64;
 }
 
 /*
@@ -2035,7 +2048,7 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, 
HostAddress *h,
 {
 TCGLabelQemuLdst *ldst = NULL;
 MemOp opc = get_memop(oi);
-MemOp a_bits;
+MemOp a_bits, s_bits;
 
 /*
  * Book II, Section 1.4, Single-Copy Atomicity, specifies:
@@ -2047,10 +2060,11 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext 
*s, HostAddress *h,
  * As of 3.0, "the non-atomic access is performed as described in
  * the corresponding list", which matches MO_ATOM_SUBALIGN.
  */
+s_bits = opc & MO_SIZE;
 h->aa = atom_and_align_for_opc(s, opc,
have_isa_3_00 ? MO_ATOM_SUBALIGN
  : MO_ATOM_IFALIGN,
-   false);
+   s_bits == MO_128);
 a_bits = h->aa.align;
 
 #ifdef CONFIG_SOFTMMU
@@ -2060,7 +2074,6 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, 
HostAddress *h,
 int fast_off = TLB_MASK_TABLE_OFS(mem_index);
 int mask_off = fast_off + offsetof(CPUTLBDescFast, mask);
 int table_off = fast_off + offsetof(CPUTLBDescFast, table);
-unsigned s_bits = opc & MO_SIZE;
 
 ldst = new_ldst_label(s);
 ldst->is_ld = is_ld;
@@ -2303,6 +2316,70 @@ static void tcg_out_qemu_st(TCGContext *s, TCGReg 
datalo, TCGReg datahi,
 }
 }
 
+static TCGLabelQemuLdst *
+prepare_host_addr_index_only(TCGContext *s, HostAddress *h, TCGReg addr_reg,
+ MemOpIdx oi, bool is_ld)
+{
+TCGLabelQemuLdst *ldst;
+
+ldst = prepare_host_addr(s, h, addr_reg, -1, oi, true);
+
+/* Compose the final address, as LQ/STQ have no indexing. */
+if (h->base != 0) {
+

[PULL 80/80] tcg: Split out exec/user/guest-base.h

2023-05-16 Thread Richard Henderson
TCG will need this declaration, without all of the other
bits that come with cpu-all.h.

Reviewed-by: Thomas Huth 
Signed-off-by: Richard Henderson 
---
 include/exec/cpu-all.h |  5 +
 include/exec/user/guest-base.h | 12 
 tcg/tcg.c  |  3 +++
 3 files changed, 16 insertions(+), 4 deletions(-)
 create mode 100644 include/exec/user/guest-base.h

diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
index ad824fee52..78d258af44 100644
--- a/include/exec/cpu-all.h
+++ b/include/exec/cpu-all.h
@@ -84,11 +84,8 @@
 
 #if defined(CONFIG_USER_ONLY)
 #include "exec/user/abitypes.h"
+#include "exec/user/guest-base.h"
 
-/* On some host systems the guest address space is reserved on the host.
- * This allows the guest address space to be offset to a convenient location.
- */
-extern uintptr_t guest_base;
 extern bool have_guest_base;
 
 /*
diff --git a/include/exec/user/guest-base.h b/include/exec/user/guest-base.h
new file mode 100644
index 00..afe2ab7fbb
--- /dev/null
+++ b/include/exec/user/guest-base.h
@@ -0,0 +1,12 @@
+/* SPDX-License-Identifier: LGPL-2.1-or-later */
+/*
+ * Declaration of guest_base.
+ *  Copyright (c) 2003 Fabrice Bellard
+ */
+
+#ifndef EXEC_USER_GUEST_BASE_H
+#define EXEC_USER_GUEST_BASE_H
+
+extern uintptr_t guest_base;
+
+#endif
diff --git a/tcg/tcg.c b/tcg/tcg.c
index 4bd598c18b..6735d3f08d 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -63,6 +63,9 @@
 #include "tcg/tcg-temp-internal.h"
 #include "tcg-internal.h"
 #include "accel/tcg/perf.h"
+#ifdef CONFIG_USER_ONLY
+#include "exec/user/guest-base.h"
+#endif
 
 /* Forward declarations for functions declared in tcg-target.c.inc and
used here. */
-- 
2.34.1




[PULL 55/80] accel/tcg: Widen tcg-ldst.h addresses to uint64_t

2023-05-16 Thread Richard Henderson
Always pass the target address as uint64_t.
Adjust tcg_out_{ld,st}_helper_args to match.

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 include/tcg/tcg-ldst.h | 26 +-
 accel/tcg/cputlb.c | 26 +-
 accel/tcg/user-exec.c  | 26 +-
 tcg/tcg.c  | 62 --
 4 files changed, 87 insertions(+), 53 deletions(-)

diff --git a/include/tcg/tcg-ldst.h b/include/tcg/tcg-ldst.h
index 7dd57013e9..6ccfe9131d 100644
--- a/include/tcg/tcg-ldst.h
+++ b/include/tcg/tcg-ldst.h
@@ -26,38 +26,38 @@
 #define TCG_LDST_H
 
 /* Value zero-extended to tcg register size.  */
-tcg_target_ulong helper_ldub_mmu(CPUArchState *env, target_ulong addr,
+tcg_target_ulong helper_ldub_mmu(CPUArchState *env, uint64_t addr,
  MemOpIdx oi, uintptr_t retaddr);
-tcg_target_ulong helper_lduw_mmu(CPUArchState *env, target_ulong addr,
+tcg_target_ulong helper_lduw_mmu(CPUArchState *env, uint64_t addr,
  MemOpIdx oi, uintptr_t retaddr);
-tcg_target_ulong helper_ldul_mmu(CPUArchState *env, target_ulong addr,
+tcg_target_ulong helper_ldul_mmu(CPUArchState *env, uint64_t addr,
  MemOpIdx oi, uintptr_t retaddr);
-uint64_t helper_ldq_mmu(CPUArchState *env, target_ulong addr,
+uint64_t helper_ldq_mmu(CPUArchState *env, uint64_t addr,
 MemOpIdx oi, uintptr_t retaddr);
-Int128 helper_ld16_mmu(CPUArchState *env, target_ulong addr,
+Int128 helper_ld16_mmu(CPUArchState *env, uint64_t addr,
MemOpIdx oi, uintptr_t retaddr);
 
 /* Value sign-extended to tcg register size.  */
-tcg_target_ulong helper_ldsb_mmu(CPUArchState *env, target_ulong addr,
+tcg_target_ulong helper_ldsb_mmu(CPUArchState *env, uint64_t addr,
  MemOpIdx oi, uintptr_t retaddr);
-tcg_target_ulong helper_ldsw_mmu(CPUArchState *env, target_ulong addr,
+tcg_target_ulong helper_ldsw_mmu(CPUArchState *env, uint64_t addr,
  MemOpIdx oi, uintptr_t retaddr);
-tcg_target_ulong helper_ldsl_mmu(CPUArchState *env, target_ulong addr,
+tcg_target_ulong helper_ldsl_mmu(CPUArchState *env, uint64_t addr,
  MemOpIdx oi, uintptr_t retaddr);
 
 /*
  * Value extended to at least uint32_t, so that some ABIs do not require
  * zero-extension from uint8_t or uint16_t.
  */
-void helper_stb_mmu(CPUArchState *env, target_ulong addr, uint32_t val,
+void helper_stb_mmu(CPUArchState *env, uint64_t addr, uint32_t val,
 MemOpIdx oi, uintptr_t retaddr);
-void helper_stw_mmu(CPUArchState *env, target_ulong addr, uint32_t val,
+void helper_stw_mmu(CPUArchState *env, uint64_t addr, uint32_t val,
 MemOpIdx oi, uintptr_t retaddr);
-void helper_stl_mmu(CPUArchState *env, target_ulong addr, uint32_t val,
+void helper_stl_mmu(CPUArchState *env, uint64_t addr, uint32_t val,
 MemOpIdx oi, uintptr_t retaddr);
-void helper_stq_mmu(CPUArchState *env, target_ulong addr, uint64_t val,
+void helper_stq_mmu(CPUArchState *env, uint64_t addr, uint64_t val,
 MemOpIdx oi, uintptr_t retaddr);
-void helper_st16_mmu(CPUArchState *env, target_ulong addr, Int128 val,
+void helper_st16_mmu(CPUArchState *env, uint64_t addr, Int128 val,
  MemOpIdx oi, uintptr_t retaddr);
 
 #endif /* TCG_LDST_H */
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 49e49f75a4..5440f68deb 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -2367,7 +2367,7 @@ static uint8_t do_ld1_mmu(CPUArchState *env, target_ulong 
addr, MemOpIdx oi,
 return do_ld_1(env, &l.page[0], l.mmu_idx, access_type, ra);
 }
 
-tcg_target_ulong helper_ldub_mmu(CPUArchState *env, target_ulong addr,
+tcg_target_ulong helper_ldub_mmu(CPUArchState *env, uint64_t addr,
  MemOpIdx oi, uintptr_t retaddr)
 {
 tcg_debug_assert((get_memop(oi) & MO_SIZE) == MO_8);
@@ -2398,7 +2398,7 @@ static uint16_t do_ld2_mmu(CPUArchState *env, 
target_ulong addr, MemOpIdx oi,
 return ret;
 }
 
-tcg_target_ulong helper_lduw_mmu(CPUArchState *env, target_ulong addr,
+tcg_target_ulong helper_lduw_mmu(CPUArchState *env, uint64_t addr,
  MemOpIdx oi, uintptr_t retaddr)
 {
 tcg_debug_assert((get_memop(oi) & MO_SIZE) == MO_16);
@@ -2425,7 +2425,7 @@ static uint32_t do_ld4_mmu(CPUArchState *env, 
target_ulong addr, MemOpIdx oi,
 return ret;
 }
 
-tcg_target_ulong helper_ldul_mmu(CPUArchState *env, target_ulong addr,
+tcg_target_ulong helper_ldul_mmu(CPUArchState *env, uint64_t addr,
  MemOpIdx oi, uintptr_t retaddr)
 {
 tcg_debug_assert((get_memop(oi) & MO_SIZE) == MO_32);
@@ -2452,7 +2452,7 @@ static uint64_t do_ld8_mmu(CPUArchState *env, 
target_ulong addr, MemOpIdx oi,
 return ret;
 }
 
-uint64_t helper_ldq_mmu(CPUArchState *env, target_ulong addr,

[PULL 54/80] tcg: Widen gen_insn_data to uint64_t

2023-05-16 Thread Richard Henderson
We already pass uint64_t to restore_state_to_opc; this changes all
of the other uses from insn_start through the encoding to decoding.
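For orientation (not part of the patch): on a 32-bit host the 64-bit insn_start
value is carried in two 32-bit op arguments and put back together with
deposit64.  A minimal standalone sketch of that round trip, with deposit64's
semantics reproduced locally so the example compiles on its own:

    #include <assert.h>
    #include <stdint.h>

    /* Local stand-in for QEMU's deposit64(): insert "len" bits of "fieldval"
     * into "value" starting at bit "start". */
    static uint64_t deposit64(uint64_t value, int start, int len, uint64_t fieldval)
    {
        uint64_t mask = (~0ULL >> (64 - len)) << start;
        return (value & ~mask) | ((fieldval << start) & mask);
    }

    int main(void)
    {
        uint64_t pc = 0x123456789abcdef0ULL;

        /* Split across two 32-bit op arguments, as on a 32-bit host... */
        uint32_t arg_lo = (uint32_t)pc;
        uint32_t arg_hi = (uint32_t)(pc >> 32);

        /* ...and recombine when the insn_start parameter is read back. */
        uint64_t back = deposit64(arg_lo, 32, 32, arg_hi);
        assert(back == pc);
        return 0;
    }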

Reviewed-by: Anton Johansson 
Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 include/tcg/tcg-op.h  | 39 +--
 include/tcg/tcg-opc.h |  2 +-
 include/tcg/tcg.h | 30 +++---
 accel/tcg/translate-all.c | 28 
 tcg/tcg.c | 18 --
 5 files changed, 45 insertions(+), 72 deletions(-)

diff --git a/include/tcg/tcg-op.h b/include/tcg/tcg-op.h
index 4401fa493c..de3b70aa84 100644
--- a/include/tcg/tcg-op.h
+++ b/include/tcg/tcg-op.h
@@ -723,48 +723,27 @@ static inline void tcg_gen_concat32_i64(TCGv_i64 ret, 
TCGv_i64 lo, TCGv_i64 hi)
 #endif
 
 #if TARGET_INSN_START_WORDS == 1
-# if TARGET_LONG_BITS <= TCG_TARGET_REG_BITS
 static inline void tcg_gen_insn_start(target_ulong pc)
 {
-tcg_gen_op1(INDEX_op_insn_start, pc);
+TCGOp *op = tcg_emit_op(INDEX_op_insn_start, 64 / TCG_TARGET_REG_BITS);
+tcg_set_insn_start_param(op, 0, pc);
 }
-# else
-static inline void tcg_gen_insn_start(target_ulong pc)
-{
-tcg_gen_op2(INDEX_op_insn_start, (uint32_t)pc, (uint32_t)(pc >> 32));
-}
-# endif
 #elif TARGET_INSN_START_WORDS == 2
-# if TARGET_LONG_BITS <= TCG_TARGET_REG_BITS
 static inline void tcg_gen_insn_start(target_ulong pc, target_ulong a1)
 {
-tcg_gen_op2(INDEX_op_insn_start, pc, a1);
+TCGOp *op = tcg_emit_op(INDEX_op_insn_start, 2 * 64 / TCG_TARGET_REG_BITS);
+tcg_set_insn_start_param(op, 0, pc);
+tcg_set_insn_start_param(op, 1, a1);
 }
-# else
-static inline void tcg_gen_insn_start(target_ulong pc, target_ulong a1)
-{
-tcg_gen_op4(INDEX_op_insn_start,
-(uint32_t)pc, (uint32_t)(pc >> 32),
-(uint32_t)a1, (uint32_t)(a1 >> 32));
-}
-# endif
 #elif TARGET_INSN_START_WORDS == 3
-# if TARGET_LONG_BITS <= TCG_TARGET_REG_BITS
 static inline void tcg_gen_insn_start(target_ulong pc, target_ulong a1,
   target_ulong a2)
 {
-tcg_gen_op3(INDEX_op_insn_start, pc, a1, a2);
+TCGOp *op = tcg_emit_op(INDEX_op_insn_start, 3 * 64 / TCG_TARGET_REG_BITS);
+tcg_set_insn_start_param(op, 0, pc);
+tcg_set_insn_start_param(op, 1, a1);
+tcg_set_insn_start_param(op, 2, a2);
 }
-# else
-static inline void tcg_gen_insn_start(target_ulong pc, target_ulong a1,
-  target_ulong a2)
-{
-tcg_gen_op6(INDEX_op_insn_start,
-(uint32_t)pc, (uint32_t)(pc >> 32),
-(uint32_t)a1, (uint32_t)(a1 >> 32),
-(uint32_t)a2, (uint32_t)(a2 >> 32));
-}
-# endif
 #else
 # error "Unhandled number of operands to insn_start"
 #endif
diff --git a/include/tcg/tcg-opc.h b/include/tcg/tcg-opc.h
index 94cf7c5d6a..29216366d2 100644
--- a/include/tcg/tcg-opc.h
+++ b/include/tcg/tcg-opc.h
@@ -190,7 +190,7 @@ DEF(mulsh_i64, 1, 2, 0, IMPL64 | 
IMPL(TCG_TARGET_HAS_mulsh_i64))
 #define DATA64_ARGS  (TCG_TARGET_REG_BITS == 64 ? 1 : 2)
 
 /* QEMU specific */
-DEF(insn_start, 0, 0, TLADDR_ARGS * TARGET_INSN_START_WORDS,
+DEF(insn_start, 0, 0, DATA64_ARGS * TARGET_INSN_START_WORDS,
 TCG_OPF_NOT_PRESENT)
 DEF(exit_tb, 0, 0, 1, TCG_OPF_BB_EXIT | TCG_OPF_BB_END)
 DEF(goto_tb, 0, 0, 1, TCG_OPF_BB_EXIT | TCG_OPF_BB_END)
diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index b19e167e1d..f40de4177d 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -629,7 +629,7 @@ struct TCGContext {
 TCGTemp *reg_to_temp[TCG_TARGET_NB_REGS];
 
 uint16_t gen_insn_end_off[TCG_MAX_INSNS];
-target_ulong gen_insn_data[TCG_MAX_INSNS][TARGET_INSN_START_WORDS];
+uint64_t gen_insn_data[TCG_MAX_INSNS][TARGET_INSN_START_WORDS];
 
 /* Exit to translator on overflow. */
 sigjmp_buf jmp_trans;
@@ -771,24 +771,24 @@ static inline void tcg_set_insn_param(TCGOp *op, int arg, 
TCGArg v)
 op->args[arg] = v;
 }
 
-static inline target_ulong tcg_get_insn_start_param(TCGOp *op, int arg)
+static inline uint64_t tcg_get_insn_start_param(TCGOp *op, int arg)
 {
-#if TARGET_LONG_BITS <= TCG_TARGET_REG_BITS
-return tcg_get_insn_param(op, arg);
-#else
-return tcg_get_insn_param(op, arg * 2) |
-   ((uint64_t)tcg_get_insn_param(op, arg * 2 + 1) << 32);
-#endif
+if (TCG_TARGET_REG_BITS == 64) {
+return tcg_get_insn_param(op, arg);
+} else {
+return deposit64(tcg_get_insn_param(op, arg * 2), 32, 32,
+ tcg_get_insn_param(op, arg * 2 + 1));
+}
 }
 
-static inline void tcg_set_insn_start_param(TCGOp *op, int arg, target_ulong v)
+static inline void tcg_set_insn_start_param(TCGOp *op, int arg, uint64_t v)
 {
-#if TARGET_LONG_BITS <= TCG_TARGET_REG_BITS
-tcg_set_insn_param(op, arg, v);
-#else
-tcg_set_insn_param(op, arg * 2, v);
-tcg_set_insn_param(op, arg * 2 + 1, v >> 32);
-#endif
+if (TCG_TARGET_REG_BITS == 64) {
+tcg_set_insn_param(op, 

[PULL 64/80] tcg: Remove TCGv from tcg_gen_qemu_{ld,st}_*

2023-05-16 Thread Richard Henderson
Expand from TCGv to TCGTemp inline in the translators,
and validate that the size matches tcg_ctx->addr_type.
These inlines will eventually be seen only by target-specific code.

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 include/tcg/tcg-op.h |  50 ++-
 tcg/tcg-op-ldst.c| 343 ++-
 2 files changed, 251 insertions(+), 142 deletions(-)

diff --git a/include/tcg/tcg-op.h b/include/tcg/tcg-op.h
index de3b70aa84..e556450ba9 100644
--- a/include/tcg/tcg-op.h
+++ b/include/tcg/tcg-op.h
@@ -803,22 +803,60 @@ static inline void tcg_gen_plugin_cb_end(void)
 #define tcg_temp_new() tcg_temp_new_i32()
 #define tcg_global_mem_new tcg_global_mem_new_i32
 #define tcg_temp_free tcg_temp_free_i32
+#define tcgv_tl_temp tcgv_i32_temp
 #define tcg_gen_qemu_ld_tl tcg_gen_qemu_ld_i32
 #define tcg_gen_qemu_st_tl tcg_gen_qemu_st_i32
 #else
 #define tcg_temp_new() tcg_temp_new_i64()
 #define tcg_global_mem_new tcg_global_mem_new_i64
 #define tcg_temp_free tcg_temp_free_i64
+#define tcgv_tl_temp tcgv_i64_temp
 #define tcg_gen_qemu_ld_tl tcg_gen_qemu_ld_i64
 #define tcg_gen_qemu_st_tl tcg_gen_qemu_st_i64
 #endif
 
-void tcg_gen_qemu_ld_i32(TCGv_i32, TCGv, TCGArg, MemOp);
-void tcg_gen_qemu_st_i32(TCGv_i32, TCGv, TCGArg, MemOp);
-void tcg_gen_qemu_ld_i64(TCGv_i64, TCGv, TCGArg, MemOp);
-void tcg_gen_qemu_st_i64(TCGv_i64, TCGv, TCGArg, MemOp);
-void tcg_gen_qemu_ld_i128(TCGv_i128, TCGv, TCGArg, MemOp);
-void tcg_gen_qemu_st_i128(TCGv_i128, TCGv, TCGArg, MemOp);
+void tcg_gen_qemu_ld_i32_chk(TCGv_i32, TCGTemp *, TCGArg, MemOp, TCGType);
+void tcg_gen_qemu_st_i32_chk(TCGv_i32, TCGTemp *, TCGArg, MemOp, TCGType);
+void tcg_gen_qemu_ld_i64_chk(TCGv_i64, TCGTemp *, TCGArg, MemOp, TCGType);
+void tcg_gen_qemu_st_i64_chk(TCGv_i64, TCGTemp *, TCGArg, MemOp, TCGType);
+void tcg_gen_qemu_ld_i128_chk(TCGv_i128, TCGTemp *, TCGArg, MemOp, TCGType);
+void tcg_gen_qemu_st_i128_chk(TCGv_i128, TCGTemp *, TCGArg, MemOp, TCGType);
+
+static inline void
+tcg_gen_qemu_ld_i32(TCGv_i32 v, TCGv a, TCGArg i, MemOp m)
+{
+tcg_gen_qemu_ld_i32_chk(v, tcgv_tl_temp(a), i, m, TCG_TYPE_TL);
+}
+
+static inline void
+tcg_gen_qemu_st_i32(TCGv_i32 v, TCGv a, TCGArg i, MemOp m)
+{
+tcg_gen_qemu_st_i32_chk(v, tcgv_tl_temp(a), i, m, TCG_TYPE_TL);
+}
+
+static inline void
+tcg_gen_qemu_ld_i64(TCGv_i64 v, TCGv a, TCGArg i, MemOp m)
+{
+tcg_gen_qemu_ld_i64_chk(v, tcgv_tl_temp(a), i, m, TCG_TYPE_TL);
+}
+
+static inline void
+tcg_gen_qemu_st_i64(TCGv_i64 v, TCGv a, TCGArg i, MemOp m)
+{
+tcg_gen_qemu_st_i64_chk(v, tcgv_tl_temp(a), i, m, TCG_TYPE_TL);
+}
+
+static inline void
+tcg_gen_qemu_ld_i128(TCGv_i128 v, TCGv a, TCGArg i, MemOp m)
+{
+tcg_gen_qemu_ld_i128_chk(v, tcgv_tl_temp(a), i, m, TCG_TYPE_TL);
+}
+
+static inline void
+tcg_gen_qemu_st_i128(TCGv_i128 v, TCGv a, TCGArg i, MemOp m)
+{
+tcg_gen_qemu_st_i128_chk(v, tcgv_tl_temp(a), i, m, TCG_TYPE_TL);
+}
 
 void tcg_gen_atomic_cmpxchg_i32(TCGv_i32, TCGv, TCGv_i32, TCGv_i32,
 TCGArg, MemOp);
diff --git a/tcg/tcg-op-ldst.c b/tcg/tcg-op-ldst.c
index 2d5e98971d..84a03bf6ed 100644
--- a/tcg/tcg-op-ldst.c
+++ b/tcg/tcg-op-ldst.c
@@ -68,39 +68,38 @@ static inline MemOp tcg_canonicalize_memop(MemOp op, bool 
is64, bool st)
 return op;
 }
 
-static void gen_ldst_i32(TCGOpcode opc, TCGv_i32 val, TCGv addr,
- MemOp memop, TCGArg idx)
+static void gen_ldst(TCGOpcode opc, TCGTemp *vl, TCGTemp *vh,
+ TCGTemp *addr, MemOpIdx oi)
 {
-MemOpIdx oi = make_memop_idx(memop, idx);
-#if TARGET_LONG_BITS == 32
-tcg_gen_op3i_i32(opc, val, addr, oi);
-#else
-if (TCG_TARGET_REG_BITS == 32) {
-tcg_gen_op4i_i32(opc, val, TCGV_LOW(addr), TCGV_HIGH(addr), oi);
+if (TCG_TARGET_REG_BITS == 64 || tcg_ctx->addr_type == TCG_TYPE_I32) {
+if (vh) {
+tcg_gen_op4(opc, temp_arg(vl), temp_arg(vh), temp_arg(addr), oi);
+} else {
+tcg_gen_op3(opc, temp_arg(vl), temp_arg(addr), oi);
+}
 } else {
-tcg_gen_op3(opc, tcgv_i32_arg(val), tcgv_i64_arg(addr), oi);
+/* See TCGV_LOW/HIGH. */
+TCGTemp *al = addr + HOST_BIG_ENDIAN;
+TCGTemp *ah = addr + !HOST_BIG_ENDIAN;
+
+if (vh) {
+tcg_gen_op5(opc, temp_arg(vl), temp_arg(vh),
+temp_arg(al), temp_arg(ah), oi);
+} else {
+tcg_gen_op4(opc, temp_arg(vl), temp_arg(al), temp_arg(ah), oi);
+}
 }
-#endif
 }
 
-static void gen_ldst_i64(TCGOpcode opc, TCGv_i64 val, TCGv addr,
- MemOp memop, TCGArg idx)
+static void gen_ldst_i64(TCGOpcode opc, TCGv_i64 v, TCGTemp *addr, MemOpIdx oi)
 {
-MemOpIdx oi = make_memop_idx(memop, idx);
-#if TARGET_LONG_BITS == 32
 if (TCG_TARGET_REG_BITS == 32) {
-tcg_gen_op4i_i32(opc, TCGV_LOW(val), TCGV_HIGH(val), addr, oi);
+TCGTemp *vl = tcgv_i32_temp(TCGV_LOW(v));
+TCGTemp *vh = 

[PULL 61/80] tcg: Reduce copies for plugin_gen_mem_callbacks

2023-05-16 Thread Richard Henderson
We only need to make copies for loads, when the destination
overlaps the address.  For now, only eliminate the copy for
stores and 128-bit loads.

Rename plugin_prep_mem_callbacks to plugin_maybe_preserve_addr,
returning NULL if no copy is made.
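Illustrative toy (plain C, made-up names, not the TCG code itself) of why the
address must be preserved for loads but not stores: once a load writes its
destination, an address that shared the same storage is gone, so the plugin
callback has to use a copy taken beforehand.

    #include <assert.h>
    #include <stdint.h>

    static uint64_t memory[16];

    /* Model a "load" whose destination variable may be the very same
     * storage as the address operand, as can happen with TCG temps. */
    static void load_into(uint64_t *dst, uint64_t *addr_slot)
    {
        *dst = memory[*addr_slot];   /* clobbers *addr_slot when dst == addr_slot */
    }

    int main(void)
    {
        memory[3] = 42;

        uint64_t slot = 3;           /* value and address share one slot */
        uint64_t saved = slot;       /* the copy a later callback would need */

        load_into(&slot, &slot);     /* after this, the original address is gone */
        assert(slot == 42 && saved == 3);
        return 0;
    }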

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 tcg/tcg-op-ldst.c | 38 --
 1 file changed, 20 insertions(+), 18 deletions(-)

diff --git a/tcg/tcg-op-ldst.c b/tcg/tcg-op-ldst.c
index ca57a2779d..b695d2954e 100644
--- a/tcg/tcg-op-ldst.c
+++ b/tcg/tcg-op-ldst.c
@@ -114,7 +114,8 @@ static void tcg_gen_req_mo(TCGBar type)
 }
 }
 
-static inline TCGv plugin_prep_mem_callbacks(TCGv vaddr)
+/* Only required for loads, where value might overlap addr. */
+static TCGv plugin_maybe_preserve_addr(TCGv vaddr)
 {
 #ifdef CONFIG_PLUGIN
 if (tcg_ctx->plugin_insn != NULL) {
@@ -124,17 +125,20 @@ static inline TCGv plugin_prep_mem_callbacks(TCGv vaddr)
 return temp;
 }
 #endif
-return vaddr;
+return NULL;
 }
 
-static void plugin_gen_mem_callbacks(TCGv vaddr, MemOpIdx oi,
- enum qemu_plugin_mem_rw rw)
+static void
+plugin_gen_mem_callbacks(TCGv copy_addr, TCGv orig_addr, MemOpIdx oi,
+ enum qemu_plugin_mem_rw rw)
 {
 #ifdef CONFIG_PLUGIN
 if (tcg_ctx->plugin_insn != NULL) {
 qemu_plugin_meminfo_t info = make_plugin_meminfo(oi, rw);
-plugin_gen_empty_mem_callback(vaddr, info);
-tcg_temp_free(vaddr);
+plugin_gen_empty_mem_callback(copy_addr ? : orig_addr, info);
+if (copy_addr) {
+tcg_temp_free(copy_addr);
+}
 }
 #endif
 }
@@ -143,6 +147,7 @@ void tcg_gen_qemu_ld_i32(TCGv_i32 val, TCGv addr, TCGArg 
idx, MemOp memop)
 {
 MemOp orig_memop;
 MemOpIdx oi;
+TCGv copy_addr;
 
 tcg_gen_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD);
 memop = tcg_canonicalize_memop(memop, 0, 0);
@@ -157,9 +162,9 @@ void tcg_gen_qemu_ld_i32(TCGv_i32 val, TCGv addr, TCGArg 
idx, MemOp memop)
 }
 }
 
-addr = plugin_prep_mem_callbacks(addr);
+copy_addr = plugin_maybe_preserve_addr(addr);
 gen_ldst_i32(INDEX_op_qemu_ld_i32, val, addr, memop, idx);
-plugin_gen_mem_callbacks(addr, oi, QEMU_PLUGIN_MEM_R);
+plugin_gen_mem_callbacks(copy_addr, addr, oi, QEMU_PLUGIN_MEM_R);
 
 if ((orig_memop ^ memop) & MO_BSWAP) {
 switch (orig_memop & MO_SIZE) {
@@ -202,13 +207,12 @@ void tcg_gen_qemu_st_i32(TCGv_i32 val, TCGv addr, TCGArg 
idx, MemOp memop)
 memop &= ~MO_BSWAP;
 }
 
-addr = plugin_prep_mem_callbacks(addr);
 if (TCG_TARGET_HAS_qemu_st8_i32 && (memop & MO_SIZE) == MO_8) {
 gen_ldst_i32(INDEX_op_qemu_st8_i32, val, addr, memop, idx);
 } else {
 gen_ldst_i32(INDEX_op_qemu_st_i32, val, addr, memop, idx);
 }
-plugin_gen_mem_callbacks(addr, oi, QEMU_PLUGIN_MEM_W);
+plugin_gen_mem_callbacks(NULL, addr, oi, QEMU_PLUGIN_MEM_W);
 
 if (swap) {
 tcg_temp_free_i32(swap);
@@ -219,6 +223,7 @@ void tcg_gen_qemu_ld_i64(TCGv_i64 val, TCGv addr, TCGArg 
idx, MemOp memop)
 {
 MemOp orig_memop;
 MemOpIdx oi;
+TCGv copy_addr;
 
 if (TCG_TARGET_REG_BITS == 32 && (memop & MO_SIZE) < MO_64) {
 tcg_gen_qemu_ld_i32(TCGV_LOW(val), addr, idx, memop);
@@ -243,9 +248,9 @@ void tcg_gen_qemu_ld_i64(TCGv_i64 val, TCGv addr, TCGArg 
idx, MemOp memop)
 }
 }
 
-addr = plugin_prep_mem_callbacks(addr);
+copy_addr = plugin_maybe_preserve_addr(addr);
 gen_ldst_i64(INDEX_op_qemu_ld_i64, val, addr, memop, idx);
-plugin_gen_mem_callbacks(addr, oi, QEMU_PLUGIN_MEM_R);
+plugin_gen_mem_callbacks(copy_addr, addr, oi, QEMU_PLUGIN_MEM_R);
 
 if ((orig_memop ^ memop) & MO_BSWAP) {
 int flags = (orig_memop & MO_SIGN
@@ -300,9 +305,8 @@ void tcg_gen_qemu_st_i64(TCGv_i64 val, TCGv addr, TCGArg 
idx, MemOp memop)
 memop &= ~MO_BSWAP;
 }
 
-addr = plugin_prep_mem_callbacks(addr);
 gen_ldst_i64(INDEX_op_qemu_st_i64, val, addr, memop, idx);
-plugin_gen_mem_callbacks(addr, oi, QEMU_PLUGIN_MEM_W);
+plugin_gen_mem_callbacks(NULL, addr, oi, QEMU_PLUGIN_MEM_W);
 
 if (swap) {
 tcg_temp_free_i64(swap);
@@ -419,7 +423,6 @@ void tcg_gen_qemu_ld_i128(TCGv_i128 val, TCGv addr, TCGArg 
idx, MemOp memop)
 tcg_debug_assert((memop & MO_SIGN) == 0);
 
 tcg_gen_req_mo(TCG_MO_LD_LD | TCG_MO_ST_LD);
-addr = plugin_prep_mem_callbacks(addr);
 
 /* TODO: For now, force 32-bit hosts to use the helper. */
 if (TCG_TARGET_HAS_qemu_ldst_i128 && TCG_TARGET_REG_BITS == 64) {
@@ -490,7 +493,7 @@ void tcg_gen_qemu_ld_i128(TCGv_i128 val, TCGv addr, TCGArg 
idx, MemOp memop)
 maybe_free_addr64(a64);
 }
 
-plugin_gen_mem_callbacks(addr, oi, QEMU_PLUGIN_MEM_R);
+plugin_gen_mem_callbacks(NULL, addr, oi, QEMU_PLUGIN_MEM_R);
 }
 
 void tcg_gen_qemu_st_i128(TCGv_i128 val, TCGv addr, TCGArg idx, MemOp memop)
@@ -501,7 +504,6 @@ void 

[PULL 59/80] accel/tcg: Merge gen_mem_wrapped with plugin_gen_empty_mem_callback

2023-05-16 Thread Richard Henderson
As gen_mem_wrapped is only used in plugin_gen_empty_mem_callback,
we can avoid the curiosity of union mem_gen_fn by inlining it.

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 accel/tcg/plugin-gen.c | 30 ++
 1 file changed, 6 insertions(+), 24 deletions(-)

diff --git a/accel/tcg/plugin-gen.c b/accel/tcg/plugin-gen.c
index 5efb8db258..04facd6305 100644
--- a/accel/tcg/plugin-gen.c
+++ b/accel/tcg/plugin-gen.c
@@ -202,35 +202,17 @@ static void plugin_gen_empty_callback(enum 
plugin_gen_from from)
 }
 }
 
-union mem_gen_fn {
-void (*mem_fn)(TCGv, uint32_t);
-void (*inline_fn)(void);
-};
-
-static void gen_mem_wrapped(enum plugin_gen_cb type,
-const union mem_gen_fn *f, TCGv addr,
-uint32_t info, bool is_mem)
+void plugin_gen_empty_mem_callback(TCGv addr, uint32_t info)
 {
 enum qemu_plugin_mem_rw rw = get_plugin_meminfo_rw(info);
 
-gen_plugin_cb_start(PLUGIN_GEN_FROM_MEM, type, rw);
-if (is_mem) {
-f->mem_fn(addr, info);
-} else {
-f->inline_fn();
-}
+gen_plugin_cb_start(PLUGIN_GEN_FROM_MEM, PLUGIN_GEN_CB_MEM, rw);
+gen_empty_mem_cb(addr, info);
 tcg_gen_plugin_cb_end();
-}
 
-void plugin_gen_empty_mem_callback(TCGv addr, uint32_t info)
-{
-union mem_gen_fn fn;
-
-fn.mem_fn = gen_empty_mem_cb;
-gen_mem_wrapped(PLUGIN_GEN_CB_MEM, &fn, addr, info, true);
-
-fn.inline_fn = gen_empty_inline_cb;
-gen_mem_wrapped(PLUGIN_GEN_CB_INLINE, &fn, 0, info, false);
+gen_plugin_cb_start(PLUGIN_GEN_FROM_MEM, PLUGIN_GEN_CB_INLINE, rw);
+gen_empty_inline_cb();
+tcg_gen_plugin_cb_end();
 }
 
 static TCGOp *find_op(TCGOp *op, TCGOpcode opc)
-- 
2.34.1




[PULL 60/80] accel/tcg: Merge do_gen_mem_cb into caller

2023-05-16 Thread Richard Henderson
As do_gen_mem_cb is called once, merge it into gen_empty_mem_cb.

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 accel/tcg/plugin-gen.c | 39 +--
 1 file changed, 17 insertions(+), 22 deletions(-)

diff --git a/accel/tcg/plugin-gen.c b/accel/tcg/plugin-gen.c
index 04facd6305..907c5004a4 100644
--- a/accel/tcg/plugin-gen.c
+++ b/accel/tcg/plugin-gen.c
@@ -92,27 +92,6 @@ void HELPER(plugin_vcpu_mem_cb)(unsigned int vcpu_index,
 void *userdata)
 { }
 
-static void do_gen_mem_cb(TCGv vaddr, uint32_t info)
-{
-TCGv_i32 cpu_index = tcg_temp_ebb_new_i32();
-TCGv_i32 meminfo = tcg_temp_ebb_new_i32();
-TCGv_i64 vaddr64 = tcg_temp_ebb_new_i64();
-TCGv_ptr udata = tcg_temp_ebb_new_ptr();
-
-tcg_gen_movi_i32(meminfo, info);
-tcg_gen_movi_ptr(udata, 0);
-tcg_gen_ld_i32(cpu_index, cpu_env,
-   -offsetof(ArchCPU, env) + offsetof(CPUState, cpu_index));
-tcg_gen_extu_tl_i64(vaddr64, vaddr);
-
-gen_helper_plugin_vcpu_mem_cb(cpu_index, meminfo, vaddr64, udata);
-
-tcg_temp_free_ptr(udata);
-tcg_temp_free_i64(vaddr64);
-tcg_temp_free_i32(meminfo);
-tcg_temp_free_i32(cpu_index);
-}
-
 static void gen_empty_udata_cb(void)
 {
 TCGv_i32 cpu_index = tcg_temp_ebb_new_i32();
@@ -147,7 +126,23 @@ static void gen_empty_inline_cb(void)
 
 static void gen_empty_mem_cb(TCGv addr, uint32_t info)
 {
-do_gen_mem_cb(addr, info);
+TCGv_i32 cpu_index = tcg_temp_ebb_new_i32();
+TCGv_i32 meminfo = tcg_temp_ebb_new_i32();
+TCGv_i64 addr64 = tcg_temp_ebb_new_i64();
+TCGv_ptr udata = tcg_temp_ebb_new_ptr();
+
+tcg_gen_movi_i32(meminfo, info);
+tcg_gen_movi_ptr(udata, 0);
+tcg_gen_ld_i32(cpu_index, cpu_env,
+   -offsetof(ArchCPU, env) + offsetof(CPUState, cpu_index));
+tcg_gen_extu_tl_i64(addr64, addr);
+
+gen_helper_plugin_vcpu_mem_cb(cpu_index, meminfo, addr64, udata);
+
+tcg_temp_free_ptr(udata);
+tcg_temp_free_i64(addr64);
+tcg_temp_free_i32(meminfo);
+tcg_temp_free_i32(cpu_index);
 }
 
 /*
-- 
2.34.1




Re: [PATCH v2] piix: fix regression during unplug in Xen HVM domUs

2023-05-16 Thread Olaf Hering
On Tue, 16 May 2023 13:38:42 -0400, John Snow wrote:

> I haven't touched IDE or block code in quite a long while now -- I
> don't think I can help land this fix, but I won't get in anyone's way,
> either. Maybe just re-submit the patches with an improved commit
> message / cover letter that helps collect the info from the previous
> thread, the core issue, etc.

I poked at it some more over the past few days. Paolo was right in 2019: this
issue needs to be debugged further to really understand why fiddling
with one PCI device breaks another, apparently unrelated PCI device.

Once I know more, I will suggest a new change. The old one is
stale, and needs to be rebased anyway.


Olaf




[PULL 42/80] tcg/mips: Use atom_and_align_for_opc

2023-05-16 Thread Richard Henderson
Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 tcg/mips/tcg-target.c.inc | 15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/tcg/mips/tcg-target.c.inc b/tcg/mips/tcg-target.c.inc
index cd0254a0d7..3f3fe5b991 100644
--- a/tcg/mips/tcg-target.c.inc
+++ b/tcg/mips/tcg-target.c.inc
@@ -1138,7 +1138,7 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, 
TCGLabelQemuLdst *l)
 
 typedef struct {
 TCGReg base;
-MemOp align;
+TCGAtomAlign aa;
 } HostAddress;
 
 bool tcg_target_has_memory_bswap(MemOp memop)
@@ -1158,11 +1158,15 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext 
*s, HostAddress *h,
 {
 TCGLabelQemuLdst *ldst = NULL;
 MemOp opc = get_memop(oi);
-unsigned a_bits = get_alignment_bits(opc);
+MemOp a_bits;
 unsigned s_bits = opc & MO_SIZE;
-unsigned a_mask = (1 << a_bits) - 1;
+unsigned a_mask;
 TCGReg base;
 
+h->aa = atom_and_align_for_opc(s, opc, MO_ATOM_IFALIGN, false);
+a_bits = h->aa.align;
+a_mask = (1 << a_bits) - 1;
+
 #ifdef CONFIG_SOFTMMU
 unsigned s_mask = (1 << s_bits) - 1;
 int mem_index = get_mmuidx(oi);
@@ -1281,7 +1285,6 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, 
HostAddress *h,
 #endif
 
 h->base = base;
-h->align = a_bits;
 return ldst;
 }
 
@@ -1394,7 +1397,7 @@ static void tcg_out_qemu_ld(TCGContext *s, TCGReg datalo, 
TCGReg datahi,
 
 ldst = prepare_host_addr(s, &h, addrlo, addrhi, oi, true);
 
-if (use_mips32r6_instructions || h.align >= (opc & MO_SIZE)) {
+if (use_mips32r6_instructions || h.aa.align >= (opc & MO_SIZE)) {
 tcg_out_qemu_ld_direct(s, datalo, datahi, h.base, opc, data_type);
 } else {
 tcg_out_qemu_ld_unalign(s, datalo, datahi, h.base, opc, data_type);
@@ -1481,7 +1484,7 @@ static void tcg_out_qemu_st(TCGContext *s, TCGReg datalo, 
TCGReg datahi,
 
 ldst = prepare_host_addr(s, &h, addrlo, addrhi, oi, false);
 
-if (use_mips32r6_instructions || h.align >= (opc & MO_SIZE)) {
+if (use_mips32r6_instructions || h.aa.align >= (opc & MO_SIZE)) {
 tcg_out_qemu_st_direct(s, datalo, datahi, h.base, opc);
 } else {
 tcg_out_qemu_st_unalign(s, datalo, datahi, h.base, opc);
-- 
2.34.1




[PULL 09/80] meson: Detect atomic128 support with optimization

2023-05-16 Thread Richard Henderson
There is an edge condition prior to gcc13 for which optimization
is required to generate 16-byte atomic sequences.  Detect this.
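Roughly, the pattern being probed looks like the following standalone C sketch
(illustrative only; the function names are made up, a 64-bit host is assumed,
and linking may still need -mcx16 or -latomic depending on the toolchain):

    #if defined(__GNUC__) && !defined(__clang__)
    # define ATTR_OPT __attribute__((optimize("O1")))
    #else
    # define ATTR_OPT
    #endif

    /* Promise 16-byte alignment, then do the 16-byte atomic load.  Prior to
     * GCC 13 the load may only be inlined when this function is optimized. */
    static ATTR_OPT unsigned __int128 load_16(void *pv)
    {
        unsigned __int128 *p = __builtin_assume_aligned(pv, 16);
        return __atomic_load_n(p, __ATOMIC_RELAXED);
    }

    int main(void)
    {
        static unsigned __int128 x __attribute__((aligned(16)));
        return (int)load_16(&x);
    }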

Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 meson.build| 52 ++
 accel/tcg/ldst_atomicity.c.inc | 29 ---
 2 files changed, 59 insertions(+), 22 deletions(-)

diff --git a/meson.build b/meson.build
index d3cf48960b..61de8199cb 100644
--- a/meson.build
+++ b/meson.build
@@ -2249,23 +2249,21 @@ config_host_data.set('HAVE_BROKEN_SIZE_MAX', not 
cc.compiles('''
 return printf("%zu", SIZE_MAX);
 }''', args: ['-Werror']))
 
-atomic_test = '''
+# See if 64-bit atomic operations are supported.
+# Note that without __atomic builtins, we can only
+# assume atomic loads/stores max at pointer size.
+config_host_data.set('CONFIG_ATOMIC64', cc.links('''
  #include <stdint.h>
   int main(void)
   {
-@0@ x = 0, y = 0;
+uint64_t x = 0, y = 0;
 y = __atomic_load_n(&x, __ATOMIC_RELAXED);
 __atomic_store_n(&x, y, __ATOMIC_RELAXED);
 __atomic_compare_exchange_n(&x, &y, x, 0, __ATOMIC_RELAXED, 
__ATOMIC_RELAXED);
 __atomic_exchange_n(&x, y, __ATOMIC_RELAXED);
 __atomic_fetch_add(&x, y, __ATOMIC_RELAXED);
 return 0;
-  }'''
-
-# See if 64-bit atomic operations are supported.
-# Note that without __atomic builtins, we can only
-# assume atomic loads/stores max at pointer size.
-config_host_data.set('CONFIG_ATOMIC64', 
cc.links(atomic_test.format('uint64_t')))
+  }'''))
 
 has_int128 = cc.links('''
   __int128_t a;
@@ -2283,21 +2281,39 @@ if has_int128
   # "do we have 128-bit atomics which are handled inline and specifically not
   # via libatomic". The reason we can't use libatomic is documented in the
   # comment starting "GCC is a house divided" in include/qemu/atomic128.h.
-  has_atomic128 = cc.links(atomic_test.format('unsigned __int128'))
+  # We only care about these operations on 16-byte aligned pointers, so
+  # force 16-byte alignment of the pointer, which may be greater than
+  # __alignof(unsigned __int128) for the host.
+  atomic_test_128 = '''
+int main(int ac, char **av) {
+  unsigned __int128 *p = __builtin_assume_aligned(av[ac - 1], sizeof(16));
+  p[1] = __atomic_load_n(&p[0], __ATOMIC_RELAXED);
+  __atomic_store_n(&p[2], p[3], __ATOMIC_RELAXED);
+  __atomic_compare_exchange_n(&p[4], &p[5], p[6], 0, __ATOMIC_RELAXED, 
__ATOMIC_RELAXED);
+  return 0;
+}'''
+  has_atomic128 = cc.links(atomic_test_128)
 
   config_host_data.set('CONFIG_ATOMIC128', has_atomic128)
 
   if not has_atomic128
-has_cmpxchg128 = cc.links('''
-  int main(void)
-  {
-unsigned __int128 x = 0, y = 0;
-__sync_val_compare_and_swap_16(&x, y, x);
-return 0;
-  }
-''')
+# Even with __builtin_assume_aligned, the above test may have failed
+# without optimization enabled.  Try again with optimizations locally
+# enabled for the function.  See
+#   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107389
+has_atomic128_opt = cc.links('__attribute__((optimize("O1")))' + 
atomic_test_128)
+config_host_data.set('CONFIG_ATOMIC128_OPT', has_atomic128_opt)
 
-config_host_data.set('CONFIG_CMPXCHG128', has_cmpxchg128)
+if not has_atomic128_opt
+  config_host_data.set('CONFIG_CMPXCHG128', cc.links('''
+int main(void)
+{
+  unsigned __int128 x = 0, y = 0;
+  __sync_val_compare_and_swap_16(&x, y, x);
+  return 0;
+}
+  '''))
+endif
   endif
 endif
 
diff --git a/accel/tcg/ldst_atomicity.c.inc b/accel/tcg/ldst_atomicity.c.inc
index ce73b32def..ba5db7c366 100644
--- a/accel/tcg/ldst_atomicity.c.inc
+++ b/accel/tcg/ldst_atomicity.c.inc
@@ -16,6 +16,23 @@
 #endif
 #define HAVE_al8_fast  (ATOMIC_REG_SIZE >= 8)
 
+/*
+ * If __alignof(unsigned __int128) < 16, GCC may refuse to inline atomics
+ * that are supported by the host, e.g. s390x.  We can force the pointer to
+ * have our known alignment with __builtin_assume_aligned, however prior to
+ * GCC 13 that was only reliable with optimization enabled.  See
+ *   https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107389
+ */
+#if defined(CONFIG_ATOMIC128_OPT)
+# if !defined(__OPTIMIZE__)
+#  define ATTRIBUTE_ATOMIC128_OPT  __attribute__((optimize("O1")))
+# endif
+# define CONFIG_ATOMIC128
+#endif
+#ifndef ATTRIBUTE_ATOMIC128_OPT
+# define ATTRIBUTE_ATOMIC128_OPT
+#endif
+
 #if defined(CONFIG_ATOMIC128)
# define HAVE_al16_fast  true
 #else
@@ -152,7 +169,8 @@ static inline uint64_t load_atomic8(void *pv)
  *
  * Atomically load 16 aligned bytes from @pv.
  */
-static inline Int128 load_atomic16(void *pv)
+static inline Int128 ATTRIBUTE_ATOMIC128_OPT
+load_atomic16(void *pv)
 {
 #ifdef CONFIG_ATOMIC128
 __uint128_t *p = __builtin_assume_aligned(pv, 16);
@@ -356,7 +374,8 @@ static uint64_t load_atom_extract_al16_or_exit(CPUArchState 
*env, uintptr_t ra,
  * cross an 16-byte boundary then the access must be 16-byte atomic,
  * 

[PULL 46/80] tcg/sparc64: Use atom_and_align_for_opc

2023-05-16 Thread Richard Henderson
Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 tcg/sparc64/tcg-target.c.inc | 21 -
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/tcg/sparc64/tcg-target.c.inc b/tcg/sparc64/tcg-target.c.inc
index bb23038529..9676b745a2 100644
--- a/tcg/sparc64/tcg-target.c.inc
+++ b/tcg/sparc64/tcg-target.c.inc
@@ -1009,6 +1009,7 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, 
TCGLabelQemuLdst *lb)
 typedef struct {
 TCGReg base;
 TCGReg index;
+TCGAtomAlign aa;
 } HostAddress;
 
 bool tcg_target_has_memory_bswap(MemOp memop)
@@ -1028,13 +1029,13 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext 
*s, HostAddress *h,
 {
 TCGLabelQemuLdst *ldst = NULL;
 MemOp opc = get_memop(oi);
-unsigned a_bits = get_alignment_bits(opc);
-unsigned s_bits = opc & MO_SIZE;
+MemOp s_bits = opc & MO_SIZE;
 unsigned a_mask;
 
 /* We don't support unaligned accesses. */
-a_bits = MAX(a_bits, s_bits);
-a_mask = (1u << a_bits) - 1;
+h->aa = atom_and_align_for_opc(s, opc, MO_ATOM_IFALIGN, false);
+h->aa.align = MAX(h->aa.align, s_bits);
+a_mask = (1u << h->aa.align) - 1;
 
 #ifdef CONFIG_SOFTMMU
 int mem_index = get_mmuidx(oi);
@@ -1086,11 +1087,13 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext 
*s, HostAddress *h,
 cc = TARGET_LONG_BITS == 64 ? BPCC_XCC : BPCC_ICC;
 tcg_out_bpcc0(s, COND_NE, BPCC_PN | cc, 0);
 #else
-if (a_bits != s_bits) {
-/*
- * Test for at least natural alignment, and defer
- * everything else to the helper functions.
- */
+/*
+ * If the size equals the required alignment, we can skip the test
+ * and allow host SIGBUS to deliver SIGBUS to the guest.
+ * Otherwise, test for at least natural alignment and defer
+ * everything else to the helper functions.
+ */
+if (s_bits != get_alignment_bits(opc)) {
 tcg_debug_assert(check_fit_tl(a_mask, 13));
 tcg_out_arithi(s, TCG_REG_G0, addr_reg, a_mask, ARITH_ANDCC);
 
-- 
2.34.1




[PULL 74/80] tcg/aarch64: Remove TARGET_LONG_BITS, TCG_TYPE_TL

2023-05-16 Thread Richard Henderson
All uses replaced with TCGContext.addr_type.

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 tcg/aarch64/tcg-target.c.inc | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc
index 8d78838796..41838f8170 100644
--- a/tcg/aarch64/tcg-target.c.inc
+++ b/tcg/aarch64/tcg-target.c.inc
@@ -1661,7 +1661,7 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, 
HostAddress *h,
TCGReg addr_reg, MemOpIdx oi,
bool is_ld)
 {
-TCGType addr_type = TARGET_LONG_BITS == 64 ? TCG_TYPE_I64 : TCG_TYPE_I32;
+TCGType addr_type = s->addr_type;
 TCGLabelQemuLdst *ldst = NULL;
 MemOp opc = get_memop(oi);
 MemOp s_bits = opc & MO_SIZE;
@@ -1705,7 +1705,7 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, 
HostAddress *h,
 tcg_out_insn(s, 3502, ADD, 1, TCG_REG_X1, TCG_REG_X1, TCG_REG_X0);
 
 /* Load the tlb comparator into X0, and the fast path addend into X1.  */
-tcg_out_ld(s, TCG_TYPE_TL, TCG_REG_X0, TCG_REG_X1,
+tcg_out_ld(s, addr_type, TCG_REG_X0, TCG_REG_X1,
is_ld ? offsetof(CPUTLBEntry, addr_read)
  : offsetof(CPUTLBEntry, addr_write));
 tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_X1, TCG_REG_X1,
@@ -1719,18 +1719,17 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext 
*s, HostAddress *h,
 if (a_mask >= s_mask) {
 x3 = addr_reg;
 } else {
-tcg_out_insn(s, 3401, ADDI, TARGET_LONG_BITS == 64,
+tcg_out_insn(s, 3401, ADDI, addr_type,
  TCG_REG_X3, addr_reg, s_mask - a_mask);
 x3 = TCG_REG_X3;
 }
 compare_mask = (uint64_t)TARGET_PAGE_MASK | a_mask;
 
 /* Store the page mask part of the address into X3.  */
-tcg_out_logicali(s, I3404_ANDI, TARGET_LONG_BITS == 64,
- TCG_REG_X3, x3, compare_mask);
+tcg_out_logicali(s, I3404_ANDI, addr_type, TCG_REG_X3, x3, compare_mask);
 
 /* Perform the address comparison. */
-tcg_out_cmp(s, TARGET_LONG_BITS == 64, TCG_REG_X0, TCG_REG_X3, 0);
+tcg_out_cmp(s, addr_type, TCG_REG_X0, TCG_REG_X3, 0);
 
 /* If not equal, we jump to the slow path. */
 ldst->label_ptr[0] = s->code_ptr;
-- 
2.34.1




[PULL 50/80] tcg/aarch64: Support 128-bit load/store

2023-05-16 Thread Richard Henderson
Use LDXP+STXP when LSE2 is not present and 16-byte atomicity is required,
and LDP/STP otherwise.  This requires allocating a second general-purpose
temporary, as Rs cannot overlap Rn in STXP.
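For context, a hand-written analogue of the emitted sequence (AArch64-only,
illustrative; the helper name is made up): the LDXP/STXP pair retries until
the store-exclusive succeeds, and the status register written by STXP (Rs)
must be distinct from the address register (Rn), which is why the backend
needs the extra scratch register.

    #include <stdint.h>

    static inline unsigned __int128 atomic16_read_exclusive(unsigned __int128 *ptr)
    {
        uint64_t lo, hi;
        uint32_t ok;

        asm volatile("0: ldxp %0, %1, [%3]\n\t"
                     "   stxp %w2, %0, %1, [%3]\n\t"   /* %w2 (Rs) != %3 (Rn) */
                     "   cbnz %w2, 0b"                  /* retry until exclusive */
                     : "=&r"(lo), "=&r"(hi), "=&r"(ok)
                     : "r"(ptr)
                     : "memory");
        return ((unsigned __int128)hi << 64) | lo;
    }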

Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 tcg/aarch64/tcg-target-con-set.h |   2 +
 tcg/aarch64/tcg-target.h |  11 +-
 tcg/aarch64/tcg-target.c.inc | 179 ++-
 3 files changed, 189 insertions(+), 3 deletions(-)

diff --git a/tcg/aarch64/tcg-target-con-set.h b/tcg/aarch64/tcg-target-con-set.h
index d6c6866878..74065c7098 100644
--- a/tcg/aarch64/tcg-target-con-set.h
+++ b/tcg/aarch64/tcg-target-con-set.h
@@ -14,6 +14,7 @@ C_O0_I2(lZ, l)
 C_O0_I2(r, rA)
 C_O0_I2(rZ, r)
 C_O0_I2(w, r)
+C_O0_I3(lZ, lZ, l)
 C_O1_I1(r, l)
 C_O1_I1(r, r)
 C_O1_I1(w, r)
@@ -33,4 +34,5 @@ C_O1_I2(w, w, wO)
 C_O1_I2(w, w, wZ)
 C_O1_I3(w, w, w, w)
 C_O1_I4(r, r, rA, rZ, rZ)
+C_O2_I1(r, r, l)
 C_O2_I4(r, r, rZ, rZ, rA, rMZ)
diff --git a/tcg/aarch64/tcg-target.h b/tcg/aarch64/tcg-target.h
index 74ee2ed255..2c079f21c2 100644
--- a/tcg/aarch64/tcg-target.h
+++ b/tcg/aarch64/tcg-target.h
@@ -129,7 +129,16 @@ extern bool have_lse2;
 #define TCG_TARGET_HAS_muluh_i64    1
 #define TCG_TARGET_HAS_mulsh_i64    1
 
-#define TCG_TARGET_HAS_qemu_ldst_i128   0
+/*
+ * Without FEAT_LSE2, we must use LDXP+STXP to implement atomic 128-bit load,
+ * which requires writable pages.  We must defer to the helper for user-only,
+ * but in system mode all ram is writable for the host.
+ */
+#ifdef CONFIG_USER_ONLY
+#define TCG_TARGET_HAS_qemu_ldst_i128   have_lse2
+#else
+#define TCG_TARGET_HAS_qemu_ldst_i128   1
+#endif
 
 #define TCG_TARGET_HAS_v64  1
 #define TCG_TARGET_HAS_v128 1
diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc
index 1ed5be2c00..893b3514bb 100644
--- a/tcg/aarch64/tcg-target.c.inc
+++ b/tcg/aarch64/tcg-target.c.inc
@@ -81,6 +81,7 @@ bool have_lse;
 bool have_lse2;
 
 #define TCG_REG_TMP0 TCG_REG_X30
+#define TCG_REG_TMP1 TCG_REG_X17
 #define TCG_VEC_TMP0 TCG_REG_V31
 
 #ifndef CONFIG_SOFTMMU
@@ -404,6 +405,10 @@ typedef enum {
 I3305_LDR_v64   = 0x5c00,
 I3305_LDR_v128  = 0x9c00,
 
+/* Load/store exclusive. */
+I3306_LDXP  = 0xc860,
+I3306_STXP  = 0xc820,
+
 /* Load/store register.  Described here as 3.3.12, but the helper
that emits them can transform to 3.3.10 or 3.3.13.  */
 I3312_STRB  = 0x3800 | LDST_ST << 22 | MO_8 << 30,
@@ -468,6 +473,9 @@ typedef enum {
 I3406_ADR   = 0x1000,
 I3406_ADRP  = 0x9000,
 
+/* Add/subtract extended register instructions. */
+I3501_ADD   = 0x0b20,
+
 /* Add/subtract shifted register instructions (without a shift).  */
 I3502_ADD   = 0x0b00,
 I3502_ADDS  = 0x2b00,
@@ -638,6 +646,12 @@ static void tcg_out_insn_3305(TCGContext *s, AArch64Insn 
insn,
 tcg_out32(s, insn | (imm19 & 0x7ffff) << 5 | rt);
 }
 
+static void tcg_out_insn_3306(TCGContext *s, AArch64Insn insn, TCGReg rs,
+  TCGReg rt, TCGReg rt2, TCGReg rn)
+{
+tcg_out32(s, insn | rs << 16 | rt2 << 10 | rn << 5 | rt);
+}
+
 static void tcg_out_insn_3201(TCGContext *s, AArch64Insn insn, TCGType ext,
   TCGReg rt, int imm19)
 {
@@ -720,6 +734,14 @@ static void tcg_out_insn_3406(TCGContext *s, AArch64Insn 
insn,
 tcg_out32(s, insn | (disp & 3) << 29 | (disp & 0x1c) << (5 - 2) | rd);
 }
 
+static inline void tcg_out_insn_3501(TCGContext *s, AArch64Insn insn,
+ TCGType sf, TCGReg rd, TCGReg rn,
+ TCGReg rm, int opt, int imm3)
+{
+tcg_out32(s, insn | sf << 31 | rm << 16 | opt << 13 |
+  imm3 << 10 | rn << 5 | rd);
+}
+
 /* This function is for both 3.5.2 (Add/Subtract shifted register), for
the rare occasion when we actually want to supply a shift amount.  */
 static inline void tcg_out_insn_3502S(TCGContext *s, AArch64Insn insn,
@@ -1647,16 +1669,16 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext 
*s, HostAddress *h,
 TCGType addr_type = TARGET_LONG_BITS == 64 ? TCG_TYPE_I64 : TCG_TYPE_I32;
 TCGLabelQemuLdst *ldst = NULL;
 MemOp opc = get_memop(oi);
+MemOp s_bits = opc & MO_SIZE;
 unsigned a_mask;
 
 h->aa = atom_and_align_for_opc(s, opc,
have_lse2 ? MO_ATOM_WITHIN16
  : MO_ATOM_IFALIGN,
-   false);
+   s_bits == MO_128);
 a_mask = (1 << h->aa.align) - 1;
 
 #ifdef CONFIG_SOFTMMU
-unsigned s_bits = opc & MO_SIZE;
 unsigned s_mask = (1u << s_bits) - 1;
 unsigned mem_index = get_mmuidx(oi);
 TCGReg x3;
@@ -1837,6 +1859,148 @@ static void tcg_out_qemu_st(TCGContext *s, TCGReg 
data_reg, TCGReg addr_reg,
 }
 }
 
+static TCGLabelQemuLdst *

[PULL 43/80] tcg/ppc: Use atom_and_align_for_opc

2023-05-16 Thread Richard Henderson
Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 tcg/ppc/tcg-target.c.inc | 19 ++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
index b62a163014..b5c49895f3 100644
--- a/tcg/ppc/tcg-target.c.inc
+++ b/tcg/ppc/tcg-target.c.inc
@@ -2015,6 +2015,7 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, 
TCGLabelQemuLdst *lb)
 typedef struct {
 TCGReg base;
 TCGReg index;
+TCGAtomAlign aa;
 } HostAddress;
 
 bool tcg_target_has_memory_bswap(MemOp memop)
@@ -2034,7 +2035,23 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext 
*s, HostAddress *h,
 {
 TCGLabelQemuLdst *ldst = NULL;
 MemOp opc = get_memop(oi);
-unsigned a_bits = get_alignment_bits(opc);
+MemOp a_bits;
+
+/*
+ * Book II, Section 1.4, Single-Copy Atomicity, specifies:
+ *
+ * Before 3.0, "An access that is not atomic is performed as a set of
+ * smaller disjoint atomic accesses. In general, the number and alignment
+ * of these accesses are implementation-dependent."  Thus MO_ATOM_IFALIGN.
+ *
+ * As of 3.0, "the non-atomic access is performed as described in
+ * the corresponding list", which matches MO_ATOM_SUBALIGN.
+ */
+h->aa = atom_and_align_for_opc(s, opc,
+   have_isa_3_00 ? MO_ATOM_SUBALIGN
+ : MO_ATOM_IFALIGN,
+   false);
+a_bits = h->aa.align;
 
 #ifdef CONFIG_SOFTMMU
 int mem_index = get_mmuidx(oi);
-- 
2.34.1




[PULL 71/80] tcg/i386: Remove TARGET_LONG_BITS, TCG_TYPE_TL

2023-05-16 Thread Richard Henderson
All uses can be inferred from the INDEX_op_qemu_*_a{32,64}_* opcode
being used.  Add a field into TCGLabelQemuLdst to record the usage.

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 tcg/i386/tcg-target.c.inc | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index 653e3e10a8..e173853dc4 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -1975,10 +1975,8 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext 
*s, HostAddress *h,
 ldst->addrhi_reg = addrhi;
 
 if (TCG_TARGET_REG_BITS == 64) {
-if (TARGET_LONG_BITS == 64) {
-ttype = TCG_TYPE_I64;
-trexw = P_REXW;
-}
+ttype = s->addr_type;
+trexw = (ttype == TCG_TYPE_I32 ? 0 : P_REXW);
 if (TCG_TYPE_PTR == TCG_TYPE_I64) {
 hrexw = P_REXW;
 if (TARGET_PAGE_BITS + CPU_TLB_DYN_MAX_BITS > 32) {
@@ -2023,7 +2021,7 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, 
HostAddress *h,
 ldst->label_ptr[0] = s->code_ptr;
 s->code_ptr += 4;
 
-if (TARGET_LONG_BITS > TCG_TARGET_REG_BITS) {
+if (TCG_TARGET_REG_BITS == 32 && s->addr_type == TCG_TYPE_I64) {
 /* cmp 4(TCG_REG_L0), addrhi */
 tcg_out_modrm_offset(s, OPC_CMP_GvEv, addrhi, TCG_REG_L0, cmp_ofs + 4);
 
-- 
2.34.1




[PULL 41/80] tcg/loongarch64: Use atom_and_align_for_opc

2023-05-16 Thread Richard Henderson
Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 tcg/loongarch64/tcg-target.c.inc | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/tcg/loongarch64/tcg-target.c.inc b/tcg/loongarch64/tcg-target.c.inc
index d26174dde5..07d35f92fa 100644
--- a/tcg/loongarch64/tcg-target.c.inc
+++ b/tcg/loongarch64/tcg-target.c.inc
@@ -826,6 +826,7 @@ static bool tcg_out_qemu_st_slow_path(TCGContext *s, 
TCGLabelQemuLdst *l)
 typedef struct {
 TCGReg base;
 TCGReg index;
+TCGAtomAlign aa;
 } HostAddress;
 
 bool tcg_target_has_memory_bswap(MemOp memop)
@@ -845,7 +846,10 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, 
HostAddress *h,
 {
 TCGLabelQemuLdst *ldst = NULL;
 MemOp opc = get_memop(oi);
-unsigned a_bits = get_alignment_bits(opc);
+MemOp a_bits;
+
+h->aa = atom_and_align_for_opc(s, opc, MO_ATOM_IFALIGN, false);
+a_bits = h->aa.align;
 
 #ifdef CONFIG_SOFTMMU
 unsigned s_bits = opc & MO_SIZE;
-- 
2.34.1




[PULL 11/80] tcg/aarch64: Detect have_lse, have_lse2 for linux

2023-05-16 Thread Richard Henderson
Notice when the host has additional atomic instructions.
The new variables will also be used in generated code.

Reviewed-by: Peter Maydell 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 tcg/aarch64/tcg-target.h |  3 +++
 tcg/aarch64/tcg-target.c.inc | 12 
 2 files changed, 15 insertions(+)

diff --git a/tcg/aarch64/tcg-target.h b/tcg/aarch64/tcg-target.h
index c0b0f614ba..3c0b0d312d 100644
--- a/tcg/aarch64/tcg-target.h
+++ b/tcg/aarch64/tcg-target.h
@@ -57,6 +57,9 @@ typedef enum {
 #define TCG_TARGET_CALL_ARG_I128    TCG_CALL_ARG_EVEN
 #define TCG_TARGET_CALL_RET_I128    TCG_CALL_RET_NORMAL
 
+extern bool have_lse;
+extern bool have_lse2;
+
 /* optional instructions */
 #define TCG_TARGET_HAS_div_i32  1
 #define TCG_TARGET_HAS_rem_i32  1
diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc
index e6636c1f8b..fc551a3d10 100644
--- a/tcg/aarch64/tcg-target.c.inc
+++ b/tcg/aarch64/tcg-target.c.inc
@@ -13,6 +13,9 @@
 #include "../tcg-ldst.c.inc"
 #include "../tcg-pool.c.inc"
 #include "qemu/bitops.h"
+#ifdef __linux__
+#include <asm/hwcap.h>
+#endif
 
 /* We're going to re-use TCGType in setting of the SF bit, which controls
the size of the operation performed.  If we know the values match, it
@@ -71,6 +74,9 @@ static TCGReg tcg_target_call_oarg_reg(TCGCallReturnKind 
kind, int slot)
 return TCG_REG_X0 + slot;
 }
 
+bool have_lse;
+bool have_lse2;
+
 #define TCG_REG_TMP TCG_REG_X30
 #define TCG_VEC_TMP TCG_REG_V31
 
@@ -2899,6 +2905,12 @@ static TCGConstraintSetIndex tcg_target_op_def(TCGOpcode 
op)
 
 static void tcg_target_init(TCGContext *s)
 {
+#ifdef __linux__
+unsigned long hwcap = qemu_getauxval(AT_HWCAP);
+have_lse = hwcap & HWCAP_ATOMICS;
+have_lse2 = hwcap & HWCAP_USCAT;
+#endif
+
 tcg_target_available_regs[TCG_TYPE_I32] = 0xffffffffu;
 tcg_target_available_regs[TCG_TYPE_I64] = 0xffffffffu;
 tcg_target_available_regs[TCG_TYPE_V64] = 0xffffffff00000000ull;
-- 
2.34.1




[PULL 70/80] tcg/i386: Adjust type of tlb_mask

2023-05-16 Thread Richard Henderson
Because of its use in tgen_arithi, this value must be a signed
32-bit quantity, as that is what may be encoded in the insn.
The truncation of the value to unsigned for 32-bit guests is
done via the REX bit, i.e. 'trexw'.

Removes the only uses of target_ulong from this tcg backend.
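A tiny standalone illustration of why a plain signed 32-bit value suffices
(hypothetical 4 KiB page size, names made up): the AND instruction's 32-bit
immediate is sign-extended by the CPU, so the int already denotes the full
64-bit page mask.

    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        /* e.g. TARGET_PAGE_MASK for 12-bit pages, with no alignment bits set */
        int32_t tlb_mask = (int32_t)0xfffff000u;

        /* AND r64, imm32 sign-extends its immediate, equivalent to: */
        uint64_t effective = (uint64_t)(int64_t)tlb_mask;

        printf("imm32 = %#" PRIx32 ", effective 64-bit mask = %#" PRIx64 "\n",
               (uint32_t)tlb_mask, effective);   /* ...fffffffffffff000 */
        return 0;
    }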

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 tcg/i386/tcg-target.c.inc | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index aed5bbd94c..653e3e10a8 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -1966,7 +1966,7 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, 
HostAddress *h,
 int trexw = 0, hrexw = 0, tlbrexw = 0;
 unsigned mem_index = get_mmuidx(oi);
 unsigned s_mask = (1 << s_bits) - 1;
-target_ulong tlb_mask;
+int tlb_mask;
 
 ldst = new_ldst_label(s);
 ldst->is_ld = is_ld;
@@ -2011,7 +2011,7 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, 
HostAddress *h,
 tcg_out_modrm_offset(s, OPC_LEA + trexw, TCG_REG_L1,
  addrlo, s_mask - a_mask);
 }
-tlb_mask = (target_ulong)TARGET_PAGE_MASK | a_mask;
+tlb_mask = TARGET_PAGE_MASK | a_mask;
 tgen_arithi(s, ARITH_AND + trexw, TCG_REG_L1, tlb_mask, 0);
 
 /* cmp 0(TCG_REG_L0), TCG_REG_L1 */
-- 
2.34.1




[PULL 76/80] tcg/mips: Remove TARGET_LONG_BITS, TCG_TYPE_TL

2023-05-16 Thread Richard Henderson
All uses replaced with TCGContext.addr_type.

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 tcg/mips/tcg-target.c.inc | 42 +--
 1 file changed, 23 insertions(+), 19 deletions(-)

diff --git a/tcg/mips/tcg-target.c.inc b/tcg/mips/tcg-target.c.inc
index 7ff4e2ff71..209d95992e 100644
--- a/tcg/mips/tcg-target.c.inc
+++ b/tcg/mips/tcg-target.c.inc
@@ -354,10 +354,6 @@ typedef enum {
 /* Aliases for convenience.  */
 ALIAS_PADD = sizeof(void *) == 4 ? OPC_ADDU : OPC_DADDU,
 ALIAS_PADDI= sizeof(void *) == 4 ? OPC_ADDIU : OPC_DADDIU,
-ALIAS_TSRL = TARGET_LONG_BITS == 32 || TCG_TARGET_REG_BITS == 32
- ? OPC_SRL : OPC_DSRL,
-ALIAS_TADDI= TARGET_LONG_BITS == 32 || TCG_TARGET_REG_BITS == 32
- ? OPC_ADDIU : OPC_DADDIU,
 } MIPSInsn;
 
 /*
@@ -1156,6 +1152,7 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, 
HostAddress *h,
TCGReg addrlo, TCGReg addrhi,
MemOpIdx oi, bool is_ld)
 {
+TCGType addr_type = s->addr_type;
 TCGLabelQemuLdst *ldst = NULL;
 MemOp opc = get_memop(oi);
 MemOp a_bits;
@@ -1190,23 +1187,26 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext 
*s, HostAddress *h,
 tcg_out_ld(s, TCG_TYPE_PTR, TCG_TMP1, TCG_AREG0, table_off);
 
 /* Extract the TLB index from the address into TMP3.  */
-tcg_out_opc_sa(s, ALIAS_TSRL, TCG_TMP3, addrlo,
-   TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
+if (TCG_TARGET_REG_BITS == 32 || addr_type == TCG_TYPE_I32) {
+tcg_out_opc_sa(s, OPC_SRL, TCG_TMP3, addrlo,
+   TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
+} else {
+tcg_out_dsrl(s, TCG_TMP3, addrlo,
+ TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
+}
 tcg_out_opc_reg(s, OPC_AND, TCG_TMP3, TCG_TMP3, TCG_TMP0);
 
 /* Add the tlb_table pointer, creating the CPUTLBEntry address in TMP3.  */
 tcg_out_opc_reg(s, ALIAS_PADD, TCG_TMP3, TCG_TMP3, TCG_TMP1);
 
-/* Load the (low-half) tlb comparator.  */
-if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
-tcg_out_ldst(s, OPC_LW, TCG_TMP0, TCG_TMP3, cmp_off + LO_OFF);
-} else {
-tcg_out_ld(s, TCG_TYPE_TL, TCG_TMP0, TCG_TMP3, cmp_off);
-}
-
-if (TCG_TARGET_REG_BITS >= TARGET_LONG_BITS) {
+if (TCG_TARGET_REG_BITS == 64 || addr_type == TCG_TYPE_I32) {
+/* Load the tlb comparator.  */
+tcg_out_ld(s, addr_type, TCG_TMP0, TCG_TMP3, cmp_off);
 /* Load the tlb addend for the fast path.  */
 tcg_out_ld(s, TCG_TYPE_PTR, TCG_TMP3, TCG_TMP3, add_off);
+} else {
+/* Load the low half of the tlb comparator.  */
+tcg_out_ldst(s, OPC_LW, TCG_TMP0, TCG_TMP3, cmp_off + LO_OFF);
 }
 
 /*
@@ -1214,16 +1214,20 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext 
*s, HostAddress *h,
  * For unaligned accesses, compare against the end of the access to
  * verify that it does not cross a page boundary.
  */
-tcg_out_movi(s, TCG_TYPE_TL, TCG_TMP1, TARGET_PAGE_MASK | a_mask);
+tcg_out_movi(s, addr_type, TCG_TMP1, TARGET_PAGE_MASK | a_mask);
 if (a_mask < s_mask) {
-tcg_out_opc_imm(s, ALIAS_TADDI, TCG_TMP2, addrlo, s_mask - a_mask);
+if (TCG_TARGET_REG_BITS == 32 || addr_type == TCG_TYPE_I32) {
+tcg_out_opc_imm(s, OPC_ADDIU, TCG_TMP2, addrlo, s_mask - a_mask);
+} else {
+tcg_out_opc_imm(s, OPC_DADDIU, TCG_TMP2, addrlo, s_mask - a_mask);
+}
 tcg_out_opc_reg(s, OPC_AND, TCG_TMP1, TCG_TMP1, TCG_TMP2);
 } else {
 tcg_out_opc_reg(s, OPC_AND, TCG_TMP1, TCG_TMP1, addrlo);
 }
 
 /* Zero extend a 32-bit guest address for a 64-bit host. */
-if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
+if (TCG_TARGET_REG_BITS == 64 && addr_type == TCG_TYPE_I32) {
 tcg_out_ext32u(s, TCG_TMP2, addrlo);
 addrlo = TCG_TMP2;
 }
@@ -1232,7 +1236,7 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, 
HostAddress *h,
 tcg_out_opc_br(s, OPC_BNE, TCG_TMP1, TCG_TMP0);
 
 /* Load and test the high half tlb comparator.  */
-if (TCG_TARGET_REG_BITS < TARGET_LONG_BITS) {
+if (TCG_TARGET_REG_BITS == 32 && addr_type != TCG_TYPE_I32) {
 /* delay slot */
 tcg_out_ldst(s, OPC_LW, TCG_TMP0, TCG_TMP3, cmp_off + HI_OFF);
 
@@ -1269,7 +1273,7 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, 
HostAddress *h,
 }
 
 base = addrlo;
-if (TCG_TARGET_REG_BITS > TARGET_LONG_BITS) {
+if (TCG_TARGET_REG_BITS == 64 && addr_type == TCG_TYPE_I32) {
 tcg_out_ext32u(s, TCG_REG_A0, base);
 base = TCG_REG_A0;
 }
-- 
2.34.1




[PULL 77/80] tcg: Remove TARGET_LONG_BITS, TCG_TYPE_TL

2023-05-16 Thread Richard Henderson
All uses replaced with TCGContext.addr_type.

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 tcg/tcg.c | 27 ++-
 1 file changed, 14 insertions(+), 13 deletions(-)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index 5a2b2b1371..4bd598c18b 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -5612,12 +5612,7 @@ static void tcg_out_ld_helper_args(TCGContext *s, const 
TCGLabelQemuLdst *ldst,
 next_arg = 1;
 
 loc = &info->in[next_arg];
-if (TCG_TARGET_REG_BITS == 64 || TARGET_LONG_BITS == 64) {
-nmov = tcg_out_helper_add_mov(mov, loc, TCG_TYPE_I64, TCG_TYPE_TL,
-  ldst->addrlo_reg, ldst->addrhi_reg);
-tcg_out_helper_load_slots(s, nmov, mov, parm);
-next_arg += nmov;
-} else {
+if (TCG_TARGET_REG_BITS == 32 && s->addr_type == TCG_TYPE_I32) {
 /*
  * 32-bit host with 32-bit guest: zero-extend the guest address
  * to 64-bits for the helper by storing the low part, then
@@ -5631,6 +5626,11 @@ static void tcg_out_ld_helper_args(TCGContext *s, const 
TCGLabelQemuLdst *ldst,
 tcg_out_helper_load_imm(s, loc[!HOST_BIG_ENDIAN].arg_slot,
 TCG_TYPE_I32, 0, parm);
 next_arg += 2;
+} else {
+nmov = tcg_out_helper_add_mov(mov, loc, TCG_TYPE_I64, s->addr_type,
+  ldst->addrlo_reg, ldst->addrhi_reg);
+tcg_out_helper_load_slots(s, nmov, mov, parm);
+next_arg += nmov;
 }
 
 switch (info->out_kind) {
@@ -5785,12 +5785,7 @@ static void tcg_out_st_helper_args(TCGContext *s, const 
TCGLabelQemuLdst *ldst,
 
 /* Handle addr argument. */
 loc = &info->in[next_arg];
-if (TCG_TARGET_REG_BITS == 64 || TARGET_LONG_BITS == 64) {
-n = tcg_out_helper_add_mov(mov, loc, TCG_TYPE_I64, TCG_TYPE_TL,
-   ldst->addrlo_reg, ldst->addrhi_reg);
-next_arg += n;
-nmov += n;
-} else {
+if (TCG_TARGET_REG_BITS == 32 && s->addr_type == TCG_TYPE_I32) {
 /*
  * 32-bit host with 32-bit guest: zero-extend the guest address
  * to 64-bits for the helper by storing the low part.  Later,
@@ -5802,6 +5797,11 @@ static void tcg_out_st_helper_args(TCGContext *s, const 
TCGLabelQemuLdst *ldst,
ldst->addrlo_reg, -1);
 next_arg += 2;
 nmov += 1;
+} else {
+n = tcg_out_helper_add_mov(mov, loc, TCG_TYPE_I64, s->addr_type,
+   ldst->addrlo_reg, ldst->addrhi_reg);
+next_arg += n;
+nmov += n;
 }
 
 /* Handle data argument. */
@@ -5847,7 +5847,8 @@ static void tcg_out_st_helper_args(TCGContext *s, const 
TCGLabelQemuLdst *ldst,
 g_assert_not_reached();
 }
 
-if (TCG_TARGET_REG_BITS == 32 && TARGET_LONG_BITS == 32) {
+if (TCG_TARGET_REG_BITS == 32 && s->addr_type == TCG_TYPE_I32) {
+/* Zero extend the address by loading a zero for the high part. */
 loc = >in[1 + !HOST_BIG_ENDIAN];
 tcg_out_helper_load_imm(s, loc->arg_slot, TCG_TYPE_I32, 0, parm);
 }
-- 
2.34.1




[PULL 69/80] tcg/i386: Conditionalize tcg_out_extu_i32_i64

2023-05-16 Thread Richard Henderson
Since TCG_TYPE_I32 values are kept zero-extended in registers, via
omission of the REXW bit, we need not extend if the register matches.
This is already relied upon by qemu_{ld,st}.
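A small x86-64-only illustration (hypothetical, using inline asm) of the
property relied on here: any 32-bit register write implicitly zeroes the
upper half of the 64-bit register, so when source and destination coincide
no extra zero-extension instruction is needed.

    #include <assert.h>
    #include <stdint.h>

    int main(void)
    {
        uint64_t r = 0xdeadbeefcafebabeULL;

        /* A 32-bit mov (no REX.W) clears bits 63:32 of the destination. */
        asm("movl %k0, %k0" : "+r"(r));

        assert(r == 0xcafebabeULL);
        return 0;
    }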

Reviewed-by: Alex Bennée 
Reviewed-by: Philippe Mathieu-Daudé 
Signed-off-by: Richard Henderson 
---
 tcg/i386/tcg-target.c.inc | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index 647c31fa23..aed5bbd94c 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -1323,7 +1323,9 @@ static void tcg_out_exts_i32_i64(TCGContext *s, TCGReg 
dest, TCGReg src)
 
 static void tcg_out_extu_i32_i64(TCGContext *s, TCGReg dest, TCGReg src)
 {
-tcg_out_ext32u(s, dest, src);
+if (dest != src) {
+tcg_out_ext32u(s, dest, src);
+}
 }
 
 static void tcg_out_extrl_i64_i32(TCGContext *s, TCGReg dest, TCGReg src)
-- 
2.34.1




[PULL 40/80] tcg/arm: Use atom_and_align_for_opc

2023-05-16 Thread Richard Henderson
No change to the ultimate load/store routines yet, so some atomicity
conditions are not yet honored, but this plumbs the alignment change
through the relevant functions.

Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 tcg/arm/tcg-target.c.inc | 39 ++-
 1 file changed, 22 insertions(+), 17 deletions(-)

diff --git a/tcg/arm/tcg-target.c.inc b/tcg/arm/tcg-target.c.inc
index e5aed03247..add8cc1fd5 100644
--- a/tcg/arm/tcg-target.c.inc
+++ b/tcg/arm/tcg-target.c.inc
@@ -1323,6 +1323,7 @@ typedef struct {
 TCGReg base;
 int index;
 bool index_scratch;
+TCGAtomAlign aa;
 } HostAddress;
 
 bool tcg_target_has_memory_bswap(MemOp memop)
@@ -1379,8 +1380,26 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext 
*s, HostAddress *h,
 {
 TCGLabelQemuLdst *ldst = NULL;
 MemOp opc = get_memop(oi);
-MemOp a_bits = get_alignment_bits(opc);
-unsigned a_mask = (1 << a_bits) - 1;
+unsigned a_mask;
+
+#ifdef CONFIG_SOFTMMU
+*h = (HostAddress){
+.cond = COND_AL,
+.base = addrlo,
+.index = TCG_REG_R1,
+.index_scratch = true,
+};
+#else
+*h = (HostAddress){
+.cond = COND_AL,
+.base = addrlo,
+.index = guest_base ? TCG_REG_GUEST_BASE : -1,
+.index_scratch = false,
+};
+#endif
+
+h->aa = atom_and_align_for_opc(s, opc, MO_ATOM_IFALIGN, false);
+a_mask = (1 << h->aa.align) - 1;
 
 #ifdef CONFIG_SOFTMMU
 int mem_index = get_mmuidx(oi);
@@ -1469,13 +1488,6 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext 
*s, HostAddress *h,
 if (TARGET_LONG_BITS == 64) {
 tcg_out_dat_reg(s, COND_EQ, ARITH_CMP, 0, TCG_REG_R3, addrhi, 0);
 }
-
-*h = (HostAddress){
-.cond = COND_AL,
-.base = addrlo,
-.index = TCG_REG_R1,
-.index_scratch = true,
-};
 #else
 if (a_mask) {
 ldst = new_ldst_label(s);
@@ -1484,18 +1496,11 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext 
*s, HostAddress *h,
 ldst->addrlo_reg = addrlo;
 ldst->addrhi_reg = addrhi;
 
-/* We are expecting a_bits to max out at 7 */
+/* We are expecting alignment to max out at 7 */
 tcg_debug_assert(a_mask <= 0xff);
 /* tst addr, #mask */
 tcg_out_dat_imm(s, COND_AL, ARITH_TST, 0, addrlo, a_mask);
 }
-
-*h = (HostAddress){
-.cond = COND_AL,
-.base = addrlo,
-.index = guest_base ? TCG_REG_GUEST_BASE : -1,
-.index_scratch = false,
-};
 #endif
 
 return ldst;
-- 
2.34.1




[PULL 10/80] tcg/i386: Add have_atomic16

2023-05-16 Thread Richard Henderson
Notice when Intel or AMD have guaranteed that vmovdqa is atomic.
The new variable will also be used in generated code.

Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 include/qemu/cpuid.h  | 18 ++
 tcg/i386/tcg-target.h |  1 +
 tcg/i386/tcg-target.c.inc | 27 +++
 3 files changed, 46 insertions(+)

diff --git a/include/qemu/cpuid.h b/include/qemu/cpuid.h
index 1451e8ef2f..35325f1995 100644
--- a/include/qemu/cpuid.h
+++ b/include/qemu/cpuid.h
@@ -71,6 +71,24 @@
 #define bit_LZCNT   (1 << 5)
 #endif
 
+/*
+ * Signatures for different CPU implementations as returned from Leaf 0.
+ */
+
+#ifndef signature_INTEL_ecx
+/* "Genu" "ineI" "ntel" */
+#define signature_INTEL_ebx 0x756e6547
+#define signature_INTEL_edx 0x49656e69
+#define signature_INTEL_ecx 0x6c65746e
+#endif
+
+#ifndef signature_AMD_ecx
+/* "Auth" "enti" "cAMD" */
+#define signature_AMD_ebx   0x68747541
+#define signature_AMD_edx   0x69746e65
+#define signature_AMD_ecx   0x444d4163
+#endif
+
 static inline unsigned xgetbv_low(unsigned c)
 {
 unsigned a, d;
diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h
index d4f2a6f8c2..0421776cb8 100644
--- a/tcg/i386/tcg-target.h
+++ b/tcg/i386/tcg-target.h
@@ -120,6 +120,7 @@ extern bool have_avx512dq;
 extern bool have_avx512vbmi2;
 extern bool have_avx512vl;
 extern bool have_movbe;
+extern bool have_atomic16;
 
 /* optional instructions */
 #define TCG_TARGET_HAS_div2_i32 1
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index 826f7764c9..911123cfa8 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -185,6 +185,7 @@ bool have_avx512dq;
 bool have_avx512vbmi2;
 bool have_avx512vl;
 bool have_movbe;
+bool have_atomic16;
 
 #ifdef CONFIG_CPUID_H
 static bool have_bmi2;
@@ -4026,6 +4027,32 @@ static void tcg_target_init(TCGContext *s)
 have_avx512dq = (b7 & bit_AVX512DQ) != 0;
 have_avx512vbmi2 = (c7 & bit_AVX512VBMI2) != 0;
 }
+
+/*
+ * The Intel SDM has added:
+ *   Processors that enumerate support for Intel® AVX
+ *   (by setting the feature flag CPUID.01H:ECX.AVX[bit 28])
+ *   guarantee that the 16-byte memory operations performed
+ *   by the following instructions will always be carried
+ *   out atomically:
+ *   - MOVAPD, MOVAPS, and MOVDQA.
+ *   - VMOVAPD, VMOVAPS, and VMOVDQA when encoded with VEX.128.
+ *   - VMOVAPD, VMOVAPS, VMOVDQA32, and VMOVDQA64 when encoded
+ * with EVEX.128 and k0 (masking disabled).
+ * Note that these instructions require the linear addresses
+ * of their memory operands to be 16-byte aligned.
+ *
+ * AMD has provided an even stronger guarantee that processors
+ * with AVX provide 16-byte atomicity for all cacheable,
+ * naturally aligned single loads and stores, e.g. MOVDQU.
+ *
+ * See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104688
+ */
+if (have_avx1) {
+__cpuid(0, a, b, c, d);
+have_atomic16 = (c == signature_INTEL_ecx ||
+ c == signature_AMD_ecx);
+}
 }
 }
 }
-- 
2.34.1




[PULL 31/80] tcg/riscv: Support softmmu unaligned accesses

2023-05-16 Thread Richard Henderson
The system is required to emulate unaligned accesses, even if the
hardware does not support them.  The resulting trap may or may not
be more efficient than the qemu slow path.  There are linux kernel
patches in flight to allow userspace to query hardware support;
we can re-evaluate whether to enable this by default after that.

In the meantime, softmmu now matches useronly, where we already
assumed that unaligned accesses are supported.
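
As a side note, the page-crossing trick can be summarized host-independently
(hypothetical helper, not in the patch): when unaligned accesses are allowed,
the TLB comparator value is derived from the address of the last byte of the
access, so an access that spills into the next page changes the page number
and falls back to the slow path.

    #include <stdint.h>

    /* Illustrative only: compute the masked value compared against the TLB
     * comparator.  a_mask = alignment - 1, s_mask = access size - 1. */
    static uint64_t tlb_compare_value(uint64_t addr, uint64_t page_mask,
                                      unsigned a_mask, unsigned s_mask)
    {
        uint64_t addr_adj = addr;

        if (a_mask < s_mask) {
            /* Unaligned permitted: check the last byte of the access. */
            addr_adj += s_mask - a_mask;
        }
        /* Keep the page number plus the alignment bits being enforced. */
        return addr_adj & (page_mask | a_mask);
    }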

Reviewed-by: LIU Zhiwei 
Signed-off-by: Richard Henderson 
---
 tcg/riscv/tcg-target.c.inc | 48 ++
 1 file changed, 28 insertions(+), 20 deletions(-)

diff --git a/tcg/riscv/tcg-target.c.inc b/tcg/riscv/tcg-target.c.inc
index 19cd4507fb..415e6c6e15 100644
--- a/tcg/riscv/tcg-target.c.inc
+++ b/tcg/riscv/tcg-target.c.inc
@@ -910,12 +910,13 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, 
TCGReg *pbase,
 
 #ifdef CONFIG_SOFTMMU
 unsigned s_bits = opc & MO_SIZE;
+unsigned s_mask = (1u << s_bits) - 1;
 int mem_index = get_mmuidx(oi);
 int fast_ofs = TLB_MASK_TABLE_OFS(mem_index);
 int mask_ofs = fast_ofs + offsetof(CPUTLBDescFast, mask);
 int table_ofs = fast_ofs + offsetof(CPUTLBDescFast, table);
-TCGReg mask_base = TCG_AREG0, table_base = TCG_AREG0;
-tcg_target_long compare_mask;
+int compare_mask;
+TCGReg addr_adj;
 
 ldst = new_ldst_label(s);
 ldst->is_ld = is_ld;
@@ -924,14 +925,33 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, 
TCGReg *pbase,
 
 QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) > 0);
 QEMU_BUILD_BUG_ON(TLB_MASK_TABLE_OFS(0) < -(1 << 11));
-tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP0, mask_base, mask_ofs);
-tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP1, table_base, table_ofs);
+tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP0, TCG_AREG0, mask_ofs);
+tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP1, TCG_AREG0, table_ofs);
 
 tcg_out_opc_imm(s, OPC_SRLI, TCG_REG_TMP2, addr_reg,
 TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
 tcg_out_opc_reg(s, OPC_AND, TCG_REG_TMP2, TCG_REG_TMP2, TCG_REG_TMP0);
 tcg_out_opc_reg(s, OPC_ADD, TCG_REG_TMP2, TCG_REG_TMP2, TCG_REG_TMP1);
 
+/*
+ * For aligned accesses, we check the first byte and include the alignment
+ * bits within the address.  For unaligned access, we check that we don't
+ * cross pages using the address of the last byte of the access.
+ */
+addr_adj = addr_reg;
+if (a_bits < s_bits) {
+addr_adj = TCG_REG_TMP0;
+tcg_out_opc_imm(s, TARGET_LONG_BITS == 32 ? OPC_ADDIW : OPC_ADDI,
+addr_adj, addr_reg, s_mask - a_mask);
+}
+compare_mask = TARGET_PAGE_MASK | a_mask;
+if (compare_mask == sextreg(compare_mask, 0, 12)) {
+tcg_out_opc_imm(s, OPC_ANDI, TCG_REG_TMP1, addr_adj, compare_mask);
+} else {
+tcg_out_movi(s, TCG_TYPE_TL, TCG_REG_TMP1, compare_mask);
+tcg_out_opc_reg(s, OPC_AND, TCG_REG_TMP1, TCG_REG_TMP1, addr_adj);
+}
+
 /* Load the tlb comparator and the addend.  */
 tcg_out_ld(s, TCG_TYPE_TL, TCG_REG_TMP0, TCG_REG_TMP2,
is_ld ? offsetof(CPUTLBEntry, addr_read)
@@ -939,29 +959,17 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, 
TCGReg *pbase,
 tcg_out_ld(s, TCG_TYPE_PTR, TCG_REG_TMP2, TCG_REG_TMP2,
offsetof(CPUTLBEntry, addend));
 
-/* We don't support unaligned accesses. */
-if (a_bits < s_bits) {
-a_bits = s_bits;
-}
-/* Clear the non-page, non-alignment bits from the address.  */
-compare_mask = (tcg_target_long)TARGET_PAGE_MASK | a_mask;
-if (compare_mask == sextreg(compare_mask, 0, 12)) {
-tcg_out_opc_imm(s, OPC_ANDI, TCG_REG_TMP1, addr_reg, compare_mask);
-} else {
-tcg_out_movi(s, TCG_TYPE_TL, TCG_REG_TMP1, compare_mask);
-tcg_out_opc_reg(s, OPC_AND, TCG_REG_TMP1, TCG_REG_TMP1, addr_reg);
-}
-
 /* Compare masked address with the TLB entry. */
 ldst->label_ptr[0] = s->code_ptr;
 tcg_out_opc_branch(s, OPC_BNE, TCG_REG_TMP0, TCG_REG_TMP1, 0);
 
 /* TLB Hit - translate address using addend.  */
+addr_adj = addr_reg;
 if (TARGET_LONG_BITS == 32) {
-tcg_out_ext32u(s, TCG_REG_TMP0, addr_reg);
-addr_reg = TCG_REG_TMP0;
+addr_adj = TCG_REG_TMP0;
+tcg_out_ext32u(s, addr_adj, addr_reg);
 }
-tcg_out_opc_reg(s, OPC_ADD, TCG_REG_TMP0, TCG_REG_TMP2, addr_reg);
+tcg_out_opc_reg(s, OPC_ADD, TCG_REG_TMP0, TCG_REG_TMP2, addr_adj);
 *pbase = TCG_REG_TMP0;
 #else
 if (a_mask) {
-- 
2.34.1




[PULL 48/80] tcg/i386: Support 128-bit load/store with have_atomic16

2023-05-16 Thread Richard Henderson
Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 tcg/i386/tcg-target.h |   3 +-
 tcg/i386/tcg-target.c.inc | 181 +-
 2 files changed, 180 insertions(+), 4 deletions(-)

diff --git a/tcg/i386/tcg-target.h b/tcg/i386/tcg-target.h
index 943af6775e..7f69997e30 100644
--- a/tcg/i386/tcg-target.h
+++ b/tcg/i386/tcg-target.h
@@ -194,7 +194,8 @@ extern bool have_atomic16;
 #define TCG_TARGET_HAS_qemu_st8_i32 1
 #endif
 
-#define TCG_TARGET_HAS_qemu_ldst_i128   0
+#define TCG_TARGET_HAS_qemu_ldst_i128 \
+(TCG_TARGET_REG_BITS == 64 && have_atomic16)
 
 /* We do not support older SSE systems, only beginning with AVX1.  */
 #define TCG_TARGET_HAS_v64  have_avx1
diff --git a/tcg/i386/tcg-target.c.inc b/tcg/i386/tcg-target.c.inc
index 0415ca2a4c..b66769952e 100644
--- a/tcg/i386/tcg-target.c.inc
+++ b/tcg/i386/tcg-target.c.inc
@@ -91,6 +91,8 @@ static const int tcg_target_reg_alloc_order[] = {
 #endif
 };
 
+#define TCG_TMP_VEC  TCG_REG_XMM5
+
 static const int tcg_target_call_iarg_regs[] = {
 #if TCG_TARGET_REG_BITS == 64
 #if defined(_WIN64)
@@ -347,6 +349,8 @@ static bool tcg_target_const_match(int64_t val, TCGType 
type, int ct)
 #define OPC_PCMPGTW (0x65 | P_EXT | P_DATA16)
 #define OPC_PCMPGTD (0x66 | P_EXT | P_DATA16)
 #define OPC_PCMPGTQ (0x37 | P_EXT38 | P_DATA16)
+#define OPC_PEXTRD  (0x16 | P_EXT3A | P_DATA16)
+#define OPC_PINSRD  (0x22 | P_EXT3A | P_DATA16)
 #define OPC_PMAXSB  (0x3c | P_EXT38 | P_DATA16)
 #define OPC_PMAXSW  (0xee | P_EXT | P_DATA16)
 #define OPC_PMAXSD  (0x3d | P_EXT38 | P_DATA16)
@@ -1783,7 +1787,21 @@ typedef struct {
 
 bool tcg_target_has_memory_bswap(MemOp memop)
 {
-return have_movbe;
+TCGAtomAlign aa;
+
+if (!have_movbe) {
+return false;
+}
+if ((memop & MO_SIZE) <= MO_64) {
+return true;
+}
+
+/*
+ * Reject 16-byte memop with 16-byte atomicity, i.e. VMOVDQA,
+ * but do allow a pair of 64-bit operations, i.e. MOVBEQ.
+ */
+aa = atom_and_align_for_opc(tcg_ctx, memop, MO_ATOM_IFALIGN, true);
+return aa.atom <= MO_64;
 }
 
 /*
@@ -1811,6 +1829,30 @@ static const TCGLdstHelperParam ldst_helper_param = {
 static const TCGLdstHelperParam ldst_helper_param = { };
 #endif
 
+static void tcg_out_vec_to_pair(TCGContext *s, TCGType type,
+TCGReg l, TCGReg h, TCGReg v)
+{
+int rexw = type == TCG_TYPE_I32 ? 0 : P_REXW;
+
+/* vpmov{d,q} %v, %l */
+tcg_out_vex_modrm(s, OPC_MOVD_EyVy + rexw, v, 0, l);
+/* vpextr{d,q} $1, %v, %h */
+tcg_out_vex_modrm(s, OPC_PEXTRD + rexw, v, 0, h);
+tcg_out8(s, 1);
+}
+
+static void tcg_out_pair_to_vec(TCGContext *s, TCGType type,
+TCGReg v, TCGReg l, TCGReg h)
+{
+int rexw = type == TCG_TYPE_I32 ? 0 : P_REXW;
+
+/* vmov{d,q} %l, %v */
+tcg_out_vex_modrm(s, OPC_MOVD_VyEy + rexw, v, 0, l);
+/* vpinsr{d,q} $1, %h, %v, %v */
+tcg_out_vex_modrm(s, OPC_PINSRD + rexw, v, v, h);
+tcg_out8(s, 1);
+}
+
 /*
  * Generate code for the slow path for a load at the end of block
  */
@@ -1900,6 +1942,7 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, 
HostAddress *h,
 {
 TCGLabelQemuLdst *ldst = NULL;
 MemOp opc = get_memop(oi);
+MemOp s_bits = opc & MO_SIZE;
 unsigned a_mask;
 
 #ifdef CONFIG_SOFTMMU
@@ -1910,7 +1953,7 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, 
HostAddress *h,
 *h = x86_guest_base;
 #endif
 h->base = addrlo;
-h->aa = atom_and_align_for_opc(s, opc, MO_ATOM_IFALIGN, false);
+h->aa = atom_and_align_for_opc(s, opc, MO_ATOM_IFALIGN, s_bits == MO_128);
 a_mask = (1 << h->aa.align) - 1;
 
 #ifdef CONFIG_SOFTMMU
@@ -1920,7 +1963,6 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, 
HostAddress *h,
 TCGType tlbtype = TCG_TYPE_I32;
 int trexw = 0, hrexw = 0, tlbrexw = 0;
 unsigned mem_index = get_mmuidx(oi);
-unsigned s_bits = opc & MO_SIZE;
 unsigned s_mask = (1 << s_bits) - 1;
 target_ulong tlb_mask;
 
@@ -2115,6 +2157,69 @@ static void tcg_out_qemu_ld_direct(TCGContext *s, TCGReg 
datalo, TCGReg datahi,
  h.base, h.index, 0, h.ofs + 4);
 }
 break;
+
+case MO_128:
+{
+TCGLabel *l1 = NULL, *l2 = NULL;
+bool use_pair = h.aa.atom < MO_128;
+
+tcg_debug_assert(TCG_TARGET_REG_BITS == 64);
+
+if (!use_pair) {
+tcg_debug_assert(!use_movbe);
+/*
+ * Atomicity requires that we use VMOVDQA.
+ * If we've already checked for 16-byte alignment, that's all
+ * we need.  If we arrive here with lesser alignment, then we
+ * have determined that less than 16-byte alignment can be
+ * satisfied with two 8-byte loads.
+ */
+if (h.aa.align < MO_128) {
+

[PULL 78/80] tcg: Add page_bits and page_mask to TCGContext

2023-05-16 Thread Richard Henderson
Disconnect guest page size from TCG compilation.
While this could be done via exec/target_page.h, we want to cache
the value across multiple memory access operations, so we might
as well initialize this early.

The changes within tcg/ are entirely mechanical:

sed -i s/TARGET_PAGE_BITS/s->page_bits/g
sed -i s/TARGET_PAGE_MASK/s->page_mask/g

Reviewed-by: Anton Johansson 
Signed-off-by: Richard Henderson 
---
 include/tcg/tcg.h|  5 +
 accel/tcg/translate-all.c|  4 
 tcg/aarch64/tcg-target.c.inc |  6 +++---
 tcg/arm/tcg-target.c.inc | 10 +-
 tcg/i386/tcg-target.c.inc|  6 +++---
 tcg/loongarch64/tcg-target.c.inc |  4 ++--
 tcg/mips/tcg-target.c.inc|  6 +++---
 tcg/ppc/tcg-target.c.inc | 14 +++---
 tcg/riscv/tcg-target.c.inc   |  4 ++--
 tcg/s390x/tcg-target.c.inc   |  4 ++--
 tcg/sparc64/tcg-target.c.inc |  4 ++--
 11 files changed, 38 insertions(+), 29 deletions(-)

diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index b9748fd0c5..db57c4d492 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -560,6 +560,11 @@ struct TCGContext {
 int nb_ops;
 TCGType addr_type;/* TCG_TYPE_I32 or TCG_TYPE_I64 */
 
+#ifdef CONFIG_SOFTMMU
+int page_mask;
+uint8_t page_bits;
+#endif
+
 TCGRegSet reserved_regs;
 intptr_t current_frame_offset;
 intptr_t frame_start;
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index 99a9d0e34f..ca306f67da 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -357,6 +357,10 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
 tb_set_page_addr1(tb, -1);
 tcg_ctx->gen_tb = tb;
 tcg_ctx->addr_type = TCG_TYPE_TL;
+#ifdef CONFIG_SOFTMMU
+tcg_ctx->page_bits = TARGET_PAGE_BITS;
+tcg_ctx->page_mask = TARGET_PAGE_MASK;
+#endif
 
  tb_overflow:
 
diff --git a/tcg/aarch64/tcg-target.c.inc b/tcg/aarch64/tcg-target.c.inc
index 41838f8170..8b7c679349 100644
--- a/tcg/aarch64/tcg-target.c.inc
+++ b/tcg/aarch64/tcg-target.c.inc
@@ -1685,7 +1685,7 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, 
HostAddress *h,
 ldst->oi = oi;
 ldst->addrlo_reg = addr_reg;
 
-mask_type = (TARGET_PAGE_BITS + CPU_TLB_DYN_MAX_BITS > 32
+mask_type = (s->page_bits + CPU_TLB_DYN_MAX_BITS > 32
  ? TCG_TYPE_I64 : TCG_TYPE_I32);
 
 /* Load env_tlb(env)->f[mmu_idx].{mask,table} into {x0,x1}.  */
@@ -1699,7 +1699,7 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, 
HostAddress *h,
 /* Extract the TLB index from the address into X0.  */
 tcg_out_insn(s, 3502S, AND_LSR, mask_type == TCG_TYPE_I64,
  TCG_REG_X0, TCG_REG_X0, addr_reg,
- TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS);
+ s->page_bits - CPU_TLB_ENTRY_BITS);
 
 /* Add the tlb_table pointer, creating the CPUTLBEntry address into X1.  */
 tcg_out_insn(s, 3502, ADD, 1, TCG_REG_X1, TCG_REG_X1, TCG_REG_X0);
@@ -1723,7 +1723,7 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, 
HostAddress *h,
  TCG_REG_X3, addr_reg, s_mask - a_mask);
 x3 = TCG_REG_X3;
 }
-compare_mask = (uint64_t)TARGET_PAGE_MASK | a_mask;
+compare_mask = (uint64_t)s->page_mask | a_mask;
 
 /* Store the page mask part of the address into X3.  */
 tcg_out_logicali(s, I3404_ANDI, addr_type, TCG_REG_X3, x3, compare_mask);
diff --git a/tcg/arm/tcg-target.c.inc b/tcg/arm/tcg-target.c.inc
index 3c38e868e2..20cc1cc477 100644
--- a/tcg/arm/tcg-target.c.inc
+++ b/tcg/arm/tcg-target.c.inc
@@ -1424,7 +1424,7 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, 
HostAddress *h,
 
 /* Extract the tlb index from the address into R0.  */
 tcg_out_dat_reg(s, COND_AL, ARITH_AND, TCG_REG_R0, TCG_REG_R0, addrlo,
-SHIFT_IMM_LSR(TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS));
+SHIFT_IMM_LSR(s->page_bits - CPU_TLB_ENTRY_BITS));
 
 /*
  * Add the tlb_table pointer, creating the CPUTLBEntry address in R1.
@@ -1468,8 +1468,8 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext *s, 
HostAddress *h,
 tcg_out_dat_imm(s, COND_AL, ARITH_ADD, t_addr,
 addrlo, s_mask - a_mask);
 }
-if (use_armv7_instructions && TARGET_PAGE_BITS <= 16) {
-tcg_out_movi32(s, COND_AL, TCG_REG_TMP, ~(TARGET_PAGE_MASK | a_mask));
+if (use_armv7_instructions && s->page_bits <= 16) {
+tcg_out_movi32(s, COND_AL, TCG_REG_TMP, ~(s->page_mask | a_mask));
 tcg_out_dat_reg(s, COND_AL, ARITH_BIC, TCG_REG_TMP,
 t_addr, TCG_REG_TMP, 0);
 tcg_out_dat_reg(s, COND_AL, ARITH_CMP, 0, TCG_REG_R2, TCG_REG_TMP, 0);
@@ -1479,10 +1479,10 @@ static TCGLabelQemuLdst *prepare_host_addr(TCGContext 
*s, HostAddress *h,
 tcg_out_dat_imm(s, COND_AL, ARITH_TST, 0, addrlo, a_mask);
 }
 tcg_out_dat_reg(s, COND_AL, ARITH_MOV, TCG_REG_TMP, 0, 

[PULL 07/80] tcg/tci: Use helper_{ld,st}*_mmu for user-only

2023-05-16 Thread Richard Henderson
We can now fold these two pieces of code.

Reviewed-by: Peter Maydell 
Signed-off-by: Richard Henderson 
---
 tcg/tci.c | 89 ---
 1 file changed, 89 deletions(-)

diff --git a/tcg/tci.c b/tcg/tci.c
index 5bde2e1f2e..15f2f8c463 100644
--- a/tcg/tci.c
+++ b/tcg/tci.c
@@ -292,7 +292,6 @@ static uint64_t tci_qemu_ld(CPUArchState *env, target_ulong 
taddr,
 MemOp mop = get_memop(oi);
 uintptr_t ra = (uintptr_t)tb_ptr;
 
-#ifdef CONFIG_SOFTMMU
 switch (mop & MO_SSIZE) {
 case MO_UB:
 return helper_ldub_mmu(env, taddr, oi, ra);
@@ -311,58 +310,6 @@ static uint64_t tci_qemu_ld(CPUArchState *env, 
target_ulong taddr,
 default:
 g_assert_not_reached();
 }
-#else
-void *haddr = g2h(env_cpu(env), taddr);
-unsigned a_mask = (1u << get_alignment_bits(mop)) - 1;
-uint64_t ret;
-
-set_helper_retaddr(ra);
-if (taddr & a_mask) {
-helper_unaligned_ld(env, taddr);
-}
-switch (mop & (MO_BSWAP | MO_SSIZE)) {
-case MO_UB:
-ret = ldub_p(haddr);
-break;
-case MO_SB:
-ret = ldsb_p(haddr);
-break;
-case MO_LEUW:
-ret = lduw_le_p(haddr);
-break;
-case MO_LESW:
-ret = ldsw_le_p(haddr);
-break;
-case MO_LEUL:
-ret = (uint32_t)ldl_le_p(haddr);
-break;
-case MO_LESL:
-ret = (int32_t)ldl_le_p(haddr);
-break;
-case MO_LEUQ:
-ret = ldq_le_p(haddr);
-break;
-case MO_BEUW:
-ret = lduw_be_p(haddr);
-break;
-case MO_BESW:
-ret = ldsw_be_p(haddr);
-break;
-case MO_BEUL:
-ret = (uint32_t)ldl_be_p(haddr);
-break;
-case MO_BESL:
-ret = (int32_t)ldl_be_p(haddr);
-break;
-case MO_BEUQ:
-ret = ldq_be_p(haddr);
-break;
-default:
-g_assert_not_reached();
-}
-clear_helper_retaddr();
-return ret;
-#endif
 }
 
 static void tci_qemu_st(CPUArchState *env, target_ulong taddr, uint64_t val,
@@ -371,7 +318,6 @@ static void tci_qemu_st(CPUArchState *env, target_ulong 
taddr, uint64_t val,
 MemOp mop = get_memop(oi);
 uintptr_t ra = (uintptr_t)tb_ptr;
 
-#ifdef CONFIG_SOFTMMU
 switch (mop & MO_SIZE) {
 case MO_UB:
 helper_stb_mmu(env, taddr, val, oi, ra);
@@ -388,41 +334,6 @@ static void tci_qemu_st(CPUArchState *env, target_ulong 
taddr, uint64_t val,
 default:
 g_assert_not_reached();
 }
-#else
-void *haddr = g2h(env_cpu(env), taddr);
-unsigned a_mask = (1u << get_alignment_bits(mop)) - 1;
-
-set_helper_retaddr(ra);
-if (taddr & a_mask) {
-helper_unaligned_st(env, taddr);
-}
-switch (mop & (MO_BSWAP | MO_SIZE)) {
-case MO_UB:
-stb_p(haddr, val);
-break;
-case MO_LEUW:
-stw_le_p(haddr, val);
-break;
-case MO_LEUL:
-stl_le_p(haddr, val);
-break;
-case MO_LEUQ:
-stq_le_p(haddr, val);
-break;
-case MO_BEUW:
-stw_be_p(haddr, val);
-break;
-case MO_BEUL:
-stl_be_p(haddr, val);
-break;
-case MO_BEUQ:
-stq_be_p(haddr, val);
-break;
-default:
-g_assert_not_reached();
-}
-clear_helper_retaddr();
-#endif
 }
 
 #if TCG_TARGET_REG_BITS == 64
-- 
2.34.1




[PULL 67/80] tcg/tci: Eliminate TARGET_LONG_BITS, target_ulong

2023-05-16 Thread Richard Henderson
We now have the address size as part of the opcode, so
we no longer need to test TARGET_LONG_BITS.  We can use
uint64_t for target_ulong, as passed into load/store helpers.
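
Illustratively (hypothetical helper, not in the patch): with the address
width carried by the opcode, the interpreter can widen a 32-bit guest
address explicitly instead of relying on a compile-time TARGET_LONG_BITS
test, and every load/store helper then simply takes a uint64_t.

    #include <stdint.h>

    /* Sketch: fetch a guest address from the interpreter's register file,
     * zero-extending when the opcode says the guest address is 32-bit. */
    static uint64_t tci_fetch_addr(const uint64_t *regs, unsigned r,
                                   int addr_is_64bit)
    {
        return addr_is_64bit ? regs[r] : (uint32_t)regs[r];
    }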

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 tcg/tci.c| 61 +---
 tcg/tci/tcg-target.c.inc | 15 +-
 2 files changed, 46 insertions(+), 30 deletions(-)

diff --git a/tcg/tci.c b/tcg/tci.c
index 742c791726..bab4397bc5 100644
--- a/tcg/tci.c
+++ b/tcg/tci.c
@@ -286,7 +286,7 @@ static bool tci_compare64(uint64_t u0, uint64_t u1, TCGCond 
condition)
 return result;
 }
 
-static uint64_t tci_qemu_ld(CPUArchState *env, target_ulong taddr,
+static uint64_t tci_qemu_ld(CPUArchState *env, uint64_t taddr,
 MemOpIdx oi, const void *tb_ptr)
 {
 MemOp mop = get_memop(oi);
@@ -312,7 +312,7 @@ static uint64_t tci_qemu_ld(CPUArchState *env, target_ulong 
taddr,
 }
 }
 
-static void tci_qemu_st(CPUArchState *env, target_ulong taddr, uint64_t val,
+static void tci_qemu_st(CPUArchState *env, uint64_t taddr, uint64_t val,
 MemOpIdx oi, const void *tb_ptr)
 {
 MemOp mop = get_memop(oi);
@@ -372,10 +372,9 @@ uintptr_t QEMU_DISABLE_CFI tcg_qemu_tb_exec(CPUArchState 
*env,
 TCGReg r0, r1, r2, r3, r4, r5;
 tcg_target_ulong t1;
 TCGCond condition;
-target_ulong taddr;
 uint8_t pos, len;
 uint32_t tmp32;
-uint64_t tmp64;
+uint64_t tmp64, taddr;
 uint64_t T1, T2;
 MemOpIdx oi;
 int32_t ofs;
@@ -923,31 +922,40 @@ uintptr_t QEMU_DISABLE_CFI tcg_qemu_tb_exec(CPUArchState 
*env,
 break;
 
 case INDEX_op_qemu_ld_a32_i32:
+tci_args_rrm(insn, , , );
+taddr = (uint32_t)regs[r1];
+goto do_ld_i32;
 case INDEX_op_qemu_ld_a64_i32:
-if (TARGET_LONG_BITS <= TCG_TARGET_REG_BITS) {
+if (TCG_TARGET_REG_BITS == 64) {
 tci_args_rrm(insn, , , );
 taddr = regs[r1];
 } else {
 tci_args_rrrm(insn, , , , );
 taddr = tci_uint64(regs[r2], regs[r1]);
 }
-tmp32 = tci_qemu_ld(env, taddr, oi, tb_ptr);
-regs[r0] = tmp32;
+do_ld_i32:
+regs[r0] = tci_qemu_ld(env, taddr, oi, tb_ptr);
 break;
 
 case INDEX_op_qemu_ld_a32_i64:
+if (TCG_TARGET_REG_BITS == 64) {
+tci_args_rrm(insn, , , );
+taddr = (uint32_t)regs[r1];
+} else {
+tci_args_rrrm(insn, , , , );
+taddr = (uint32_t)regs[r2];
+}
+goto do_ld_i64;
 case INDEX_op_qemu_ld_a64_i64:
 if (TCG_TARGET_REG_BITS == 64) {
 tci_args_rrm(insn, , , );
 taddr = regs[r1];
-} else if (TARGET_LONG_BITS <= TCG_TARGET_REG_BITS) {
-tci_args_rrrm(insn, , , , );
-taddr = regs[r2];
 } else {
 tci_args_r(insn, , , , , );
 taddr = tci_uint64(regs[r3], regs[r2]);
 oi = regs[r4];
 }
+do_ld_i64:
 tmp64 = tci_qemu_ld(env, taddr, oi, tb_ptr);
 if (TCG_TARGET_REG_BITS == 32) {
 tci_write_reg64(regs, r1, r0, tmp64);
@@ -957,35 +965,44 @@ uintptr_t QEMU_DISABLE_CFI tcg_qemu_tb_exec(CPUArchState 
*env,
 break;
 
 case INDEX_op_qemu_st_a32_i32:
+tci_args_rrm(insn, , , );
+taddr = (uint32_t)regs[r1];
+goto do_st_i32;
 case INDEX_op_qemu_st_a64_i32:
-if (TARGET_LONG_BITS <= TCG_TARGET_REG_BITS) {
+if (TCG_TARGET_REG_BITS == 64) {
 tci_args_rrm(insn, , , );
 taddr = regs[r1];
 } else {
 tci_args_rrrm(insn, , , , );
 taddr = tci_uint64(regs[r2], regs[r1]);
 }
-tmp32 = regs[r0];
-tci_qemu_st(env, taddr, tmp32, oi, tb_ptr);
+do_st_i32:
+tci_qemu_st(env, taddr, regs[r0], oi, tb_ptr);
 break;
 
 case INDEX_op_qemu_st_a32_i64:
+if (TCG_TARGET_REG_BITS == 64) {
+tci_args_rrm(insn, , , );
+tmp64 = regs[r0];
+taddr = (uint32_t)regs[r1];
+} else {
+tci_args_rrrm(insn, , , , );
+tmp64 = tci_uint64(regs[r1], regs[r0]);
+taddr = (uint32_t)regs[r2];
+}
+goto do_st_i64;
 case INDEX_op_qemu_st_a64_i64:
 if (TCG_TARGET_REG_BITS == 64) {
 tci_args_rrm(insn, , , );
-taddr = regs[r1];
 tmp64 = regs[r0];
+taddr = regs[r1];
 } else {
-if (TARGET_LONG_BITS <= TCG_TARGET_REG_BITS) {
-tci_args_rrrm(insn, , , 
