Re: [PATCH] hw/virtio/vhost-user: support obtain vdpa device's mac address automatically

2022-09-20 Thread Raphael Norwitz
I have some concerns from the vhost-user-blk side.



>On Tue, Sep 13, 2022 at 5:13 PM Hao Chen  wrote:
>>
>> When using dpdk-vdpa to test a vdpa device, you need to specify the mac
>> address when starting the virtual machine through libvirt or qemu. Now,
>> libvirt or qemu can call the dpdk vdpa vendor driver's ops .get_config
>> through vhost_net_get_config to get the mac address of the vdpa hardware
>> without manual configuration.
>>
>> Signed-off-by: Hao Chen 

>
>Adding Cindy for comments.
>
>Thanks
>
>> ---
>>  hw/block/vhost-user-blk.c |  1 -
>>  hw/net/virtio-net.c   |  3 ++-
>>  hw/virtio/vhost-user.c| 19 ---
>>  3 files changed, 2 insertions(+), 21 deletions(-)
>>

>> diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
>> index 9117222456..5dca4eab09 100644
>> --- a/hw/block/vhost-user-blk.c
>> +++ b/hw/block/vhost-user-blk.c
>> @@ -337,7 +337,6 @@ static int vhost_user_blk_connect(DeviceState *dev, Error **errp)
>>
>>  vhost_dev_set_config_notifier(&s->dev, &blk_ops);
>>
>> -s->vhost_user.supports_config = true;

vhost-user-blk requires the backend to support VHOST_USER_PROTOCOL_F_CONFIG,
and vhost_user.supports_config is used to enforce that.

Why are you removing it here?

>>  ret = vhost_dev_init(&s->dev, &s->vhost_user, VHOST_BACKEND_TYPE_USER, 0,
>>   errp);
>>  if (ret < 0) {
>> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
>> index dd0d056fde..274ea84644 100644
>> --- a/hw/net/virtio-net.c
>> +++ b/hw/net/virtio-net.c
>> @@ -149,7 +149,8 @@ static void virtio_net_get_config(VirtIODevice *vdev, uint8_t *config)
>>   * Is this VDPA? No peer means not VDPA: there's no way to
>>   * disconnect/reconnect a VDPA peer.
>>   */
>> -if (nc->peer && nc->peer->info->type == NET_CLIENT_DRIVER_VHOST_VDPA) {
>> +if ((nc->peer && nc->peer->info->type == NET_CLIENT_DRIVER_VHOST_VDPA) ||
>> +(nc->peer && nc->peer->info->type == NET_CLIENT_DRIVER_VHOST_USER)) {
>>  ret = vhost_net_get_config(get_vhost_net(nc->peer), (uint8_t *)&netcfg,
>> n->config_size);
>>  if (ret != -1) {

>> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
>> index bd24741be8..8b01078249 100644
>> --- a/hw/virtio/vhost-user.c
>> +++ b/hw/virtio/vhost-user.c
>> @@ -2013,8 +2013,6 @@ static int vhost_user_backend_init(struct vhost_dev *dev, void *opaque,
>>  }
>>

Why are you removing this? Can you expand on how it helps dpdk-vdpa.

>>  if (virtio_has_feature(features, VHOST_USER_F_PROTOCOL_FEATURES)) {
>> -bool supports_f_config = vus->supports_config ||
>> -(dev->config_ops && dev->config_ops->vhost_dev_config_notifier);
>>  uint64_t protocol_features;
>>
>>  dev->backend_features |= 1ULL << VHOST_USER_F_PROTOCOL_FEATURES;
>>
>> @@ -2033,23 +2031,6 @@ static int vhost_user_backend_init(struct vhost_dev *dev, void *opaque,
>>   */
>>  protocol_features &= VHOST_USER_PROTOCOL_FEATURE_MASK;
>>
>> -if (supports_f_config) {
>> -if (!virtio_has_feature(protocol_features,
>> -VHOST_USER_PROTOCOL_F_CONFIG)) {
>> -error_setg(errp, "vhost-user device expecting "
>> -   "VHOST_USER_PROTOCOL_F_CONFIG but the vhost-user backend does "
>> -   "not support it.");
>> -return -EPROTO;
>> -}
>> -} else {
>> -if (virtio_has_feature(protocol_features,
>> -   VHOST_USER_PROTOCOL_F_CONFIG)) {
>> -warn_reportf_err(*errp, "vhost-user backend supports "
>> - "VHOST_USER_PROTOCOL_F_CONFIG but QEMU does not.");
>> -protocol_features &= ~(1ULL << VHOST_USER_PROTOCOL_F_CONFIG);
>> -}
>> -}
>> -
>>  /* final set of protocol features */
>>  dev->protocol_features = protocol_features;
>>  err = vhost_user_set_protocol_features(dev, dev->protocol_features);
>> --
>> 2.27.0
>>
>



Re: [PATCH 1/9] hw/riscv/sifive_e: Fix inheritance of SiFiveEState

2022-09-20 Thread Markus Armbruster
Bernhard Beschow  writes:

> On 20 September 2022 11:36:47 UTC, Markus Armbruster wrote:
>>Alistair Francis  writes:
>>
>>> On Tue, Sep 20, 2022 at 9:18 AM Bernhard Beschow  wrote:

 SiFiveEState inherits from SysBusDevice while its TypeInfo claims it
 inherits from TYPE_MACHINE. This is an inconsistency which can cause
 undefined behavior such as memory corruption.

 Change SiFiveEState to inherit from MachineState since it is registered
 as a machine.

 Signed-off-by: Bernhard Beschow 
>>>
>>> Reviewed-by: Alistair Francis 
>>
>>To the SiFive maintainers: since this is a bug fix, let's merge it right
>>away.
>
> I could repost this particular patch with the three new tags (incl. Fixes) if 
> desired.

Can't hurt, and could help the maintainers.




Re: [PATCH] ratelimit: restrict the delay time to a non-negative value

2022-09-20 Thread Markus Armbruster
Wang Liang  writes:

> On Tue, 2022-09-20 at 13:18 +, Alberto Garcia wrote:
>> On Tue 20 Sep 2022 08:33:50 PM +08, wanglian...@126.com wrote:
>> > From: Wang Liang 
>> > 
>> > The delay time should never be a negative value.
>> > 
>> > -return limit->slice_end_time - now;
>> > +return MAX(limit->slice_end_time - now, 0);
>> 
>> How can this be negative? slice_end_time is guaranteed to be larger
>> than
>> now:
>> 
>> if (limit->slice_end_time < now) {
>> /* Previous, possibly extended, time slice finished; reset
>> the
>>  * accounting. */
>> limit->slice_start_time = now;
>> limit->slice_end_time = now + limit->slice_ns;
>> limit->dispatched = 0;
>> }
>> 
> This is just a guarantee. 

Smells like an invariant to me.

> If slice_end_time is assigned later by
> limit->slice_end_time = limit->slice_start_time +
> (uint64_t)(delay_slices * limit->slice_ns);
> There may be precision issues at that time.

What are the issues exactly?  What misbehavior are you observing?

Your commit message should show how delay time can become negative, and
why that's bad.
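For reference, the clamp being proposed can be sketched as a standalone function (the field names mirror RateLimit, but this is an illustration, not the QEMU code):

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of the proposed clamp: never return a negative delay, even if the
 * slice_end_time >= now invariant were ever violated. Standalone
 * illustration; not the actual QEMU ratelimit code. */
int64_t ratelimit_delay_clamped(int64_t slice_end_time, int64_t now)
{
    int64_t delay = slice_end_time - now;
    return delay > 0 ? delay : 0;  /* MAX(limit->slice_end_time - now, 0) */
}
```

Whether the invariant can actually be violated, e.g. through the double-to-uint64_t conversion Wang Liang points at, is exactly the open question here; the clamp only makes the failure mode benign.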




Re: [PATCH v9 3/7] block: add block layer APIs resembling Linux ZonedBlockDevice ioctls

2022-09-20 Thread Damien Le Moal

On 9/20/22 17:51, Klaus Jensen wrote:

On Sep 10 13:27, Sam Li wrote:

Add a new zoned_host_device BlockDriver. The zoned_host_device option
accepts only zoned host block devices. By adding zone management
operations in this new BlockDriver, users can use the new block
layer APIs including Report Zone and four zone management operations
(open, close, finish, reset).

Qemu-io uses the new APIs to perform zoned storage commands of the device:
zone_report(zrp), zone_open(zo), zone_close(zc), zone_reset(zrs),
zone_finish(zf).

For example, to test zone_report, use following command:
$ ./build/qemu-io --image-opts -n driver=zoned_host_device,filename=/dev/nullb0
-c "zrp offset nr_zones"

Signed-off-by: Sam Li 
Reviewed-by: Hannes Reinecke 
---
  block/block-backend.c | 145 ++
  block/file-posix.c| 323 +-
  block/io.c|  41 
  include/block/block-io.h  |   7 +
  include/block/block_int-common.h  |  21 ++
  include/block/raw-aio.h   |   6 +-
  include/sysemu/block-backend-io.h |  17 ++
  meson.build   |   1 +
  qapi/block-core.json  |   8 +-
  qemu-io-cmds.c| 143 +
  10 files changed, 708 insertions(+), 4 deletions(-)

+/*
+ * zone management operations - Execute an operation on a zone
+ */
+static int coroutine_fn raw_co_zone_mgmt(BlockDriverState *bs, BlockZoneOp op,
+int64_t offset, int64_t len) {
+#if defined(CONFIG_BLKZONED)
+BDRVRawState *s = bs->opaque;
+RawPosixAIOData acb;
+int64_t zone_sector, zone_sector_mask;
+const char *zone_op_name;
+unsigned long zone_op;
+bool is_all = false;
+
+zone_sector = bs->bl.zone_sectors;
+zone_sector_mask = zone_sector - 1;
+if (offset & zone_sector_mask) {
+error_report("sector offset %" PRId64 " is not aligned to zone size "
+ "%" PRId64 "", offset, zone_sector);
+return -EINVAL;
+}
+
+if (len & zone_sector_mask) {
+error_report("number of sectors %" PRId64 " is not aligned to zone size"
+  " %" PRId64 "", len, zone_sector);
+return -EINVAL;
+}


These checks impose a power-of-two constraint on the zone size. Can they
be changed to divisions to lift that constraint? I don't see anything in
this patch set that relies on power of two zone sizes.


Given that Linux will only expose zoned devices that have a zone size 
that is a power of 2 number of LBAs, this will work as is and avoid 
divisions in the IO path. But given that zone management operations are 
not performance critical, generalizing the code should be fine.


However, once we start adding the code for full zone emulation on top of 
a regular file or qcow image, sector-to-zone conversions requiring 
divisions will hurt. So I really would prefer the code be left as-is for 
now.
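The difference between the posted mask check and the division-based generalization under discussion can be sketched as follows (a standalone illustration, not the QEMU code):

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask-based alignment check from the patch: only valid when zone_sectors
 * is a power of two. */
bool zone_aligned_pow2(int64_t offset, int64_t zone_sectors)
{
    return (offset & (zone_sectors - 1)) == 0;
}

/* Division (modulo) based check: lifts the power-of-two constraint at the
 * cost of a division in the non-performance-critical zone-management path. */
bool zone_aligned_any(int64_t offset, int64_t zone_sectors)
{
    return offset % zone_sectors == 0;
}
```

For a non-power-of-two zone size such as 100 sectors, the mask check rejects a perfectly aligned offset of 300, while the modulo form accepts it; for power-of-two sizes the two agree.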



--
Damien Le Moal
Western Digital Research




Re: [PATCH v3 2/3] module: add Error arguments to module_load_one and module_load_qom_one

2022-09-20 Thread Markus Armbruster
Kevin Wolf  writes:

> Am 08.09.2022 um 19:36 hat Claudio Fontana geschrieben:
>> On 9/8/22 19:10, Claudio Fontana wrote:
>> > On 9/8/22 18:03, Richard Henderson wrote:
>> >> On 9/8/22 15:53, Claudio Fontana wrote:
>> >>> @@ -446,8 +447,13 @@ static int dmg_open(BlockDriverState *bs, QDict 
>> >>> *options, int flags,
>> >>>   return -EINVAL;
>> >>>   }
>> >>>   
>> >>> -block_module_load_one("dmg-bz2");
>> >>> -block_module_load_one("dmg-lzfse");
>> >>> +if (!block_module_load_one("dmg-bz2", _err) && local_err) {
>> >>> +error_report_err(local_err);
>> >>> +}
>> >>> +local_err = NULL;
>> >>> +if (!block_module_load_one("dmg-lzfse", _err) && local_err) {
>> >>> +error_report_err(local_err);
>> >>> +}
>> >>>   
>> >>>   s->n_chunks = 0;
>> >>>   s->offsets = s->lengths = s->sectors = s->sectorcounts = NULL;
>> >>
>> >> I wonder if these shouldn't fail hard if the modules don't exist?
>> >> Or at least pass back the error.
>> >>
>> >> Kevin?
>> 
>> is "dmg-bz" _required_ for dmg open to work? I suspect if the dmg
>> image is not compressed, "dmg" can function even if the extra dmg-bz
>> module is not loaded right?
>
> Indeed. The code seems to consider that the modules may not be present.
> The behaviour in these cases is questionable (it seems to silently leave
> the buffers as they are and return success), but the modules are clearly
> optional.
>
>> I'd suspect we should then do:
>> 
>> if (!block_module_load_one("dmg-bz2", _err)) {
>>   if (local_err) {
>>  error_report_err(local_err);
>>  return -EINVAL;
>>   }
>>   warn_report("dmg-bz2 is not present, dmg will skip bz2-compressed chunks");
>> }
>> 
>> and same for dmg-lzfse...?
>
> Actually, I think during initialisation, we should just pass NULL as
> errp and ignore any errors.
>
> When a request would access a block that can't be uncompressed because
> of the missing module, that's where we can have a warn_report_once() and
> arguably should fail the I/O request.

Seems like asking for data corruption.  To avoid it, the complete stack
needs to handle I/O errors correctly.

Can we detect presence of compressed blocks on open?
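The handling Kevin proposes, ignoring load failures at open time and failing only the request that needs the missing decompressor, can be sketched like this (the names are illustrative, not the real QEMU API):

```c
#include <errno.h>
#include <stdbool.h>

/* Standalone sketch: treat the decompression module as optional at open
 * time, and fail only the I/O request that hits a chunk needing it.
 * bz2_loaded stands in for whatever state the driver would keep. */
static bool bz2_loaded;

/* Open path: attempt the load, ignore errors (pass NULL as errp in QEMU). */
void dmg_open_try_load(bool load_succeeds)
{
    bz2_loaded = load_succeeds;
}

/* I/O path: warn once and fail the request if the chunk needs bz2. */
int dmg_read_chunk(bool chunk_is_bz2)
{
    if (chunk_is_bz2 && !bz2_loaded) {
        /* warn_report_once(...) would go here in QEMU */
        return -ENOTSUP;
    }
    return 0;  /* chunk readable */
}
```

Markus's concern applies to the error branch: returning -ENOTSUP is only safe if every caller up the stack propagates the failure instead of treating the buffer as valid data.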




Re: [PATCH] virtio-net: set the max of queue size to 4096

2022-09-20 Thread Michael S. Tsirkin
On Wed, Sep 21, 2022 at 10:59:50AM +0800, Jason Wang wrote:
> On Tue, Sep 20, 2022 at 8:59 PM Michael S. Tsirkin  wrote:
> >
> > On Tue, Sep 20, 2022 at 10:03:23AM +0800, Jason Wang wrote:
> > > On Tue, Sep 20, 2022 at 9:38 AM Jason Wang  wrote:
> > > >
> > > > On Tue, Sep 20, 2022 at 9:10 AM liuhaiwei  wrote:
> > > > >
> > > > > From: liuhaiwei 
> > > > >
> > > > > the limit of maximum of rx_queue_size and tx_queue to 1024 is so 
> > > > > small as to affect our network performance when using the  virtio-net 
> > > > > and vhost ,
> > > > > we cannot set the maximum size beyond 1k.
> > > > > why not enlarge the maximum size (such as 4096) when using the vhost 
> > > > > backend?
> > > >
> > > > As Michael mentioned, there's a limitation in the kernel UIO_MAXIOV.
> > > > We need to find way to overcome that limit first.
> > >
> > > Btw, this probably means the skb needs to be built by vhost-net
> > > itself, instead of tuntap.
> > >
> > > Thanks
> >
> > That might help vhost-net but it won't help virtio-net.
> >
> > IMO the right fix is to add a separate limit on s/g size
> > to the spec. Block and scsi already have it, seems
> > reasonable to add it for others too.
> 
> I wonder if it would be simpler to tie the limit to the virtqueue size.

Simpler but wrong I think. As another example for RX we do not need s/g
at all with mergeable. At the same time deep queues are needed
to avoid underruns.



> Having individual limits seems to complicate the migration compatibility 
> anyhow.
> 
> Thanks

I don't see how it's materially different from queue size.
We need interfaces for that, that we lack now, userspace is
expected to guess a subset available on both ends.

> >
> >
> >
> > > >
> > > > Thanks
> > > >
> > > > >
> > > > > Signed-off-by: liuhaiwei 
> > > > > Signed-off-by: liuhaiwei 
> > > > > ---
> > > > >  hw/net/virtio-net.c| 47 
> > > > > +++---
> > > > >  hw/virtio/virtio.c |  8 +--
> > > > >  include/hw/virtio/virtio.h |  1 +
> > > > >  3 files changed, 41 insertions(+), 15 deletions(-)
> > > > >
> > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > > > index dd0d056fde..4b56484855 100644
> > > > > --- a/hw/net/virtio-net.c
> > > > > +++ b/hw/net/virtio-net.c
> > > > > @@ -52,12 +52,11 @@
> > > > >  #define MAX_VLAN(1 << 12)   /* Per 802.1Q definition */
> > > > >
> > > > >  /* previously fixed value */
> > > > > -#define VIRTIO_NET_RX_QUEUE_DEFAULT_SIZE 256
> > > > > -#define VIRTIO_NET_TX_QUEUE_DEFAULT_SIZE 256
> > > > > +#define VIRTIO_NET_VHOST_USER_DEFAULT_SIZE 2048
> > > > >
> > > > >  /* for now, only allow larger queue_pairs; with virtio-1, guest can 
> > > > > downsize */
> > > > > -#define VIRTIO_NET_RX_QUEUE_MIN_SIZE VIRTIO_NET_RX_QUEUE_DEFAULT_SIZE
> > > > > -#define VIRTIO_NET_TX_QUEUE_MIN_SIZE VIRTIO_NET_TX_QUEUE_DEFAULT_SIZE
> > > > > +#define VIRTIO_NET_RX_QUEUE_MIN_SIZE 256
> > > > > +#define VIRTIO_NET_TX_QUEUE_MIN_SIZE 256
> > > > >
> > > > >  #define VIRTIO_NET_IP4_ADDR_SIZE   8/* ipv4 saddr + daddr */
> > > > >
> > > > > @@ -594,6 +593,28 @@ static int peer_has_ufo(VirtIONet *n)
> > > > >  return n->has_ufo;
> > > > >  }
> > > > >
> > > > > +static void virtio_net_set_default_queue_size(VirtIONet *n)
> > > > > +{
> > > > > +NetClientState *peer = n->nic_conf.peers.ncs[0];
> > > > > +
> > > > > +/* Default value is 0 if not set */
> > > > > +if (n->net_conf.rx_queue_size == 0) {
> > > > > +if (peer && peer->info->type == 
> > > > > NET_CLIENT_DRIVER_VHOST_USER) {
> > > > > +n->net_conf.rx_queue_size = 
> > > > > VIRTIO_NET_VHOST_USER_DEFAULT_SIZE;
> > > > > +} else {
> > > > > +n->net_conf.rx_queue_size = VIRTIO_NET_VQ_MAX_SIZE;
> > > > > +}
> > > > > +}
> > > > > +
> > > > > +if (n->net_conf.tx_queue_size == 0) {
> > > > > +if (peer && peer->info->type == 
> > > > > NET_CLIENT_DRIVER_VHOST_USER) {
> > > > > +n->net_conf.tx_queue_size = 
> > > > > VIRTIO_NET_VHOST_USER_DEFAULT_SIZE;
> > > > > +} else {
> > > > > +n->net_conf.tx_queue_size = VIRTIO_NET_VQ_MAX_SIZE;
> > > > > +}
> > > > > +}
> > > > > +}
> > > > > +
> > > > >  static void virtio_net_set_mrg_rx_bufs(VirtIONet *n, int 
> > > > > mergeable_rx_bufs,
> > > > > int version_1, int 
> > > > > hash_report)
> > > > >  {
> > > > > @@ -633,7 +654,7 @@ static int virtio_net_max_tx_queue_size(VirtIONet 
> > > > > *n)
> > > > >   * size.
> > > > >   */
> > > > >  if (!peer) {
> > > > > -return VIRTIO_NET_TX_QUEUE_DEFAULT_SIZE;
> > > > > +return VIRTIO_NET_VQ_MAX_SIZE;
> > > > >  }
> > > > >
> > > > >  switch(peer->info->type) {
> > > > > @@ -641,7 +662,7 @@ static int virtio_net_max_tx_queue_size(VirtIONet 
> > > > > *n)
> > > > >  case NET_CLIENT_DRIVER_VHOST_VDPA:
> > > > >  return VIRTQUEUE_MAX_SIZE;
> > > > >  default:
> > > > > -

Re: [PATCH 2/3] acpi: arm/virt: build_gtdt: fix invalid 64-bit physical addresses

2022-09-20 Thread Ani Sinha



On Tue, 20 Sep 2022, Miguel Luis wrote:

> Per the ACPI 6.5 specification, in the GTDT Table Structure, the Counter
> Control Block Address and Counter Read Block Address fields of the GTDT
> table should be set to 0xFFFFFFFFFFFFFFFF if not provided, rather than 0x0.
>
> Fixes: 41041e57085 ("acpi: arm/virt: build_gtdt: use 
> acpi_table_begin()/acpi_table_end() instead of build_header()")
>
> Signed-off-by: Miguel Luis 

Reviewed-by: Ani Sinha 

> ---
>  hw/arm/virt-acpi-build.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
> index 9b3aee01bf..13c6e3e468 100644
> --- a/hw/arm/virt-acpi-build.c
> +++ b/hw/arm/virt-acpi-build.c
> @@ -592,8 +592,7 @@ build_gtdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
>  acpi_table_begin(&table, table_data);
>
>  /* CntControlBase Physical Address */
> -/* FIXME: invalid value, should be 0xffffffffffffffff if not impl. ? */
> -build_append_int_noprefix(table_data, 0, 8);
> +build_append_int_noprefix(table_data, 0xffffffffffffffff, 8);
>  build_append_int_noprefix(table_data, 0, 4); /* Reserved */
>  /*
>   * FIXME: clarify comment:
> @@ -618,7 +617,7 @@ build_gtdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
>  /* Non-Secure EL2 timer Flags */
>  build_append_int_noprefix(table_data, irqflags, 4);
>  /* CntReadBase Physical address */
> -build_append_int_noprefix(table_data, 0, 8);
> +build_append_int_noprefix(table_data, 0xffffffffffffffff, 8);
>  /* Platform Timer Count */
>  build_append_int_noprefix(table_data, 0, 4);
>  /* Platform Timer Offset */
> --
> 2.36.0
>
>



Re: [PATCH 3/3] tests/acpi: virt: update ACPI GTDT binaries

2022-09-20 Thread Ani Sinha



On Tue, 20 Sep 2022, Miguel Luis wrote:

> Step 6 & 7 of the bios-tables-test.c documented procedure.
>
> Differences between disassembled ASL files for GTDT:
>
> @@ -13,14 +13,14 @@
>  [000h    4]Signature : "GTDT"[Generic Timer Description Table]
>  [004h 0004   4] Table Length : 00000060
>  [008h 0008   1] Revision : 02
> -[009h 0009   1] Checksum : 8C
> +[009h 0009   1] Checksum : 9C
>  [00Ah 0010   6]   Oem ID : "BOCHS "
>  [010h 0016   8] Oem Table ID : "BXPC"
>  [018h 0024   4] Oem Revision : 0001
>  [01Ch 0028   4]  Asl Compiler ID : "BXPC"
>  [020h 0032   4]Asl Compiler Revision : 0001
>
> -[024h 0036   8]Counter Block Address : 0000000000000000
> +[024h 0036   8]Counter Block Address : FFFFFFFFFFFFFFFF
>  [02Ch 0044   4] Reserved : 
>
>  [030h 0048   4] Secure EL1 Interrupt : 001D
> @@ -46,16 +46,16 @@
>  Trigger Mode : 0
>  Polarity : 0
> Always On : 0
> -[050h 0080   8]   Counter Read Block Address : 0000000000000000
> +[050h 0080   8]   Counter Read Block Address : FFFFFFFFFFFFFFFF
>
>  [058h 0088   4] Platform Timer Count : 
>  [05Ch 0092   4]Platform Timer Offset : 
>
>  Raw Table Data: Length 96 (0x60)
>
> -0000: 47 54 44 54 60 00 00 00 02 8C 42 4F 43 48 53 20  // GTDT`.....BOCHS 
> +0000: 47 54 44 54 60 00 00 00 02 9C 42 4F 43 48 53 20  // GTDT`.....BOCHS 
>  0010: 42 58 50 43 20 20 20 20 01 00 00 00 42 58 50 43  // BXPC    ....BXPC
> -0020: 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  // ................
> +0020: 01 00 00 00 FF FF FF FF FF FF FF FF 00 00 00 00  // ................
>  0030: 1D 00 00 00 00 00 00 00 1E 00 00 00 04 00 00 00  // ................
>  0040: 1B 00 00 00 00 00 00 00 1A 00 00 00 00 00 00 00  // ................
> -0050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  // ................
> +0050: FF FF FF FF FF FF FF FF 00 00 00 00 00 00 00 00  // ................
>
> Signed-off-by: Miguel Luis 

Acked-by: Ani Sinha 

> ---
>  tests/data/acpi/virt/GTDT   | Bin 96 -> 96 bytes
>  tests/data/acpi/virt/GTDT.memhp | Bin 96 -> 96 bytes
>  tests/data/acpi/virt/GTDT.numamem   | Bin 96 -> 96 bytes
>  tests/qtest/bios-tables-test-allowed-diff.h |   3 ---
>  4 files changed, 3 deletions(-)
>
> diff --git a/tests/data/acpi/virt/GTDT b/tests/data/acpi/virt/GTDT
> index 
> 9408b71b59c0e0f2991c0053562280155b47bc0b..6f8cb9b8f30b55f4c93fe515982621e3db50feb2
>  100644
> GIT binary patch
> delta 45
> kcmYdD;BpUf2}xjJU|^avkxPo>KNL*VQ4xT#fs$YV0LH=;ng9R*
>
> delta 45
> jcmYdD;BpUf2}xjJU|{N*$R))AWPrg$9Tfo>8%6^Foy!E8
>
> diff --git a/tests/data/acpi/virt/GTDT.memhp b/tests/data/acpi/virt/GTDT.memhp
> index 
> 9408b71b59c0e0f2991c0053562280155b47bc0b..6f8cb9b8f30b55f4c93fe515982621e3db50feb2
>  100644
> GIT binary patch
> delta 45
> kcmYdD;BpUf2}xjJU|^avkxPo>KNL*VQ4xT#fs$YV0LH=;ng9R*
>
> delta 45
> jcmYdD;BpUf2}xjJU|{N*$R))AWPrg$9Tfo>8%6^Foy!E8
>
> diff --git a/tests/data/acpi/virt/GTDT.numamem 
> b/tests/data/acpi/virt/GTDT.numamem
> index 
> 9408b71b59c0e0f2991c0053562280155b47bc0b..6f8cb9b8f30b55f4c93fe515982621e3db50feb2
>  100644
> GIT binary patch
> delta 45
> kcmYdD;BpUf2}xjJU|^avkxPo>KNL*VQ4xT#fs$YV0LH=;ng9R*
>
> delta 45
> jcmYdD;BpUf2}xjJU|{N*$R))AWPrg$9Tfo>8%6^Foy!E8
>
> diff --git a/tests/qtest/bios-tables-test-allowed-diff.h 
> b/tests/qtest/bios-tables-test-allowed-diff.h
> index 957bd1b4f6..dfb8523c8b 100644
> --- a/tests/qtest/bios-tables-test-allowed-diff.h
> +++ b/tests/qtest/bios-tables-test-allowed-diff.h
> @@ -1,4 +1 @@
>  /* List of comma-separated changed AML files to ignore */
> -"tests/data/acpi/virt/GTDT",
> -"tests/data/acpi/virt/GTDT.memhp",
> -"tests/data/acpi/virt/GTDT.numamem",
> --
> 2.36.0
>
>
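The checksum change in the diff above (8C to 9C) follows from ACPI's rule that all table bytes sum to zero mod 256. A sketch, with a helper of our own rather than a QEMU function:

```c
#include <stddef.h>
#include <stdint.h>

/* ACPI tables checksum to zero: the Checksum byte is chosen so that all
 * table bytes sum to 0 mod 256. This helper shows how changing a run of
 * equal bytes moves the checksum: the 16 address bytes going from 0x00 to
 * 0xFF shift GTDT's checksum from 0x8C to 0x9C. */
uint8_t acpi_checksum_after_edit(uint8_t old_csum, size_t nbytes,
                                 uint8_t old_val, uint8_t new_val)
{
    /* Each changed byte adds (new - old) to the byte sum, so the checksum
     * must absorb the opposite amount, mod 256. */
    int delta = (new_val - old_val) * (int)nbytes;
    return (uint8_t)(old_csum - delta);
}
```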



Re: [PATCH 1/3] tests/acpi: virt: allow acpi GTDT changes

2022-09-20 Thread Ani Sinha



On Tue, 20 Sep 2022, Miguel Luis wrote:

> Step 3 from bios-tables-test.c documented procedure.
>
> Signed-off-by: Miguel Luis 

Acked-by: Ani Sinha 

> ---
>  tests/qtest/bios-tables-test-allowed-diff.h | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/tests/qtest/bios-tables-test-allowed-diff.h 
> b/tests/qtest/bios-tables-test-allowed-diff.h
> index dfb8523c8b..957bd1b4f6 100644
> --- a/tests/qtest/bios-tables-test-allowed-diff.h
> +++ b/tests/qtest/bios-tables-test-allowed-diff.h
> @@ -1 +1,4 @@
>  /* List of comma-separated changed AML files to ignore */
> +"tests/data/acpi/virt/GTDT",
> +"tests/data/acpi/virt/GTDT.memhp",
> +"tests/data/acpi/virt/GTDT.numamem",
> --
> 2.36.0
>
>



Re: [PULL v2 0/9] loongarch-to-apply queue

2022-09-20 Thread gaosong



On 2022/9/21 at 2:33 AM, Stefan Hajnoczi wrote:

Please remember to push your GPG key to the keyservers using gpg
send-keys YOUR_KEY_ID.

Thanks for the reminder.
I sent the keys to hkps://keys.openpgp.org yesterday, but forgot to
update the identity information.


Thanks.
Song Gao




Re: [PATCH v6 2/2] i386: Add notify VM exit support

2022-09-20 Thread Chenyi Qiang




On 9/20/2022 9:59 PM, Peter Xu wrote:

On Tue, Sep 20, 2022 at 01:55:20PM +0800, Chenyi Qiang wrote:

@@ -5213,6 +5213,7 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run
*run)
   break;
   case KVM_EXIT_NOTIFY:
   ret = 0;
+warn_report_once("KVM: notify window was exceeded in guest");


Is there more informative way to dump this?  If it's 99% that the guest was
doing something weird and needs attention, maybe worthwhile to point that
out directly to the admin?



Do you mean to use other method to dump the info? i.e. printing a message is
not so clear. Or the output message ("KVM: notify window was exceeded in
guest") is not obvious and we need other wording.


I meant something like:

   KVM received notify exit.  It means there can be possible misbehaves in
   the guest, please have a look.


Get your point. Then I can print this message behind as well.

Thanks.



Or something similar.  What I'm worried is the admin may not understand
what's "notify window" and that message got simply ignored.

Though I am not even sure whether that's accurate in the wordings.




   if (run->notify.flags & KVM_NOTIFY_CONTEXT_INVALID) {
   warn_report("KVM: invalid context due to notify vmexit");
   if (has_triple_fault_event) {


Adding a warning looks good to me, with that (or in any better form of
wording):


If no objection, I'll follow Xiaoyao's suggestion to form the wording like:


No objection here.  Thanks.





Re: [PATCH] virtio-net: set the max of queue size to 4096

2022-09-20 Thread Jason Wang
On Tue, Sep 20, 2022 at 8:59 PM Michael S. Tsirkin  wrote:
>
> On Tue, Sep 20, 2022 at 10:03:23AM +0800, Jason Wang wrote:
> > On Tue, Sep 20, 2022 at 9:38 AM Jason Wang  wrote:
> > >
> > > On Tue, Sep 20, 2022 at 9:10 AM liuhaiwei  wrote:
> > > >
> > > > From: liuhaiwei 
> > > >
> > > > the limit of maximum of rx_queue_size and tx_queue to 1024 is so small 
> > > > as to affect our network performance when using the  virtio-net and 
> > > > vhost ,
> > > > we cannot set the maximum size beyond 1k.
> > > > why not enlarge the maximum size (such as 4096) when using the vhost 
> > > > backend?
> > >
> > > As Michael mentioned, there's a limitation in the kernel UIO_MAXIOV.
> > > We need to find way to overcome that limit first.
> >
> > Btw, this probably means the skb needs to be built by vhost-net
> > itself, instead of tuntap.
> >
> > Thanks
>
> That might help vhost-net but it won't help virtio-net.
>
> IMO the right fix is to add a separate limit on s/g size
> to the spec. Block and scsi already have it, seems
> reasonable to add it for others too.

I wonder if it would be simpler to tie the limit to the virtqueue size.

Having individual limits seems to complicate the migration compatibility anyhow.

Thanks

>
>
>
> > >
> > > Thanks
> > >
> > > >
> > > > Signed-off-by: liuhaiwei 
> > > > Signed-off-by: liuhaiwei 
> > > > ---
> > > >  hw/net/virtio-net.c| 47 +++---
> > > >  hw/virtio/virtio.c |  8 +--
> > > >  include/hw/virtio/virtio.h |  1 +
> > > >  3 files changed, 41 insertions(+), 15 deletions(-)
> > > >
> > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > > index dd0d056fde..4b56484855 100644
> > > > --- a/hw/net/virtio-net.c
> > > > +++ b/hw/net/virtio-net.c
> > > > @@ -52,12 +52,11 @@
> > > >  #define MAX_VLAN(1 << 12)   /* Per 802.1Q definition */
> > > >
> > > >  /* previously fixed value */
> > > > -#define VIRTIO_NET_RX_QUEUE_DEFAULT_SIZE 256
> > > > -#define VIRTIO_NET_TX_QUEUE_DEFAULT_SIZE 256
> > > > +#define VIRTIO_NET_VHOST_USER_DEFAULT_SIZE 2048
> > > >
> > > >  /* for now, only allow larger queue_pairs; with virtio-1, guest can 
> > > > downsize */
> > > > -#define VIRTIO_NET_RX_QUEUE_MIN_SIZE VIRTIO_NET_RX_QUEUE_DEFAULT_SIZE
> > > > -#define VIRTIO_NET_TX_QUEUE_MIN_SIZE VIRTIO_NET_TX_QUEUE_DEFAULT_SIZE
> > > > +#define VIRTIO_NET_RX_QUEUE_MIN_SIZE 256
> > > > +#define VIRTIO_NET_TX_QUEUE_MIN_SIZE 256
> > > >
> > > >  #define VIRTIO_NET_IP4_ADDR_SIZE   8/* ipv4 saddr + daddr */
> > > >
> > > > @@ -594,6 +593,28 @@ static int peer_has_ufo(VirtIONet *n)
> > > >  return n->has_ufo;
> > > >  }
> > > >
> > > > +static void virtio_net_set_default_queue_size(VirtIONet *n)
> > > > +{
> > > > +NetClientState *peer = n->nic_conf.peers.ncs[0];
> > > > +
> > > > +/* Default value is 0 if not set */
> > > > +if (n->net_conf.rx_queue_size == 0) {
> > > > +if (peer && peer->info->type == NET_CLIENT_DRIVER_VHOST_USER) {
> > > > +n->net_conf.rx_queue_size = 
> > > > VIRTIO_NET_VHOST_USER_DEFAULT_SIZE;
> > > > +} else {
> > > > +n->net_conf.rx_queue_size = VIRTIO_NET_VQ_MAX_SIZE;
> > > > +}
> > > > +}
> > > > +
> > > > +if (n->net_conf.tx_queue_size == 0) {
> > > > +if (peer && peer->info->type == NET_CLIENT_DRIVER_VHOST_USER) {
> > > > +n->net_conf.tx_queue_size = 
> > > > VIRTIO_NET_VHOST_USER_DEFAULT_SIZE;
> > > > +} else {
> > > > +n->net_conf.tx_queue_size = VIRTIO_NET_VQ_MAX_SIZE;
> > > > +}
> > > > +}
> > > > +}
> > > > +
> > > >  static void virtio_net_set_mrg_rx_bufs(VirtIONet *n, int 
> > > > mergeable_rx_bufs,
> > > > int version_1, int hash_report)
> > > >  {
> > > > @@ -633,7 +654,7 @@ static int virtio_net_max_tx_queue_size(VirtIONet 
> > > > *n)
> > > >   * size.
> > > >   */
> > > >  if (!peer) {
> > > > -return VIRTIO_NET_TX_QUEUE_DEFAULT_SIZE;
> > > > +return VIRTIO_NET_VQ_MAX_SIZE;
> > > >  }
> > > >
> > > >  switch(peer->info->type) {
> > > > @@ -641,7 +662,7 @@ static int virtio_net_max_tx_queue_size(VirtIONet 
> > > > *n)
> > > >  case NET_CLIENT_DRIVER_VHOST_VDPA:
> > > >  return VIRTQUEUE_MAX_SIZE;
> > > >  default:
> > > > -return VIRTIO_NET_TX_QUEUE_DEFAULT_SIZE;
> > > > +return VIRTIO_NET_VQ_MAX_SIZE;
> > > >  };
> > > >  }
> > > >
> > > > @@ -3450,30 +3471,30 @@ static void 
> > > > virtio_net_device_realize(DeviceState *dev, Error **errp)
> > > >
> > > >  virtio_net_set_config_size(n, n->host_features);
> > > >  virtio_init(vdev, VIRTIO_ID_NET, n->config_size);
> > > > -
> > > > +virtio_net_set_default_queue_size(n);
> > > >  /*
> > > >   * We set a lower limit on RX queue size to what it always was.
> > > >   * Guests that want a smaller ring can always resize it without
> > > >   * help from us (using virtio 1 and up).
> > > >   

Re: [PATCH] vhost-user-blk: fix the resize crash

2022-09-20 Thread Raphael Norwitz
>If the os is not installed and doesn't have the virtio guest driver,
>the vhost dev isn't started, so the dev->vdev is NULL.
>
>Reproduce: mount a Win 2019 iso, go into the install ui, then resize
>the virtio-blk device, qemu crash.
>
>Signed-off-by: Li Feng fen...@smartx.com

Reviewed-by: Raphael Norwitz 

>---
> hw/block/vhost-user-blk.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
>diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
>index 9117222456..db30bb754f 100644
>--- a/hw/block/vhost-user-blk.c
>+++ b/hw/block/vhost-user-blk.c
>@@ -95,6 +95,10 @@ static int vhost_user_blk_handle_config_change(struct vhost_dev *dev)
> VHostUserBlk *s = VHOST_USER_BLK(dev->vdev);
> Error *local_err = NULL;
>
>+if (!dev->started) {
>+return 0;
>+}
>+
> ret = vhost_dev_get_config(dev, (uint8_t *)&blkcfg,
>sizeof(struct virtio_blk_config),
>&local_err);
>--
>2.37.3
>



Re: [PATCH] ratelimit: restrict the delay time to a non-negative value

2022-09-20 Thread Wang Liang
On Tue, 2022-09-20 at 13:18 +, Alberto Garcia wrote:
> On Tue 20 Sep 2022 08:33:50 PM +08, wanglian...@126.com wrote:
> > From: Wang Liang 
> > 
> > The delay time should never be a negative value.
> > 
> > -return limit->slice_end_time - now;
> > +return MAX(limit->slice_end_time - now, 0);
> 
> How can this be negative? slice_end_time is guaranteed to be larger
> than
> now:
> 
> if (limit->slice_end_time < now) {
> /* Previous, possibly extended, time slice finished; reset
> the
>  * accounting. */
> limit->slice_start_time = now;
> limit->slice_end_time = now + limit->slice_ns;
> limit->dispatched = 0;
> }
> 
This is just a guarantee. 

If slice_end_time is assigned later by
limit->slice_end_time = limit->slice_start_time +
(uint64_t)(delay_slices * limit->slice_ns);
There may be precision issues at that time.

> Berto




RE: [PATCH] hw/xen: set pci Atomic Ops requests for passthrough device

2022-09-20 Thread Ji, Ruili
[AMD Official Use Only - General]

Hi Paul and Anthony:

Thanks for your help.
When could we see this patch on the master branch? 
Our project urgently needs this solution.

Thanks!
Ruili

-Original Message-
From: Paul Durrant
Subject: RE: [PATCH] hw/xen: set pci Atomic Ops requests for passthrough device
On 14/09/2022 03:07, Ji, Ruili wrote:
[AMD Official Use Only - General]

Hi Paul,

Thank you!
But how could we merge this patch ?


AFAIK Anthony (anthony.per...@citrix.com) still deals with this.

Cheers,

  Paul

-Original Message-
From: Ji, Ruili
Sent: 14 September 2022 18:08
To: Paul Durrant ; qemu-devel@nongnu.org
Cc: Liu, Aaron ; xen-de...@lists.xenproject.org
Subject: RE: [PATCH] hw/xen: set pci Atomic Ops requests for passthrough device

Hi Paul,

Thank you!
But how could we merge this patch ?

Ruili
-Original Message-
From: Paul Durrant 
Sent: 14 September 2022 17:08
To: Ji, Ruili ; qemu-devel@nongnu.org
Cc: Liu, Aaron ; xen-de...@lists.xenproject.org
Subject: Re: [PATCH] hw/xen: set pci Atomic Ops requests for passthrough device

Caution: This message originated from an External Source. Use proper caution 
when opening attachments, clicking links, or responding.


On 13/09/2022 04:02, Ji, Ruili wrote:
> [AMD Official Use Only - General]
>
>
> Hi Paul,
>
> Could you help to review this patch?
>

LGTM. You can add my R-b to it.

   Paul

> Thanks
>
> *From:* Ji, Ruili
*Sent:* 7 September 2022 9:04
> *To:* 'Paul Durrant' ; 'qemu-devel@nongnu.org'
> 
> *Cc:* Liu, Aaron ; 'xen-de...@lists.xenproject.org'
> 
> *Subject:* RE: [PATCH] hw/xen: set pci Atomic Ops requests for
> passthrough device
>
> FYI
>
> *From:* Ji, Ruili
> *Sent:* 2022年9月6日 15:40
> *To:* qemu-devel@nongnu.org 
> *Cc:* Liu, Aaron mailto:aaron@amd.com>>
> *Subject:* [PATCH] hw/xen: set pci Atomic Ops requests for passthrough
> device
>
>  From c54e0714a1e1cac7dc416bd843b9ec7162bcfc47 Mon Sep 17 00:00:00
> 2001
>
> From: Ruili Ji ruili...@amd.com 
>
> Date: Tue, 6 Sep 2022 14:09:41 +0800
>
> Subject: [PATCH] hw/xen: set pci Atomic Ops requests for passthrough
> device
>
> Make the guest OS able to access the PCI Device Control 2 register of a
> passthrough device,
> as described by struct XenPTRegInfo in the file hw/xen/xen_pt.h.
>
> /* reg read only field mask (ON:RO/ROS, OFF:other) */
>
> uint32_t ro_mask;
>
> /* reg emulate field mask (ON:emu, OFF:passthrough) */
>
> uint32_t emu_mask;
>
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1196
>
> Signed-off-by: aaron@amd.com 
>
> Signed-off-by: ruili...@amd.com 
>
> ---
>
> hw/xen/xen_pt_config_init.c | 4 ++--
>
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/hw/xen/xen_pt_config_init.c b/hw/xen/xen_pt_config_init.c
>
> index c5c4e943a8..adc565a00a 100644
>
> --- a/hw/xen/xen_pt_config_init.c
>
> +++ b/hw/xen/xen_pt_config_init.c
>
> @@ -985,8 +985,8 @@ static XenPTRegInfo xen_pt_emu_reg_pcie[] = {
>
>   .offset = 0x28,
>
>   .size   = 2,
>
>   .init_val   = 0x,
>
> -.ro_mask= 0xFFE0,
>
> -.emu_mask   = 0x,
>
> +.ro_mask= 0xFFA0,
>
> +.emu_mask   = 0xFFBF,
>
>   .init   = xen_pt_devctrl2_reg_init,
>
>   .u.w.read   = xen_pt_word_reg_read,
>
>   .u.w.write  = xen_pt_word_reg_write,
>
> --
>
> 2.34.1
>



Re: [PATCH 49/51] io/channel-watch: Fix socket watch on Windows

2022-09-20 Thread Bin Meng
On Wed, Sep 14, 2022 at 4:08 PM Bin Meng  wrote:
>
> On Wed, Sep 7, 2022 at 1:07 PM Bin Meng  wrote:
> >
> > Hi Clément,
> >
> > On Tue, Sep 6, 2022 at 8:06 PM Clément Chigot  wrote:
> > >
> > > > > > I checked your patch, what you did seems to be something one would
> > > > > > naturally write, but what is currently in the QEMU sources seems to
> > > > > > be written intentionally.
> > > > > >
> > > > > > +Paolo Bonzini , you are the one who implemented the socket watch on
> > > > > > Windows. Could you please help analyze this issue?
> > > > > >
> > > > > > > to avoid WSAEnumNetworkEvents for the master GSource which only has
> > > > > > > G_IO_HUP (or for any GSource having only that).
> > > > > > > As I said above, the current code doesn't do anything with it anyway.
> > > > > > > So, IMO, it's safe to do so.
> > > > > > >
> > > > > > > I'll send you my patch attached. I was planning to send it in the
> > > > > > > following weeks anyway. I was just waiting to be sure everything
> > > > > > > looks fine on our CI. Feel free to test and modify it if needed.
> > > > > >
> > > > > > I tested your patch. Unfortunately there is still one test case
> > > > > > (migration-test.exe) throwing up the "Broken pipe" message.
> > > > >
> > > > > I must say I didn't fully test it against the qemu testsuite yet. Maybe
> > > > > there are some refinements to be done. "Broken pipe" might be linked to
> > > > > the missing G_IO_HUP support.
> > > > >
> > > > > > Can you test my patch instead to see if your gdb issue can be fixed?
> > > > >
> > > > > Yeah sure. I'll try to do it this afternoon.
> > >
> > > I can't explain how mad at myself I am... I'm pretty sure your patch was
> > > the first thing I tried when I encountered this issue. But it wasn't
> > > working, or IIRC the issue went away but that was because the polling was
> > > actually disabled (looping indefinitely)... I suspect that I had already
> > > changed the CreateEvent for WSACreateEvent, which forces you to handle
> > > the reset.
> > > Finally, I ended up struggling to rework the whole check function...
> > > But yeah, your patch does work fine on my gdb issues too.
> >
> > Good to know this patch works for you too.
> >
> > > And I guess the events are reset when recv() is being called because of
> > > the auto-reset feature set up by CreateEvent().
> > > IIUC, what Marc-André means by busy loop is the polling looping
> > > indefinitely, as I encountered. I can ensure that this patch doesn't do
> > > that. It can be easily checked by setting the env variable
> > > G_MAIN_POLL_DEBUG. It'll show what g_poll is doing, and it's normally
> > > always available on Windows.
> > >
> > > Anyway, we'll wait for Paolo to see if he remembers why he had to call
> > > WSAEnumNetworkEvents. Otherwise, let's go for your patch. Mine might
> > > be a good start to improve the whole polling on Windows but if it doesn't
> > > work in your case, it then needs some refinements.
> > >
> >
> > Yeah, this issue bugged me quite a lot. If we want to reset the event
> > in qio_channel_socket_source_check(), we will have to do the following
> > to make sure qtests are happy.
> >
> > diff --git a/io/channel-watch.c b/io/channel-watch.c
> > index 43d38494f7..f1e1650b81 100644
> > --- a/io/channel-watch.c
> > +++ b/io/channel-watch.c
> > @@ -124,8 +124,6 @@ qio_channel_socket_source_check(GSource *source)
> > return 0;
> > }
> > - WSAEnumNetworkEvents(ssource->socket, ssource->ioc->event, &ev);
> > -
> > FD_ZERO(&rfds);
> > FD_ZERO(&wfds);
> > FD_ZERO(&xfds);
> > @@ -153,6 +151,10 @@ qio_channel_socket_source_check(GSource *source)
> > ssource->revents |= G_IO_PRI;
> > }
> > + if (ssource->revents) {
> > + WSAEnumNetworkEvents(ssource->socket, ssource->ioc->event, &ev);
> > + }
> > +
> > return ssource->revents;
> > }
> >
> > Removing "if (ssource->revents)" won't work.
> >
> > It seems to me that resetting the event twice (one time with the
> > master Gsource, and the other time with the child GSource) causes some
> > bizarre behavior. But MSDN [1] says
> >
> > "Resetting an event that is already reset has no effect."
> >
> > [1] 
> > https://docs.microsoft.com/en-us/windows/win32/api/synchapi/nf-synchapi-resetevent
> >
>
> Paolo, any comments about this issue?
>

v2 series has been sent out, and this patch remains unchanged.

Paolo, still would appreciate your comments.

Regards,
Bin



[PATCH 10/14] migration: Make PageSearchStatus part of RAMState

2022-09-20 Thread Peter Xu
We used to allocate the PSS structure on the stack for precopy when sending
pages.  Make it static, so as to describe per-channel ram migration status.

Here we declare RAM_CHANNEL_MAX instances, preparing for postcopy to use
them, even though this patch has not yet started using the 2nd instance.

This should not have any functional change per se, but it already starts to
export PSS information via the RAMState, so that e.g. one PSS channel can
start to reference the other PSS channel.

Always protect PSS access using the same RAMState.bitmap_mutex.  We already
do so, so no code change is needed, just some comment updates.  Maybe we
should consider renaming bitmap_mutex some day, as it's going to be a more
commonly used and bigger mutex for ram states, but leave that for later.

Signed-off-by: Peter Xu 
---
 migration/ram.c | 112 ++--
 1 file changed, 61 insertions(+), 51 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index b4b36ca59e..dbe11e1ace 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -85,6 +85,46 @@
 
 XBZRLECacheStats xbzrle_counters;
 
+/* used by the search for pages to send */
+struct PageSearchStatus {
+/* The migration channel used for a specific host page */
+QEMUFile*pss_channel;
+/* Current block being searched */
+RAMBlock*block;
+/* Current page to search from */
+unsigned long page;
+/* Set once we wrap around */
+bool complete_round;
+/*
+ * [POSTCOPY-ONLY] Whether current page is explicitly requested by
+ * postcopy.  When set, the request is "urgent" because the dest QEMU
+ * threads are waiting for us.
+ */
+bool postcopy_requested;
+/*
+ * [POSTCOPY-ONLY] The target channel to use to send current page.
+ *
+ * Note: This may _not_ match with the value in postcopy_requested
+ * above. Let's imagine the case where the postcopy request is exactly
+ * the page that we're sending in progress during precopy. In this case
+ * we'll have postcopy_requested set to true but the target channel
+ * will be the precopy channel (so that we don't split brain on that
+ * specific page since the precopy channel already contains partial of
+ * that page data).
+ *
+ * Besides that specific use case, postcopy_target_channel should
+ * always be equal to postcopy_requested, because by default we send
+ * postcopy pages via postcopy preempt channel.
+ */
+bool postcopy_target_channel;
+/* Whether we're sending a host page */
+bool  host_page_sending;
+/* The start/end of current host page.  Invalid if host_page_sending==false */
+unsigned long host_page_start;
+unsigned long host_page_end;
+};
+typedef struct PageSearchStatus PageSearchStatus;
+
 /* struct contains XBZRLE cache and a static page
used by the compression */
 static struct {
@@ -319,6 +359,11 @@ typedef struct {
 struct RAMState {
 /* QEMUFile used for this migration */
 QEMUFile *f;
+/*
+ * PageSearchStatus structures for the channels when send pages.
+ * Protected by the bitmap_mutex.
+ */
+PageSearchStatus pss[RAM_CHANNEL_MAX];
 /* UFFD file descriptor, used in 'write-tracking' migration */
 int uffdio_fd;
 /* Last block that we have visited searching for dirty pages */
@@ -362,7 +407,12 @@ struct RAMState {
 uint64_t target_page_count;
 /* number of dirty bits in the bitmap */
 uint64_t migration_dirty_pages;
-/* Protects modification of the bitmap and migration dirty pages */
+/*
+ * Protects:
+ * - dirty/clear bitmap
+ * - migration_dirty_pages
+ * - pss structures
+ */
 QemuMutex bitmap_mutex;
 /* The RAMBlock used in the last src_page_requests */
 RAMBlock *last_req_rb;
@@ -444,46 +494,6 @@ void dirty_sync_missed_zero_copy(void)
 ram_counters.dirty_sync_missed_zero_copy++;
 }
 
-/* used by the search for pages to send */
-struct PageSearchStatus {
-/* The migration channel used for a specific host page */
-QEMUFile*pss_channel;
-/* Current block being searched */
-RAMBlock*block;
-/* Current page to search from */
-unsigned long page;
-/* Set once we wrap around */
-bool complete_round;
-/*
- * [POSTCOPY-ONLY] Whether current page is explicitly requested by
- * postcopy.  When set, the request is "urgent" because the dest QEMU
- * threads are waiting for us.
- */
-bool postcopy_requested;
-/*
- * [POSTCOPY-ONLY] The target channel to use to send current page.
- *
- * Note: This may _not_ match with the value in postcopy_requested
- * above. Let's imagine the case where the postcopy request is exactly
- * the page that we're sending in progress during precopy. In this case
- * we'll have postcopy_requested set to true but the target channel
- * will be the precopy channel (so that we 

[PATCH 05/14] migration: Yield bitmap_mutex properly when sending/sleeping

2022-09-20 Thread Peter Xu
Don't take the bitmap mutex when sending pages, or when being throttled by
migration_rate_limit() (which is a bit tricky to call here in ram code,
but it still seems helpful).

This prepares for the possibility of concurrently sending pages in >1
threads using ram_save_host_page(): all threads may need the bitmap_mutex
to operate on bitmaps, so neither sendmsg() nor any kind of qemu_sem_wait()
blocking one thread will block the others from progressing.

Signed-off-by: Peter Xu 
---
 migration/ram.c | 42 +++---
 1 file changed, 31 insertions(+), 11 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 8303252b6d..6e7de6087a 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2463,6 +2463,7 @@ static void postcopy_preempt_reset_channel(RAMState *rs)
  */
 static int ram_save_host_page(RAMState *rs, PageSearchStatus *pss)
 {
+bool page_dirty, release_lock = postcopy_preempt_active();
 int tmppages, pages = 0;
 size_t pagesize_bits =
 qemu_ram_pagesize(pss->block) >> TARGET_PAGE_BITS;
@@ -2486,22 +2487,41 @@ static int ram_save_host_page(RAMState *rs, PageSearchStatus *pss)
 break;
 }
 
+page_dirty = migration_bitmap_clear_dirty(rs, pss->block, pss->page);
+/*
+ * Properly yield the lock only in postcopy preempt mode because
+ * both migration thread and rp-return thread can operate on the
+ * bitmaps.
+ */
+if (release_lock) {
+qemu_mutex_unlock(&rs->bitmap_mutex);
+}
+
 /* Check the pages is dirty and if it is send it */
-if (migration_bitmap_clear_dirty(rs, pss->block, pss->page)) {
+if (page_dirty) {
 tmppages = ram_save_target_page(rs, pss);
-if (tmppages < 0) {
-return tmppages;
+if (tmppages >= 0) {
+pages += tmppages;
+/*
+ * Allow rate limiting to happen in the middle of huge pages if
+ * something is sent in the current iteration.
+ */
+if (pagesize_bits > 1 && tmppages > 0) {
+migration_rate_limit();
+}
 }
+} else {
+tmppages = 0;
+}
 
-pages += tmppages;
-/*
- * Allow rate limiting to happen in the middle of huge pages if
- * something is sent in the current iteration.
- */
-if (pagesize_bits > 1 && tmppages > 0) {
-migration_rate_limit();
-}
+if (release_lock) {
+qemu_mutex_lock(&rs->bitmap_mutex);
 }
+
+if (tmppages < 0) {
+return tmppages;
+}
+
 pss->page = migration_bitmap_find_dirty(rs, pss->block, pss->page);
 } while ((pss->page < hostpage_boundary) &&
  offset_in_ramblock(pss->block,
-- 
2.32.0




[PATCH 06/14] migration: Use atomic ops properly for page accountings

2022-09-20 Thread Peter Xu
To prepare for thread-safety on page accountings, at least the counters
below need to be accessed atomically; they are:

ram_counters.transferred
ram_counters.duplicate
ram_counters.normal
ram_counters.postcopy_bytes

There are a lot of other counters, but they won't be accessed outside the
migration thread, so they're still safe to access without atomic
ops.

Signed-off-by: Peter Xu 
---
 migration/migration.c | 10 +-
 migration/multifd.c   |  2 +-
 migration/ram.c   | 29 +++--
 3 files changed, 21 insertions(+), 20 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 07c74a79a2..0eacc0c99b 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1048,13 +1048,13 @@ static void populate_ram_info(MigrationInfo *info, MigrationState *s)
 
 info->has_ram = true;
 info->ram = g_malloc0(sizeof(*info->ram));
-info->ram->transferred = ram_counters.transferred;
+info->ram->transferred = qatomic_read(&ram_counters.transferred);
 info->ram->total = ram_bytes_total();
-info->ram->duplicate = ram_counters.duplicate;
+info->ram->duplicate = qatomic_read(&ram_counters.duplicate);
 /* legacy value.  It is not used anymore */
 info->ram->skipped = 0;
-info->ram->normal = ram_counters.normal;
-info->ram->normal_bytes = ram_counters.normal * page_size;
+info->ram->normal = qatomic_read(&ram_counters.normal);
+info->ram->normal_bytes = info->ram->normal * page_size;
 info->ram->mbps = s->mbps;
 info->ram->dirty_sync_count = ram_counters.dirty_sync_count;
 info->ram->dirty_sync_missed_zero_copy =
@@ -1065,7 +1065,7 @@ static void populate_ram_info(MigrationInfo *info, MigrationState *s)
 info->ram->pages_per_second = s->pages_per_second;
 info->ram->precopy_bytes = ram_counters.precopy_bytes;
 info->ram->downtime_bytes = ram_counters.downtime_bytes;
-info->ram->postcopy_bytes = ram_counters.postcopy_bytes;
+info->ram->postcopy_bytes = qatomic_read(&ram_counters.postcopy_bytes);
 
 if (migrate_use_xbzrle()) {
 info->has_xbzrle_cache = true;
diff --git a/migration/multifd.c b/migration/multifd.c
index 586ddc9d65..460326acd4 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -437,7 +437,7 @@ static int multifd_send_pages(QEMUFile *f)
 + p->packet_len;
 qemu_file_acct_rate_limit(f, transferred);
 ram_counters.multifd_bytes += transferred;
-ram_counters.transferred += transferred;
+qatomic_add(&ram_counters.transferred, transferred);
 qemu_mutex_unlock(>mutex);
 qemu_sem_post(>sem);
 
diff --git a/migration/ram.c b/migration/ram.c
index 6e7de6087a..5bd3d76bf0 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -432,11 +432,11 @@ static void ram_transferred_add(uint64_t bytes)
 if (runstate_is_running()) {
 ram_counters.precopy_bytes += bytes;
 } else if (migration_in_postcopy()) {
-ram_counters.postcopy_bytes += bytes;
+qatomic_add(&ram_counters.postcopy_bytes, bytes);
 } else {
 ram_counters.downtime_bytes += bytes;
 }
-ram_counters.transferred += bytes;
+qatomic_add(&ram_counters.transferred, bytes);
 }
 
 void dirty_sync_missed_zero_copy(void)
@@ -725,7 +725,7 @@ void mig_throttle_counter_reset(void)
 
 rs->time_last_bitmap_sync = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
 rs->num_dirty_pages_period = 0;
-rs->bytes_xfer_prev = ram_counters.transferred;
+rs->bytes_xfer_prev = qatomic_read(&ram_counters.transferred);
 }
 
 /**
@@ -1085,8 +1085,9 @@ uint64_t ram_pagesize_summary(void)
 
 uint64_t ram_get_total_transferred_pages(void)
 {
-return  ram_counters.normal + ram_counters.duplicate +
-compression_counters.pages + xbzrle_counters.pages;
+return  qatomic_read(&ram_counters.normal) +
+qatomic_read(&ram_counters.duplicate) +
+compression_counters.pages + xbzrle_counters.pages;
 }
 
 static void migration_update_rates(RAMState *rs, int64_t end_time)
@@ -1145,8 +1146,8 @@ static void migration_trigger_throttle(RAMState *rs)
 {
 MigrationState *s = migrate_get_current();
 uint64_t threshold = s->parameters.throttle_trigger_threshold;
-
-uint64_t bytes_xfer_period = ram_counters.transferred - rs->bytes_xfer_prev;
+uint64_t bytes_xfer_period =
+qatomic_read(&ram_counters.transferred) - rs->bytes_xfer_prev;
uint64_t bytes_dirty_period = rs->num_dirty_pages_period * TARGET_PAGE_SIZE;
 uint64_t bytes_dirty_threshold = bytes_xfer_period * threshold / 100;
 
@@ -1285,7 +1286,7 @@ static int save_zero_page(RAMState *rs, RAMBlock *block, ram_addr_t offset)
 int len = save_zero_page_to_file(rs, rs->f, block, offset);
 
 if (len) {
-ram_counters.duplicate++;
+qatomic_inc(&ram_counters.duplicate);
 ram_transferred_add(len);
 return 1;
 }
@@ -1322,9 +1323,9 @@ static bool control_save_page(RAMState *rs, RAMBlock *block, ram_addr_t offset,
 }
 

[PATCH 04/14] migration: Remove RAMState.f references in compression code

2022-09-20 Thread Peter Xu
Remove references to RAMState.f in compress_page_with_multi_thread() and
flush_compressed_data().

Compression code by default isn't compatible with having >1 channels (or it
won't currently know which channel to flush the compressed data to), so to
make it simple we always flush on the default to_dst_file port until
someone wants to add >1 ports support, as rs->f right now can really
change (after postcopy preempt is introduced).

There should be no functional change at all after this patch is applied,
since as long as rs->f is referenced in the compression code, it must be
to_dst_file.

Signed-off-by: Peter Xu 
---
 migration/ram.c | 12 +++-
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 62ff2c1469..8303252b6d 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1461,6 +1461,7 @@ static bool save_page_use_compression(RAMState *rs);
 
 static void flush_compressed_data(RAMState *rs)
 {
+MigrationState *ms = migrate_get_current();
 int idx, len, thread_count;
 
 if (!save_page_use_compression(rs)) {
@@ -1479,7 +1480,7 @@ static void flush_compressed_data(RAMState *rs)
 for (idx = 0; idx < thread_count; idx++) {
qemu_mutex_lock(&comp_param[idx].mutex);
 if (!comp_param[idx].quit) {
-len = qemu_put_qemu_file(rs->f, comp_param[idx].file);
+len = qemu_put_qemu_file(ms->to_dst_file, comp_param[idx].file);
 /*
  * it's safe to fetch zero_page without holding comp_done_lock
  * as there is no further request submitted to the thread,
@@ -1498,11 +1499,11 @@ static inline void set_compress_params(CompressParam *param, RAMBlock *block,
 param->offset = offset;
 }
 
-static int compress_page_with_multi_thread(RAMState *rs, RAMBlock *block,
-   ram_addr_t offset)
+static int compress_page_with_multi_thread(RAMBlock *block, ram_addr_t offset)
 {
 int idx, thread_count, bytes_xmit = -1, pages = -1;
 bool wait = migrate_compress_wait_thread();
+MigrationState *ms = migrate_get_current();
 
 thread_count = migrate_compress_threads();
qemu_mutex_lock(&comp_done_lock);
@@ -1510,7 +1511,8 @@ retry:
 for (idx = 0; idx < thread_count; idx++) {
 if (comp_param[idx].done) {
 comp_param[idx].done = false;
-bytes_xmit = qemu_put_qemu_file(rs->f, comp_param[idx].file);
+bytes_xmit = qemu_put_qemu_file(ms->to_dst_file,
+comp_param[idx].file);
qemu_mutex_lock(&comp_param[idx].mutex);
set_compress_params(&comp_param[idx], block, offset);
qemu_cond_signal(&comp_param[idx].cond);
@@ -2263,7 +2265,7 @@ static bool save_compress_page(RAMState *rs, RAMBlock *block, ram_addr_t offset)
 return false;
 }
 
-if (compress_page_with_multi_thread(rs, block, offset) > 0) {
+if (compress_page_with_multi_thread(block, offset) > 0) {
 return true;
 }
 
-- 
2.32.0




[PATCH] qboot: rebuild based on latest commit

2022-09-20 Thread Jason A. Donenfeld
df22fbb751 ("qboot: update to latest submodule") updated the qboot
submodule from a5300c49 to 8ca302e8. However, qboot isn't built during
QEMU's build process but rather is included in binary form, so rebuild
it here.

Cc: Paolo Bonzini 
Signed-off-by: Jason A. Donenfeld 
---
Paolo - I have no idea what the procedure for doing this is. If you'd
prefer to rebuild this yourself, that'd make sense to me, since the
binary diff is kind of hard to verify. -Jason

 pc-bios/qboot.rom | Bin 65536 -> 65536 bytes
 1 file changed, 0 insertions(+), 0 deletions(-)

diff --git a/pc-bios/qboot.rom b/pc-bios/qboot.rom
index 
7634106a0766913077e88dfcb1021c1168dcad3c..684000f57aad8925c81890d97345500eb266827e
 100644
GIT binary patch
delta 8383
[binary delta data elided]

Re: [PATCH 13/14] migration: Remove old preempt code around state maintenance

2022-09-20 Thread Peter Xu
On Tue, Sep 20, 2022 at 06:52:27PM -0400, Peter Xu wrote:
> With the new code to send pages in the rp-return thread, there's little
> point in keeping lots of the old code maintaining the preempt state in the
> migration thread, because the new way should always be faster.
> 
> Then if we'll always send pages in the rp-return thread anyway, we don't
> need the logic to maintain preempt state anymore, because now we serialize
> things using the mutex directly instead of using those fields.
> 
> It's very unfortunate to have had this code for only a short period, but
> it was still an intermediate step at which we noticed the next bottleneck
> on the migration thread.  Now the best we can do is drop unnecessary code
> as long as the new code is stable, to reduce the burden.  It's actually a
> good thing because the new "sending page in rp-return thread" model is
> (IMHO) even cleaner and with better performance.
> 
> Remove the old code that was responsible for maintaining preempt states;
> in the meantime also remove the x-postcopy-preempt-break-huge parameter,
> because with concurrent sender threads we don't really need to break-huge
> anymore.
> 
> Signed-off-by: Peter Xu 
> ---
>  migration/migration.c |   2 -
>  migration/ram.c   | 258 +-
>  2 files changed, 3 insertions(+), 257 deletions(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index fae8fd378b..698fd94591 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -4399,8 +4399,6 @@ static Property migration_properties[] = {
>  DEFINE_PROP_SIZE("announce-step", MigrationState,
>parameters.announce_step,
>DEFAULT_MIGRATE_ANNOUNCE_STEP),
> -DEFINE_PROP_BOOL("x-postcopy-preempt-break-huge", MigrationState,
> -  postcopy_preempt_break_huge, true),

Forgot to drop the variable altogether:

diff --git a/migration/migration.h b/migration/migration.h
index cdad8aceaa..ae4ffd3454 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -340,13 +340,6 @@ struct MigrationState {
 bool send_configuration;
 /* Whether we send section footer during migration */
 bool send_section_footer;
-/*
- * Whether we allow break sending huge pages when postcopy preempt is
- * enabled.  When disabled, we won't interrupt precopy within sending a
- * host huge page, which is the old behavior of vanilla postcopy.
- * NOTE: this parameter is ignored if postcopy preempt is not enabled.
- */
-bool postcopy_preempt_break_huge;
 
 /* Needed by postcopy-pause state */
 QemuSemaphore postcopy_pause_sem;

Will squash this in in next version.

-- 
Peter Xu




Re: [PATCH] qboot: update to latest submodule

2022-09-20 Thread Jason A. Donenfeld
On Wed, Sep 21, 2022 at 12:22 AM Paolo Bonzini  wrote:
> Yeah the mirroring from GitHub to (my personal fork on) Gitlab was failing 
> because git:// is not supported anymore. Changed to https:// and everybody is 
> happy.

Ahh, bingo.

> Btw I saw your other patches, will get to it tomorrow.

Super. Thanks, looking forward.

Jason



Re: [PATCH] qboot: update to latest submodule

2022-09-20 Thread Paolo Bonzini
On Wed, 21 Sep 2022 at 00:11, Jason A. Donenfeld  wrote:

> On Tue, Sep 20, 2022 at 11:57:09PM +0200, Paolo Bonzini wrote:
> > It should have been automatic, there's mirroring set up.
>
> Hm, something is weird. Gitlab says "This project is mirrored from
> https://gitlab.com/bonzini/qboot.git. Pull mirroring updated 28 minutes
> ago." yet the commit is much older than 28 minutes ago. Backend issue of
> sorts?
>

Yeah the mirroring from GitHub to (my personal fork on) Gitlab was failing
because git:// is not supported anymore. Changed to https:// and everybody
is happy.

Btw I saw your other patches, will get to it tomorrow.

Paolo

>


Re: [PATCH v4 for 7.2 00/22] virtio-gpio and various virtio cleanups

2022-09-20 Thread Alex Bennée


"Michael S. Tsirkin"  writes:

> On Tue, Sep 20, 2022 at 02:25:48PM -0400, Stefan Hajnoczi wrote:
>> On Tue, 20 Sept 2022 at 10:18, Alex Bennée  wrote:
>> >
>> >
>> > Stefan Hajnoczi  writes:
>> >
>> > > [[PGP Signed Part:Undecided]]
>> > > On Fri, Sep 16, 2022 at 07:51:40AM +0100, Alex Bennée wrote:
>> > >>
>> > >> Alex Bennée  writes:
>> > >>
>> > >> > Hi,
>> > >> >
>> > >> > This is an update to the previous series which fixes the last few
>> > >> > niggling CI failures I was seeing.
>> > >> >
>> > >> >Subject: [PATCH v3 for 7.2 00/21] virtio-gpio and various virtio 
>> > >> > cleanups
>> > >> >Date: Tue, 26 Jul 2022 20:21:29 +0100
>> > >> >Message-Id: <20220726192150.2435175-1-alex.ben...@linaro.org>
>> > >> >
>> > >> > The CI failures were tricky to track down because they didn't occur
>> > >> > locally but after patching to dump backtraces they all seem to involve
>> > >> > updates to virtio_set_status() as the machine was torn down. I think
>> > >> > patch that switches all users to use virtio_device_started() along
>> > >> > with consistent checking of vhost_dev->started stops this from
>> > >> > happening. The clean-up seems worthwhile in reducing boilerplate
>> > >> > anyway.
>> > >> >
>> > >> > The following patches still need review:
>> > >> >
>> > >> >   - tests/qtest: enable tests for virtio-gpio
>> > >> >   - tests/qtest: add a get_features op to vhost-user-test
>> > >> >   - tests/qtest: implement stub for VHOST_USER_GET_CONFIG
>> > >> >   - tests/qtest: add assert to catch bad features
>> > >> >   - tests/qtest: plain g_assert for VHOST_USER_F_PROTOCOL_FEATURES
>> > >> >   - tests/qtest: catch unhandled vhost-user messages
>> > >> >   - tests/qtest: use qos_printf instead of g_test_message
>> > >> >   - tests/qtest: pass stdout/stderr down to subtests
>> > >> >   - hw/virtio: move vhd->started check into helper and add FIXME
>> > >> >   - hw/virtio: move vm_running check to virtio_device_started
>> > >> >   - hw/virtio: add some vhost-user trace events
>> > >> >   - hw/virtio: log potentially buggy guest drivers
>> > >> >   - hw/virtio: fix some coding style issues
>> > >> >   - include/hw: document vhost_dev feature life-cycle
>> > >> >   - include/hw/virtio: more comment for VIRTIO_F_BAD_FEATURE
>> > >> >   - hw/virtio: fix vhost_user_read tracepoint
>> > >> >   - hw/virtio: handle un-configured shutdown in virtio-pci
>> > >> >   - hw/virtio: gracefully handle unset vhost_dev vdev
>> > >> >   - hw/virtio: incorporate backend features in features
>> > >> 
>> > >>
>> > >> Ping?
>> > >
>> > > Who are you pinging?
>> > >
>> > > Only qemu-devel is on To and there are a bunch of people on Cc.
>> >
>> > Well I guess MST is the maintainer for the sub-system but I was hoping
>> > other virtio contributors had some sort of view. The process of
>> > up-streaming a simple vhost-user stub has flushed out all sorts of
>> > stuff.
>> 
>> Okay, moving MST to To in case it helps get his attention.
>> 
>> Thanks,
>> Stefan
>
> thanks, it's in my queue, just pulling in backlog that built up
> during the forum.

Thanks, doing the same myself ;-)

-- 
Alex Bennée



Re: [PATCH 8/9] softmmu/physmem: Let SysBusState absorb memory region and address space singletons

2022-09-20 Thread Bernhard Beschow
Am 20. September 2022 08:50:01 UTC schrieb BALATON Zoltan :
>
>
>On Tue, 20 Sep 2022, Philippe Mathieu-Daudé via wrote:
>
>> On 20/9/22 01:17, Bernhard Beschow wrote:
>>> These singletons are actually properties of the system bus but so far it
>>> hasn't been modelled that way. Fix this to make this relationship very
>>> obvious.
>>> 
>>> The idea of the patch is to restrain futher proliferation of the use of
>>> get_system_memory() and get_system_io() which are "temprary interfaces"
>> 
>> "further", "temporary"
>> 
>>> "until a proper bus interface is available". This should now be the
>>> case.
>>> 
>>> Note that the new attributes are values rather than a pointers. This
>>> trades pointer dereferences for pointer arithmetic. The idea is to
>>> reduce cache misses - a rule of thumb says that every pointer
>>> dereference causes a cache miss while arithmetic is basically free.
>>> 
>>> Signed-off-by: Bernhard Beschow 
>>> ---
>>>   include/exec/address-spaces.h | 19 ---
>>>   include/hw/sysbus.h   |  6 +
>>>   softmmu/physmem.c | 46 ++-
>>>   3 files changed, 45 insertions(+), 26 deletions(-)
>>> 
>>> diff --git a/include/exec/address-spaces.h b/include/exec/address-spaces.h
>>> index d5c8cbd718..b31bd8dcf0 100644
>>> --- a/include/exec/address-spaces.h
>>> +++ b/include/exec/address-spaces.h
>>> @@ -23,17 +23,28 @@
>>> #ifndef CONFIG_USER_ONLY
>>>   -/* Get the root memory region.  This interface should only be used 
>>> temporarily
>>> - * until a proper bus interface is available.
>>> +/**
>>> + * Get the root memory region.  This is a legacy function, provided for
>>> + * compatibility. Prefer using SysBusState::system_memory directly.
>>>*/
>>>   MemoryRegion *get_system_memory(void);
>> 
>>> diff --git a/include/hw/sysbus.h b/include/hw/sysbus.h
>>> index 5bb3b88501..516e9091dc 100644
>>> --- a/include/hw/sysbus.h
>>> +++ b/include/hw/sysbus.h
>>> @@ -17,6 +17,12 @@ struct SysBusState {
>>>   /*< private >*/
>>>   BusState parent_obj;
>>>   /*< public >*/
>>> +
>>> +MemoryRegion system_memory;
>>> +MemoryRegion system_io;
>>> +
>>> +AddressSpace address_space_io;
>>> +AddressSpace address_space_memory;
>> 
>> Alternatively (renaming doc accordingly):
>> 
>>   struct {
>>   MemoryRegion mr;
>>   AddressSpace as;
>>   } io, memory;
>
>Do we really need that? Isn't mr just the same as as.root, so it would be 
>enough to store as only? Or does caching mr, instead of going through as to 
>get it, save time when accessing these?

as.root is just a pointer. That's why we need mr as a value as well.

> Now we'll go through SysBusState anyway instead of accessing globals so is 
> there a performance impact?

Good question. Since both attributes are now next to each other I'd hope for 
an improvement ;-) That depends on many things of course, such as whether they 
are located in the same cache line. As written in the commit message, I tried 
to minimize pointer dereferences.

Best regards,
Bernhard
>
>Regards,
>BALATON Zoltan
>
>>>   };
>>> #define TYPE_SYS_BUS_DEVICE "sys-bus-device"
>>> diff --git a/softmmu/physmem.c b/softmmu/physmem.c
>>> index 0ac920d446..07e9a9171c 100644
>>> --- a/softmmu/physmem.c
>>> +++ b/softmmu/physmem.c
>>> @@ -86,12 +86,6 @@
>>>*/
>>>   RAMList ram_list = { .blocks = QLIST_HEAD_INITIALIZER(ram_list.blocks) };
>>>   -static MemoryRegion *system_memory;
>>> -static MemoryRegion *system_io;
>>> -
>>> -static AddressSpace address_space_io;
>>> -static AddressSpace address_space_memory;
>>> -
>>>   static MemoryRegion io_mem_unassigned;
>>> typedef struct PhysPageEntry PhysPageEntry;
>>> @@ -146,7 +140,7 @@ typedef struct subpage_t {
>>>   #define PHYS_SECTION_UNASSIGNED 0
>>> static void io_mem_init(void);
>>> -static void memory_map_init(void);
>>> +static void memory_map_init(SysBusState *sysbus);
>>>   static void tcg_log_global_after_sync(MemoryListener *listener);
>>>   static void tcg_commit(MemoryListener *listener);
>>>   @@ -2667,37 +2661,45 @@ static void tcg_commit(MemoryListener *listener)
>>>   tlb_flush(cpuas->cpu);
>>>   }
>>>   -static void memory_map_init(void)
>>> +static void memory_map_init(SysBusState *sysbus)
>>>   {
>> 
>> No need to pass a singleton by argument.
>> 
>>   assert(current_machine);
>> 
>> You can use get_system_memory() and get_system_io() in place :)
>> 
>> LGTM otherwise, great!
>> 
>>> -system_memory = g_malloc(sizeof(*system_memory));
>>> +MemoryRegion *system_memory = &sysbus->system_memory;
>>> +MemoryRegion *system_io = &sysbus->system_io;
>>> memory_region_init(system_memory, NULL, "system", UINT64_MAX);
>>> -address_space_init(&address_space_memory, system_memory, "memory");
>>> +address_space_init(&sysbus->address_space_memory, system_memory, 
>>> "memory");
>>>   -system_io = g_malloc(sizeof(*system_io));
>>>   memory_region_init_io(system_io, NULL, &unassigned_io_ops, NULL, "io",
>>>

Re: [PATCH] qboot: update to latest submodule

2022-09-20 Thread Jason A. Donenfeld
On Mon, Sep 19, 2022 at 04:35:54PM +0200, Jason A. Donenfeld wrote:
> FYI, that commit made it to:
> 
> https://github.com/bonzini/qboot
> 
> But wasn't pushed to:
> 
> https://github.com/qemu/qboot
> https://gitlab.com/qemu-project/qboot
> https://git.qemu.org/?p=qboot.git;a=summary
> 
> I have no idea what's canonical, except that the submodule in the qemu
> checkout seems to point to the gitlab instance.
> 

With my prior email being ignored, this played out exactly as I
predicted it would:

Fetching submodule roms/qboot
fatal: remote error: upload-pack: not our ref 
8ca302e86d685fa05b16e2b20243da319941
Errors during submodule fetch:
roms/qboot

Can somebody push https://github.com/bonzini/qboot to
https://gitlab.com/qemu-project/qboot please? It will only take a
second.

Thanks,
Jason



Re: [PATCH 0/9] Deprecate sysbus_get_default() and get_system_memory() et. al

2022-09-20 Thread Bernhard Beschow
Am 20. September 2022 15:36:26 UTC schrieb Mark Cave-Ayland 
:
>On 20/09/2022 10:55, Peter Maydell wrote:
>
>> On Tue, 20 Sept 2022 at 00:18, Bernhard Beschow  wrote:
>>> 
>>> In address-spaces.h it can be read that get_system_memory() and
>>> get_system_io() are temporary interfaces which "should only be used 
>>> temporarily
>>> until a proper bus interface is available". This statement certainly 
>>> extends to
>>> the address_space_memory and address_space_io singletons.
>> 
>> This is a long standing "we never really completed a cleanup"...
>> 
>>> This series attempts
>>> to stop further proliferation of their use by turning TYPE_SYSTEM_BUS into 
>>> an
>>> object-oriented, "proper bus interface" inspired by PCIBus.
>>> 
>>> While at it, also the main_system_bus singleton is turned into an attribute 
>>> of
>>> MachineState. Together, this resolves five singletons in total, making the
>>> ownership relations much more obvious which helps comprehension.
>> 
>> ...but I don't think this is the direction we want to go.
>> Overall the reason that the "system memory" and "system IO"
>> singletons are weird is that in theory they should not be necessary
>> at all -- board code should create devices and map them into an
>> entirely arbitrary MemoryRegion or set of MemoryRegions corresponding
>> to address space(s) for the CPU and for DMA-capable devices. But we
>> keep them around because
>>   (a) there is a ton of legacy code that assumes there's only one
>>   address space in the system and this is it
>>   (b) when modelling the kind of board where there really is only
>>   one address space, having the 'system memory' global makes
>>   the APIs for creating and connecting devices a lot simpler
>> 
>> Retaining the whole-system singleton but shoving it into MachineState
>> doesn't really change much, IMHO.
>> 
>> More generally, sysbus is rather weird because it isn't really a
>> bus. Every device in the system of TYPE_SYS_BUS_DEVICE is "on"
>> the unique TYPE_SYSTEM_BUS bus, but that doesn't mean they're
>> all in the same address space or that in real hardware they'd
>> all be on the same bus. sysbus has essentially degraded into a
>> hack for having devices get reset. I really really need to make
>> some time to have another look at reset handling. If we get that
>> right then I think it's probably possible to collapse the few
>> things TYPE_SYS_BUS_DEVICE does that TYPE_DEVICE does not down
>> into TYPE_DEVICE and get rid of sysbus altogether...
>
>Following on from one of the discussion points from Alex's KVM Forum BoF 
>session: I think longer term what we need to aim for is for QEMU machines to 
>define their own address spaces, and then bind those address spaces containing 
>memory-mapped devices to one or more CPUs.

Isn't that more or less impossible with singletons?

>
>Once this in place, as Peter notes above it just remains to solve the reset 
>problem and then it becomes possible to eliminate sysbus altogether as 
>everything else can already be managed by qdev/QOM.

Also see my reply to Peter.

Thanks,
Bernhard
>
>
>ATB,
>
>Mark.




Re: [PATCH 1/9] hw/riscv/sifive_e: Fix inheritance of SiFiveEState

2022-09-20 Thread Bernhard Beschow
Am 20. September 2022 11:36:47 UTC schrieb Markus Armbruster 
:
>Alistair Francis  writes:
>
>> On Tue, Sep 20, 2022 at 9:18 AM Bernhard Beschow  wrote:
>>>
>>> SiFiveEState inherits from SysBusDevice while it's TypeInfo claims it to
>>> inherit from TYPE_MACHINE. This is an inconsistency which can cause
>>> undefined behavior such as memory corruption.
>>>
>>> Change SiFiveEState to inherit from MachineState since it is registered
>>> as a machine.
>>>
>>> Signed-off-by: Bernhard Beschow 
>>
>> Reviewed-by: Alistair Francis 
>
>To the SiFive maintainers: since this is a bug fix, let's merge it right
>away.

I could repost this particular patch with the three new tags (incl. Fixes) if 
desired.

Best regards,
Bernhard
>




Re: [PATCH v2 28/39] hw/pci-host: pnv_phb{3, 4}: Fix heap out-of-bound access failure

2022-09-20 Thread Bin Meng
On Tue, Sep 20, 2022 at 11:40 PM Daniel Henrique Barboza
 wrote:
>
> Bin,
>
> Since I'll send a ppc pull request shortly, I'll queue up both this and patch 
> 27 via
> the ppc tree. These are good fixes that are independent of what happens with 
> the
> 'tests/qtest: Enable running qtest on Windows' series.
>

Thank you Daniel.

Regards,
Bin



[PATCH 12/14] migration: Send requested page directly in rp-return thread

2022-09-20 Thread Peter Xu
With all the facilities ready, send the requested page directly in the
rp-return thread rather than queuing it in the request queue, if and only
if postcopy preempt is enabled.  It can do so because it uses a separate
channel for sending urgent pages.  The only shared data is the bitmap, and
it's protected by the bitmap_mutex.

Note that since we're moving the ownership of the urgent channel from the
migration thread to the rp thread, the rp thread also becomes responsible
for managing the qemufile, e.g. properly closing it when pausing migration
happens.  For this, let migration_release_from_dst_file() cover shutdown
of the urgent channel too, renaming it to migration_release_dst_files() to
better show what it does.

Signed-off-by: Peter Xu 
---
 migration/migration.c |  35 +++--
 migration/ram.c   | 112 ++
 2 files changed, 131 insertions(+), 16 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 0eacc0c99b..fae8fd378b 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2845,8 +2845,11 @@ static int migrate_handle_rp_resume_ack(MigrationState 
*s, uint32_t value)
 return 0;
 }
 
-/* Release ms->rp_state.from_dst_file in a safe way */
-static void migration_release_from_dst_file(MigrationState *ms)
+/*
+ * Release ms->rp_state.from_dst_file (and postcopy_qemufile_src if
+ * existed) in a safe way.
+ */
+static void migration_release_dst_files(MigrationState *ms)
 {
 QEMUFile *file;
 
@@ -2859,6 +2862,18 @@ static void 
migration_release_from_dst_file(MigrationState *ms)
 ms->rp_state.from_dst_file = NULL;
 }
 
+/*
+ * Do the same to postcopy fast path socket too if there is.  No
+ * locking needed because this qemufile should only be managed by
+ * return path thread.
+ */
+if (ms->postcopy_qemufile_src) {
+migration_ioc_unregister_yank_from_file(ms->postcopy_qemufile_src);
+qemu_file_shutdown(ms->postcopy_qemufile_src);
+qemu_fclose(ms->postcopy_qemufile_src);
+ms->postcopy_qemufile_src = NULL;
+}
+
 qemu_fclose(file);
 }
 
@@ -3003,7 +3018,7 @@ out:
  * Maybe there is something we can do: it looks like a
  * network down issue, and we pause for a recovery.
  */
-migration_release_from_dst_file(ms);
+migration_release_dst_files(ms);
 rp = NULL;
 if (postcopy_pause_return_path_thread(ms)) {
 /*
@@ -3021,7 +3036,7 @@ out:
 }
 
 trace_source_return_path_thread_end();
-migration_release_from_dst_file(ms);
+migration_release_dst_files(ms);
 rcu_unregister_thread();
 return NULL;
 }
@@ -3544,18 +3559,6 @@ static MigThrError postcopy_pause(MigrationState *s)
 qemu_file_shutdown(file);
 qemu_fclose(file);
 
-/*
- * Do the same to postcopy fast path socket too if there is.  No
- * locking needed because no racer as long as we do this before setting
- * status to paused.
- */
-if (s->postcopy_qemufile_src) {
-migration_ioc_unregister_yank_from_file(s->postcopy_qemufile_src);
-qemu_file_shutdown(s->postcopy_qemufile_src);
-qemu_fclose(s->postcopy_qemufile_src);
-s->postcopy_qemufile_src = NULL;
-}
-
migrate_set_state(&s->state, s->state,
   MIGRATION_STATUS_POSTCOPY_PAUSED);
 
diff --git a/migration/ram.c b/migration/ram.c
index fdcb61a2c8..fd301d793c 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -539,6 +539,8 @@ static QemuThread *decompress_threads;
 static QemuMutex decomp_done_lock;
 static QemuCond decomp_done_cond;
 
+static int ram_save_host_page_urgent(PageSearchStatus *pss);
+
 static bool do_compress_ram_page(QEMUFile *f, z_stream *stream, RAMBlock 
*block,
  ram_addr_t offset, uint8_t *source_buf);
 
@@ -553,6 +555,16 @@ static void pss_init(PageSearchStatus *pss, RAMBlock *rb, 
ram_addr_t page)
 pss->complete_round = false;
 }
 
+/*
+ * Check whether two PSSs are actively sending the same page.  Return true
+ * if it is, false otherwise.
+ */
+static bool pss_overlap(PageSearchStatus *pss1, PageSearchStatus *pss2)
+{
+return pss1->host_page_sending && pss2->host_page_sending &&
+(pss1->host_page_start == pss2->host_page_start);
+}
+
 static void *do_data_compress(void *opaque)
 {
 CompressParam *param = opaque;
@@ -2253,6 +2265,57 @@ int ram_save_queue_pages(const char *rbname, ram_addr_t 
start, ram_addr_t len)
 return -1;
 }
 
+/*
+ * When with postcopy preempt, we send back the page directly in the
+ * rp-return thread.
+ */
+if (postcopy_preempt_active()) {
+ram_addr_t page_start = start >> TARGET_PAGE_BITS;
+size_t page_size = qemu_ram_pagesize(ramblock);
+PageSearchStatus *pss = &ram_state->pss[RAM_CHANNEL_POSTCOPY];
+int ret = 0;
+
+  

Re: [PULL v2 0/9] loongarch-to-apply queue

2022-09-20 Thread Stefan Hajnoczi
Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/7.2 for any 
user-visible changes.


signature.asc
Description: PGP signature


[PULL 16/17] hw/pci-host: pnv_phb{3, 4}: Fix heap out-of-bound access failure

2022-09-20 Thread Daniel Henrique Barboza
From: Xuzhou Cheng 

pnv_phb3_root_bus_info and pnv_phb4_root_bus_info are missing the
instance_size initialization. This results in accessing out-of-bound
memory when setting 'chip-id' and 'phb-id', and eventually crashes
glib's malloc functionality with the following message:

  "qemu-system-ppc64: GLib: ../glib-2.72.3/glib/gmem.c:131: failed to allocate 
3232 bytes"

This issue was noticed only when running qtests with QEMU Windows
32-bit executable. Windows 64-bit, Linux 32/64-bit do not expose
this bug though.

Fixes: 9ae1329ee2fe ("ppc/pnv: Add models for POWER8 PHB3 PCIe Host bridge")
Fixes: 4f9924c4d4cf ("ppc/pnv: Add models for POWER9 PHB4 PCIe Host bridge")
Reviewed-by: Cédric Le Goater 
Signed-off-by: Xuzhou Cheng 
Signed-off-by: Bin Meng 
Message-Id: <20220920103159.1865256-29-bmeng...@gmail.com>
Signed-off-by: Daniel Henrique Barboza 
---
 hw/pci-host/pnv_phb3.c | 1 +
 hw/pci-host/pnv_phb4.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/hw/pci-host/pnv_phb3.c b/hw/pci-host/pnv_phb3.c
index af8575c007..9054c393a2 100644
--- a/hw/pci-host/pnv_phb3.c
+++ b/hw/pci-host/pnv_phb3.c
@@ -1169,6 +1169,7 @@ static void pnv_phb3_root_bus_class_init(ObjectClass 
*klass, void *data)
 static const TypeInfo pnv_phb3_root_bus_info = {
 .name = TYPE_PNV_PHB3_ROOT_BUS,
 .parent = TYPE_PCIE_BUS,
+.instance_size = sizeof(PnvPHB3RootBus),
 .class_init = pnv_phb3_root_bus_class_init,
 };
 
diff --git a/hw/pci-host/pnv_phb4.c b/hw/pci-host/pnv_phb4.c
index 824e1a73fb..ccbde841fc 100644
--- a/hw/pci-host/pnv_phb4.c
+++ b/hw/pci-host/pnv_phb4.c
@@ -1773,6 +1773,7 @@ static void pnv_phb4_root_bus_class_init(ObjectClass 
*klass, void *data)
 static const TypeInfo pnv_phb4_root_bus_info = {
 .name = TYPE_PNV_PHB4_ROOT_BUS,
 .parent = TYPE_PCIE_BUS,
+.instance_size = sizeof(PnvPHB4RootBus),
 .class_init = pnv_phb4_root_bus_class_init,
 };
 
-- 
2.37.3




[PATCH 13/14] migration: Remove old preempt code around state maintainance

2022-09-20 Thread Peter Xu
With the new code to send pages in the rp-return thread, there's little
point in keeping the old code that maintains the preempt state in the
migration thread, because the new way should always be faster.

Then, if we'll always send pages in the rp-return thread anyway, we don't
need the logic to maintain preempt state anymore, because now we serialize
things using the mutex directly instead of using those fields.

It's unfortunate to have carried this code for only a short period, but it
was still a necessary intermediate step before we noticed the next
bottleneck in the migration thread.  The best we can do now is to drop the
unnecessary code while the new code is stable, to reduce the burden.  It's
actually a good thing, because the new "send page in rp-return thread"
model is (IMHO) cleaner and performs better.

Remove the old code that was responsible for maintaining preempt state; at
the same time, also remove the x-postcopy-preempt-break-huge parameter,
because with concurrent sender threads we don't really need to break huge
pages anymore.

Signed-off-by: Peter Xu 
---
 migration/migration.c |   2 -
 migration/ram.c   | 258 +-
 2 files changed, 3 insertions(+), 257 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index fae8fd378b..698fd94591 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -4399,8 +4399,6 @@ static Property migration_properties[] = {
 DEFINE_PROP_SIZE("announce-step", MigrationState,
   parameters.announce_step,
   DEFAULT_MIGRATE_ANNOUNCE_STEP),
-DEFINE_PROP_BOOL("x-postcopy-preempt-break-huge", MigrationState,
-  postcopy_preempt_break_huge, true),
 DEFINE_PROP_STRING("tls-creds", MigrationState, parameters.tls_creds),
 DEFINE_PROP_STRING("tls-hostname", MigrationState, 
parameters.tls_hostname),
 DEFINE_PROP_STRING("tls-authz", MigrationState, parameters.tls_authz),
diff --git a/migration/ram.c b/migration/ram.c
index fd301d793c..f42efe02fc 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -343,20 +343,6 @@ struct RAMSrcPageRequest {
 QSIMPLEQ_ENTRY(RAMSrcPageRequest) next_req;
 };
 
-typedef struct {
-/*
- * Cached ramblock/offset values if preempted.  They're only meaningful if
- * preempted==true below.
- */
-RAMBlock *ram_block;
-unsigned long ram_page;
-/*
- * Whether a postcopy preemption just happened.  Will be reset after
- * precopy recovered to background migration.
- */
-bool preempted;
-} PostcopyPreemptState;
-
 /* State of RAM for migration */
 struct RAMState {
 /* QEMUFile used for this migration */
@@ -419,14 +405,6 @@ struct RAMState {
 /* Queue of outstanding page requests from the destination */
 QemuMutex src_page_req_mutex;
 QSIMPLEQ_HEAD(, RAMSrcPageRequest) src_page_requests;
-
-/* Postcopy preemption informations */
-PostcopyPreemptState postcopy_preempt_state;
-/*
- * Current channel we're using on src VM.  Only valid if postcopy-preempt
- * is enabled.
- */
-unsigned int postcopy_channel;
 };
 typedef struct RAMState RAMState;
 
@@ -434,11 +412,6 @@ static RAMState *ram_state;
 
 static NotifierWithReturnList precopy_notifier_list;
 
-static void postcopy_preempt_reset(RAMState *rs)
-{
-memset(&rs->postcopy_preempt_state, 0, sizeof(PostcopyPreemptState));
-}
-
 /* Whether postcopy has queued requests? */
 static bool postcopy_has_request(RAMState *rs)
 {
@@ -544,9 +517,6 @@ static int ram_save_host_page_urgent(PageSearchStatus *pss);
 static bool do_compress_ram_page(QEMUFile *f, z_stream *stream, RAMBlock 
*block,
  ram_addr_t offset, uint8_t *source_buf);
 
-static void postcopy_preempt_restore(RAMState *rs, PageSearchStatus *pss,
- bool postcopy_requested);
-
 /* NOTE: page is the PFN not real ram_addr_t. */
 static void pss_init(PageSearchStatus *pss, RAMBlock *rb, ram_addr_t page)
 {
@@ -2062,55 +2032,6 @@ void ram_write_tracking_stop(void)
 }
 #endif /* defined(__linux__) */
 
-/*
- * Check whether two addr/offset of the ramblock falls onto the same host huge
- * page.  Returns true if so, false otherwise.
- */
-static bool offset_on_same_huge_page(RAMBlock *rb, uint64_t addr1,
- uint64_t addr2)
-{
-size_t page_size = qemu_ram_pagesize(rb);
-
-addr1 = ROUND_DOWN(addr1, page_size);
-addr2 = ROUND_DOWN(addr2, page_size);
-
-return addr1 == addr2;
-}
-
-/*
- * Whether a previous preempted precopy huge page contains current requested
- * page?  Returns true if so, false otherwise.
- *
- * This should really happen very rarely, because it means when we were sending
- * during background migration for postcopy we're sending exactly the page that
- * some vcpu got faulted on on dest node.  When it happens, we probably don't
- * need to do much but drop the request, because we know right after 

[PATCH 11/14] migration: Move last_sent_block into PageSearchStatus

2022-09-20 Thread Peter Xu
Since we use PageSearchStatus to represent a channel, it makes perfect
sense to keep last_sent_block (the state that lets us leverage
RAM_SAVE_FLAG_CONTINUE) per-channel rather than global, because each
channel can be sending different pages on different ramblocks.

Hence move it from RAMState into PageSearchStatus.

Signed-off-by: Peter Xu 
---
 migration/ram.c | 71 -
 1 file changed, 41 insertions(+), 30 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index dbe11e1ace..fdcb61a2c8 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -89,6 +89,8 @@ XBZRLECacheStats xbzrle_counters;
 struct PageSearchStatus {
 /* The migration channel used for a specific host page */
 QEMUFile*pss_channel;
+/* Last block from where we have sent data */
+RAMBlock *last_sent_block;
 /* Current block being searched */
 RAMBlock*block;
 /* Current page to search from */
@@ -368,8 +370,6 @@ struct RAMState {
 int uffdio_fd;
 /* Last block that we have visited searching for dirty pages */
 RAMBlock *last_seen_block;
-/* Last block from where we have sent data */
-RAMBlock *last_sent_block;
 /* Last dirty target page we have sent */
 ram_addr_t last_page;
 /* last ram version we have seen */
@@ -677,16 +677,17 @@ exit:
  *
  * Returns the number of bytes written
  *
- * @f: QEMUFile where to send the data
+ * @pss: current PSS channel status
  * @block: block that contains the page we want to send
  * @offset: offset inside the block for the page
  *  in the lower bits, it contains flags
  */
-static size_t save_page_header(RAMState *rs, QEMUFile *f,  RAMBlock *block,
+static size_t save_page_header(PageSearchStatus *pss, RAMBlock *block,
ram_addr_t offset)
 {
 size_t size, len;
-bool same_block = (block == rs->last_sent_block);
+bool same_block = (block == pss->last_sent_block);
+QEMUFile *f = pss->pss_channel;
 
 if (same_block) {
 offset |= RAM_SAVE_FLAG_CONTINUE;
@@ -699,7 +700,7 @@ static size_t save_page_header(RAMState *rs, QEMUFile *f,  
RAMBlock *block,
 qemu_put_byte(f, len);
 qemu_put_buffer(f, (uint8_t *)block->idstr, len);
 size += 1 + len;
-rs->last_sent_block = block;
+pss->last_sent_block = block;
 }
 return size;
 }
@@ -783,17 +784,19 @@ static void xbzrle_cache_zero_page(RAMState *rs, 
ram_addr_t current_addr)
  *  -1 means that xbzrle would be longer than normal
  *
  * @rs: current RAM state
+ * @pss: current PSS channel
  * @current_data: pointer to the address of the page contents
  * @current_addr: addr of the page
  * @block: block that contains the page we want to send
  * @offset: offset inside the block for the page
  */
-static int save_xbzrle_page(RAMState *rs, QEMUFile *file,
+static int save_xbzrle_page(RAMState *rs, PageSearchStatus *pss,
 uint8_t **current_data, ram_addr_t current_addr,
 RAMBlock *block, ram_addr_t offset)
 {
 int encoded_len = 0, bytes_xbzrle;
 uint8_t *prev_cached_page;
+QEMUFile *file = pss->pss_channel;
 
 if (!cache_is_cached(XBZRLE.cache, current_addr,
  ram_counters.dirty_sync_count)) {
@@ -858,7 +861,7 @@ static int save_xbzrle_page(RAMState *rs, QEMUFile *file,
 }
 
 /* Send XBZRLE based compressed page */
-bytes_xbzrle = save_page_header(rs, file, block,
+bytes_xbzrle = save_page_header(pss, block,
 offset | RAM_SAVE_FLAG_XBZRLE);
 qemu_put_byte(file, ENCODING_FLAG_XBZRLE);
 qemu_put_be16(file, encoded_len);
@@ -1289,19 +1292,19 @@ static void ram_release_page(const char *rbname, 
uint64_t offset)
  * Returns the size of data written to the file, 0 means the page is not
  * a zero page
  *
- * @rs: current RAM state
- * @file: the file where the data is saved
+ * @pss: current PSS channel
  * @block: block that contains the page we want to send
  * @offset: offset inside the block for the page
  */
-static int save_zero_page_to_file(RAMState *rs, QEMUFile *file,
+static int save_zero_page_to_file(PageSearchStatus *pss,
   RAMBlock *block, ram_addr_t offset)
 {
 uint8_t *p = block->host + offset;
+QEMUFile *file = pss->pss_channel;
 int len = 0;
 
 if (buffer_is_zero(p, TARGET_PAGE_SIZE)) {
-len += save_page_header(rs, file, block, offset | RAM_SAVE_FLAG_ZERO);
+len += save_page_header(pss, block, offset | RAM_SAVE_FLAG_ZERO);
 qemu_put_byte(file, 0);
 len += 1;
 ram_release_page(block->idstr, offset);
@@ -1314,14 +1317,14 @@ static int save_zero_page_to_file(RAMState *rs, 
QEMUFile *file,
  *
  * Returns the number of pages written.
  *
- * @rs: current RAM state
+ * @pss: current PSS channel
  * @block: block that contains the page we want to send
  * @offset: offset inside the block for the page
  */

[PATCH 07/14] migration: Teach PSS about host page

2022-09-20 Thread Peter Xu
Migration code has a lot to do with host pages.  Teaching PSS core about
the idea of host page helps a lot and makes the code clean.  Meanwhile,
this prepares for the future changes that can leverage the new PSS helpers
that this patch introduces to send host page in another thread.

Three more fields are introduced for this:

  (1) host_page_sending: this is set to true when QEMU is sending a host
  page, false otherwise.

  (2) host_page_{start|end}: these point to the start/end of host page
  we're sending, and it's only valid when host_page_sending==true.

For example, when we look up the next dirty page on the ramblock, with
host_page_sending==true we'll not look for anything beyond the current
host page boundary.  This can be slightly more efficient than the current
code, because currently we'll set pss->page to the next dirty bit (which
can be beyond the current host page boundary) and then reset it to the
host page boundary if we find it goes beyond that.

With the above, we can easily make migration_bitmap_find_dirty() self
contained by updating pss->page properly.  The rs* parameter is removed
because it wasn't even used in the old code.

When sending a host page, we should use the pss helpers like this:

  - pss_host_page_prepare(pss): called before sending host page
  - pss_within_range(pss): whether we're still working on the cur host page?
  - pss_host_page_finish(pss): called after sending a host page

Then we can use ram_save_target_page() to save one small page.

Currently ram_save_host_page() is still the only user. If there'll be
another function to send host page (e.g. in return path thread) in the
future, it should follow the same style.

Signed-off-by: Peter Xu 
---
 migration/ram.c | 95 +++--
 1 file changed, 76 insertions(+), 19 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 5bd3d76bf0..3f720b6de2 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -474,6 +474,11 @@ struct PageSearchStatus {
  * postcopy pages via postcopy preempt channel.
  */
 bool postcopy_target_channel;
+/* Whether we're sending a host page */
+bool  host_page_sending;
+/* The start/end of current host page.  Only valid if 
host_page_sending==true */
+unsigned long host_page_start;
+unsigned long host_page_end;
 };
 typedef struct PageSearchStatus PageSearchStatus;
 
@@ -851,26 +856,38 @@ static int save_xbzrle_page(RAMState *rs, uint8_t 
**current_data,
 }
 
 /**
- * migration_bitmap_find_dirty: find the next dirty page from start
+ * pss_find_next_dirty: find the next dirty page of current ramblock
  *
- * Returns the page offset within memory region of the start of a dirty page
+ * This function updates pss->page to point to the next dirty page index
+ * within the ramblock to migrate, or the end of ramblock when nothing
+ * found.  Note that when pss->host_page_sending==true it means we're
+ * during sending a host page, so we won't look for dirty page that is
+ * outside the host page boundary.
  *
- * @rs: current RAM state
- * @rb: RAMBlock where to search for dirty pages
- * @start: page where we start the search
+ * @pss: the current page search status
  */
-static inline
-unsigned long migration_bitmap_find_dirty(RAMState *rs, RAMBlock *rb,
-  unsigned long start)
+static void pss_find_next_dirty(PageSearchStatus *pss)
 {
+RAMBlock *rb = pss->block;
 unsigned long size = rb->used_length >> TARGET_PAGE_BITS;
 unsigned long *bitmap = rb->bmap;
 
 if (ramblock_is_ignored(rb)) {
-return size;
+/* Points directly to the end, so we know no dirty page */
+pss->page = size;
+return;
+}
+
+/*
+ * If during sending a host page, only look for dirty pages within the
+ * current host page being send.
+ */
+if (pss->host_page_sending) {
+assert(pss->host_page_end);
+size = MIN(size, pss->host_page_end);
 }
 
-return find_next_bit(bitmap, size, start);
+pss->page = find_next_bit(bitmap, size, pss->page);
 }
 
 static void migration_clear_memory_region_dirty_bitmap(RAMBlock *rb,
@@ -1556,7 +1573,9 @@ static bool find_dirty_block(RAMState *rs, 
PageSearchStatus *pss, bool *again)
 pss->postcopy_requested = false;
 pss->postcopy_target_channel = RAM_CHANNEL_PRECOPY;
 
-pss->page = migration_bitmap_find_dirty(rs, pss->block, pss->page);
+/* Update pss->page for the next dirty bit in ramblock */
+pss_find_next_dirty(pss);
+
 if (pss->complete_round && pss->block == rs->last_seen_block &&
 pss->page >= rs->last_page) {
 /*
@@ -2446,6 +2465,44 @@ static void postcopy_preempt_reset_channel(RAMState *rs)
 }
 }
 
+/* Should be called before sending a host page */
+static void pss_host_page_prepare(PageSearchStatus *pss)
+{
+/* How many guest pages are there in one host page? */
+size_t guest_pfns = qemu_ram_pagesize(pss->block) >> TARGET_PAGE_BITS;

Re: [PATCH 9/9] exec/address-spaces: Inline legacy functions

2022-09-20 Thread Bernhard Beschow
Am 20. September 2022 09:02:41 UTC schrieb BALATON Zoltan :
>
>
>On Tue, 20 Sep 2022, Philippe Mathieu-Daudé via wrote:
>
>> On 20/9/22 01:17, Bernhard Beschow wrote:
>>> The functions just access a global pointer and perform some pointer
>>> arithmetic on top. Allow the compiler to see through this by inlining.
>> 
>> I thought about this while reviewing the previous patch, ...
>> 
>>> Signed-off-by: Bernhard Beschow 
>>> ---
>>>   include/exec/address-spaces.h | 30 ++
>>>   softmmu/physmem.c | 28 
>>>   2 files changed, 26 insertions(+), 32 deletions(-)
>>> 
>>> diff --git a/include/exec/address-spaces.h b/include/exec/address-spaces.h
>>> index b31bd8dcf0..182af27cad 100644
>>> --- a/include/exec/address-spaces.h
>>> +++ b/include/exec/address-spaces.h
>>> @@ -23,29 +23,51 @@
>>> #ifndef CONFIG_USER_ONLY
>>>   +#include "hw/boards.h"
>> 
>> ... but I'm not a fan of including this header here. It is restricted to 
>> system emulation, but still... Let see what the others think.
>
>Had the same thought at first about whether this would break user emulation, 
>but I don't know how that works (and this include is within 
>!CONFIG_USER_ONLY). I've checked in configure now and it seems that softmmu 
>is enabled/disabled with system, which reminded me of a previous conversation 
>where I suggested renaming softmmu to sysemu, as that better shows what it's 
>really used for; maybe the real softmmu part should be split from it, but I 
>don't remember the details. If it still works with --enable-user 
>--disable-system then maybe it's OK, and it's only confusing because of 
>misnaming sysemu as softmmu.

I've compiled all architectures w/o any --{enable,disable}-{user,system} flags 
and I had compile errors only when putting the include outside the guard. So 
this in particular doesn't seem to be a problem.

Best regards,
Bernhard
>
>Regards,
>BALATON Zoltan
>
>>>   /**
>>>* Get the root memory region.  This is a legacy function, provided for
>>>* compatibility. Prefer using SysBusState::system_memory directly.
>>>*/
>>> -MemoryRegion *get_system_memory(void);
>>> +inline MemoryRegion *get_system_memory(void)
>>> +{
>>> +assert(current_machine);
>>> +
>>> +return &current_machine->main_system_bus.system_memory;
>>> +}
>>> /**
>>>* Get the root I/O port region.  This is a legacy function, provided for
>>>* compatibility. Prefer using SysBusState::system_io directly.
>>>*/
>>> -MemoryRegion *get_system_io(void);
>>> +inline MemoryRegion *get_system_io(void)
>>> +{
>>> +assert(current_machine);
>>> +
>>> +return &current_machine->main_system_bus.system_io;
>>> +}
>>> /**
>>>* Get the root memory address space.  This is a legacy function, 
>>> provided for
>>>* compatibility. Prefer using SysBusState::address_space_memory directly.
>>>*/
>>> -AddressSpace *get_address_space_memory(void);
>>> +inline AddressSpace *get_address_space_memory(void)
>>> +{
>>> +assert(current_machine);
>>> +
>>> +return &current_machine->main_system_bus.address_space_memory;
>>> +}
>>> /**
>>>* Get the root I/O port address space.  This is a legacy function, 
>>> provided
>>>* for compatibility. Prefer using SysBusState::address_space_io directly.
>>>*/
>>> -AddressSpace *get_address_space_io(void);
>>> +inline AddressSpace *get_address_space_io(void)
>>> +{
>>> +assert(current_machine);
>>> +
>>> +return &current_machine->main_system_bus.address_space_io;
>>> +}
>>> #endif
>>>   diff --git a/softmmu/physmem.c b/softmmu/physmem.c
>>> index 07e9a9171c..dce088f55c 100644
>>> --- a/softmmu/physmem.c
>>> +++ b/softmmu/physmem.c
>>> @@ -2674,34 +2674,6 @@ static void memory_map_init(SysBusState *sysbus)
>>>   address_space_init(&sysbus->address_space_io, system_io, "I/O");
>>>   }
>>>   -MemoryRegion *get_system_memory(void)
>>> -{
>>> -assert(current_machine);
>>> -
>>> -return &current_machine->main_system_bus.system_memory;
>>> -}
>>> -
>>> -MemoryRegion *get_system_io(void)
>>> -{
>>> -assert(current_machine);
>>> -
>>> -return &current_machine->main_system_bus.system_io;
>>> -}
>>> -
>>> -AddressSpace *get_address_space_memory(void)
>>> -{
>>> -assert(current_machine);
>>> -
>>> -return &current_machine->main_system_bus.address_space_memory;
>>> -}
>>> -
>>> -AddressSpace *get_address_space_io(void)
>>> -{
>>> -assert(current_machine);
>>> -
>>> -return &current_machine->main_system_bus.address_space_io;
>>> -}
>>> -
>>>   static void invalidate_and_set_dirty(MemoryRegion *mr, hwaddr addr,
>>>hwaddr length)
>>>   {
>> 
>> 
>> 



[PATCH 00/14] migration: Postcopy Preempt-Full

2022-09-20 Thread Peter Xu
Based-on: <20220920223800.47467-1-pet...@redhat.com>
  [PATCH 0/5] migration: Bug fixes (prepare for preempt-full)

Tree is here:
  https://github.com/xzpeter/qemu/tree/preempt-full

RFC:
  https://lore.kernel.org/qemu-devel/20220829165659.96046-1-pet...@redhat.com

This patchset is the v1 formal version of the preempt-full series.  The RFC
tag has been removed as more testing was done, and all items previously
listed in the RFC cover letter's TODO section are implemented here.

A few patches were added.  Most of the patches are the same as the RFC
ones with some trivial touch-ups here and there, e.g. comment touch-ups on
bitmap_mutex that Dave suggested.  Looking at the diff stat, the additions
roughly balance the deletions this time, because with the rp-return thread
change we can drop a lot of the complicated preempt logic previously
maintained in the migration thread:

  3 files changed, 371 insertions(+), 399 deletions(-)

Feel free to have a look at patch "migration: Remove old preempt code
around state maintainance", where we dropped a lot of old code for preempt
state maintenance in the migration thread (the major part of the old
preempt code still needs to stay, e.g. channel management), along with the
break-huge parameter (we never need to break huge pages anymore, because
we already run in parallel).

Compared to the recently merged preempt mode, I call this one "preempt-full"
because it threadifies the postcopy channels, so urgent pages can now be
handled fully separately, outside of the ram save loop.

The existing preempt code, which has already landed, reduced random page
request latency over a 10Gbps network from ~12ms to ~500us.

This preempt-full series further reduces that ~500us to ~230us in my
initial tests.  More to share below.

Note that no new capability is needed; IOW, it's fully compatible with the
existing preempt mode.  The naming is not really important, it just
identifies the difference between the binaries.

The logic of the series is simple: send urgent pages in the rp-return
thread rather than the migration thread.  This also means the rp-return
thread takes over ownership of the newly created preempt channel.  It can
slow down the rp-return thread's handling of page requests, but so far I
see only benefits and no major issues.

For detailed performance numbers, please refer to the rfc cover letter.

Please have a look, thanks.

Peter Xu (14):
  migration: Add postcopy_preempt_active()
  migration: Cleanup xbzrle zero page cache update logic
  migration: Trivial cleanup save_page_header() on same block check
  migration: Remove RAMState.f references in compression code
  migration: Yield bitmap_mutex properly when sending/sleeping
  migration: Use atomic ops properly for page accountings
  migration: Teach PSS about host page
  migration: Introduce pss_channel
  migration: Add pss_init()
  migration: Make PageSearchStatus part of RAMState
  migration: Move last_sent_block into PageSearchStatus
  migration: Send requested page directly in rp-return thread
  migration: Remove old preempt code around state maintainance
  migration: Drop rs->f

 migration/migration.c |  47 +--
 migration/multifd.c   |   2 +-
 migration/ram.c   | 721 --
 3 files changed, 371 insertions(+), 399 deletions(-)

-- 
2.32.0




[PULL 15/17] hw/ppc: spapr: Use qemu_vfree() to free spapr->htab

2022-09-20 Thread Daniel Henrique Barboza
From: Xuzhou Cheng 

spapr->htab is allocated by qemu_memalign(), hence we should use
qemu_vfree() to free it.

Fixes: c5f54f3e31bf ("pseries: Move hash page table allocation to reset time")
Fixes: b4db54132ffe ("target/ppc: Implement H_REGISTER_PROCESS_TABLE H_CALL"")
Signed-off-by: Xuzhou Cheng 
Signed-off-by: Bin Meng 
Reviewed-by: Daniel Henrique Barboza 
Reviewed-by: Marc-André Lureau 
Message-Id: <20220920103159.1865256-28-bmeng...@gmail.com>
Signed-off-by: Daniel Henrique Barboza 
---
 hw/ppc/spapr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index fb790b61e4..cc1adc23fa 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1522,7 +1522,7 @@ int spapr_hpt_shift_for_ramsize(uint64_t ramsize)
 
 void spapr_free_hpt(SpaprMachineState *spapr)
 {
-g_free(spapr->htab);
+qemu_vfree(spapr->htab);
 spapr->htab = NULL;
 spapr->htab_shift = 0;
 close_htab_fd(spapr);
-- 
2.37.3




[PATCH 01/14] migration: Add postcopy_preempt_active()

2022-09-20 Thread Peter Xu
Add a helper to check that postcopy preempt is both enabled and active.

Reviewed-by: Dr. David Alan Gilbert 
Signed-off-by: Peter Xu 
---
 migration/ram.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 1d42414ecc..d8cf7cc901 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -162,6 +162,11 @@ out:
 return ret;
 }
 
+static bool postcopy_preempt_active(void)
+{
+return migrate_postcopy_preempt() && migration_in_postcopy();
+}
+
 bool ramblock_is_ignored(RAMBlock *block)
 {
 return !qemu_ram_is_migratable(block) ||
@@ -2434,7 +2439,7 @@ static void postcopy_preempt_choose_channel(RAMState *rs, 
PageSearchStatus *pss)
 /* We need to make sure rs->f always points to the default channel elsewhere */
 static void postcopy_preempt_reset_channel(RAMState *rs)
 {
-if (migrate_postcopy_preempt() && migration_in_postcopy()) {
+if (postcopy_preempt_active()) {
 rs->postcopy_channel = RAM_CHANNEL_PRECOPY;
 rs->f = migrate_get_current()->to_dst_file;
 trace_postcopy_preempt_reset_channel();
@@ -2472,7 +2477,7 @@ static int ram_save_host_page(RAMState *rs, 
PageSearchStatus *pss)
 return 0;
 }
 
-if (migrate_postcopy_preempt() && migration_in_postcopy()) {
+if (postcopy_preempt_active()) {
 postcopy_preempt_choose_channel(rs, pss);
 }
 
-- 
2.32.0




Re: [PATCH 2/9] exec/hwaddr.h: Add missing include

2022-09-20 Thread Bernhard Beschow
Am 20. September 2022 04:50:51 UTC schrieb "Philippe Mathieu-Daudé" 
:
>On 20/9/22 01:17, Bernhard Beschow wrote:
>> The next commit would not compile w/o the include directive.
>> 
>> Signed-off-by: Bernhard Beschow 
>> ---
>>   include/exec/hwaddr.h | 1 +
>>   1 file changed, 1 insertion(+)
>> 
>> diff --git a/include/exec/hwaddr.h b/include/exec/hwaddr.h
>> index 8f16d179a8..616255317c 100644
>> --- a/include/exec/hwaddr.h
>> +++ b/include/exec/hwaddr.h
>> @@ -3,6 +3,7 @@
>>   #ifndef HWADDR_H
>>   #define HWADDR_H
>>   +#include "qemu/osdep.h"
>
>NAck: This is an anti-pattern. "qemu/osdep.h" must not be included
>in .h, only in .c.
>
>Isn't including "hw/qdev-core.h" in "include/hw/boards.h" enough in
>the next patch?

Yes, this works just fine indeed! This patch could be dropped if in the next 
iteration, if any.

Thanks,
Bernhard




[PULL 08/17] target/ppc: Remove unused xer_* macros

2022-09-20 Thread Daniel Henrique Barboza
From: Víctor Colombo 

The macros xer_ov, xer_ca, xer_ov32, and xer_ca32 are unused and hide
the usage of env. Remove them.

Signed-off-by: Víctor Colombo 
Reviewed-by: Daniel Henrique Barboza 
Message-Id: <20220906125523.38765-3-victor.colo...@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza 
---
 target/ppc/cpu.h | 4 
 1 file changed, 4 deletions(-)

diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index 602ea77914..7f73e2ac81 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -1506,10 +1506,6 @@ void ppc_compat_add_property(Object *obj, const char 
*name,
 #define XER_CMP  8
 #define XER_BC   0
 #define xer_so  (env->so)
-#define xer_ov  (env->ov)
-#define xer_ca  (env->ca)
-#define xer_ov32  (env->ov)
-#define xer_ca32  (env->ca)
 #define xer_cmp ((env->xer >> XER_CMP) & 0xFF)
 #define xer_bc  ((env->xer >> XER_BC)  & 0x7F)
 
-- 
2.37.3




Re: [PATCH 0/9] Deprecate sysbus_get_default() and get_system_memory() et. al

2022-09-20 Thread Bernhard Beschow
Am 20. September 2022 09:55:37 UTC schrieb Peter Maydell 
:
>On Tue, 20 Sept 2022 at 00:18, Bernhard Beschow  wrote:
>>
>> In address-spaces.h it can be read that get_system_memory() and
>> get_system_io() are temporary interfaces which "should only be used 
>> temporarily
>> until a proper bus interface is available". This statement certainly extends 
>> to
>> the address_space_memory and address_space_io singletons.
>
>This is a long standing "we never really completed a cleanup"...
>
>> This series attempts
>> to stop further proliferation of their use by turning TYPE_SYSTEM_BUS into an
>> object-oriented, "proper bus interface" inspired by PCIBus.
>>
>> While at it, also the main_system_bus singleton is turned into an attribute 
>> of
>> MachineState. Together, this resolves five singletons in total, making the
>> ownership relations much more obvious which helps comprehension.
>
>...but I don't think this is the direction we want to go.
>Overall the reason that the "system memory" and "system IO"
>singletons are weird is that in theory they should not be necessary
>at all -- board code should create devices and map them into an
>entirely arbitrary MemoryRegion or set of MemoryRegions corresponding
>to address space(s) for the CPU and for DMA-capable devices.

My intention was to allow exactly that: by turning system memory and system IO 
into non-singletons, one could have many of them, thus allowing boards to 
create arbitrary mappings of memory and IO. Since QEMU currently assumes one 
set of addresses (memory and IO), for now I instantiated the SysBus once in 
the machine class to preserve behavior.

>But we
>keep them around because
> (a) there is a ton of legacy code that assumes there's only one
> address space in the system and this is it
> (b) when modelling the kind of board where there really is only
> one address space, having the 'system memory' global makes
> the APIs for creating and connecting devices a lot simpler

Indeed, the APIs may look simpler. The issue I see here, though, is that 
devices may make assumptions about these globals, which makes the code hard to 
change in the long run. If devices are given their dependencies by the 
framework, they have to make fewer assumptions, putting the framework in 
control. This makes the code more homogeneous and therefore easier to change.

>Retaining the whole-system singleton but shoving it into MachineState
>doesn't really change much, IMHO.
>
>More generally, sysbus is rather weird because it isn't really a
>bus. Every device in the system of TYPE_SYS_BUS_DEVICE is "on"
>the unique TYPE_SYSTEM_BUS bus, but that doesn't mean they're
>all in the same address space or that in real hardware they'd
>all be on the same bus.

Again, having multiple SysBuses may solve that issue.

>sysbus has essentially degraded into a
>hack for having devices get reset. I really really need to make
>some time to have another look at reset handling. If we get that
>right then I think it's probably possible to collapse the few
>things TYPE_SYS_BUS_DEVICE does that TYPE_DEVICE does not down
>into TYPE_DEVICE and get rid of sysbus altogether...

There are many SysBusDevices which directly access the globals I intended to 
deprecate. If those devices could be changed to use the SysBus equivalents 
instead, this would put the boards in control of memory mappings.

Best regards,
Bernhard

>
>thanks
>-- PMM




[PATCH 3/5] migration: Disallow xbzrle with postcopy

2022-09-20 Thread Peter Xu
It has never been supported, as ram_load_postcopy() does not handle
RAM_SAVE_FLAG_XBZRLE.  Mark it as explicitly disabled.

Signed-off-by: Peter Xu 
---
 migration/migration.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/migration/migration.c b/migration/migration.c
index bb8bbddfe4..fb4066dfb4 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1268,6 +1268,11 @@ static bool migrate_caps_check(bool *cap_list,
 error_setg(errp, "Postcopy is not compatible with ignore-shared");
 return false;
 }
+
+if (cap_list[MIGRATION_CAPABILITY_XBZRLE]) {
+error_setg(errp, "Postcopy is not compatible with xbzrle");
+return false;
+}
 }
 
 if (cap_list[MIGRATION_CAPABILITY_BACKGROUND_SNAPSHOT]) {
-- 
2.32.0




[PATCH 09/14] migration: Add pss_init()

2022-09-20 Thread Peter Xu
Helper to init PSS structures.

Signed-off-by: Peter Xu 
---
 migration/ram.c | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 40ff5dc49f..b4b36ca59e 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -535,6 +535,14 @@ static bool do_compress_ram_page(QEMUFile *f, z_stream 
*stream, RAMBlock *block,
 static void postcopy_preempt_restore(RAMState *rs, PageSearchStatus *pss,
  bool postcopy_requested);
 
+/* NOTE: page is the PFN not real ram_addr_t. */
+static void pss_init(PageSearchStatus *pss, RAMBlock *rb, ram_addr_t page)
+{
+pss->block = rb;
+pss->page = page;
+pss->complete_round = false;
+}
+
 static void *do_data_compress(void *opaque)
 {
 CompressParam *param = opaque;
@@ -2640,9 +2648,7 @@ static int ram_find_and_save_block(RAMState *rs)
 rs->last_page = 0;
 }
 
-pss.block = rs->last_seen_block;
-pss.page = rs->last_page;
-pss.complete_round = false;
+pss_init(&pss, rs->last_seen_block, rs->last_page);
 
 do {
 again = true;
-- 
2.32.0




[PATCH 1/5] migration: Fix possible deadloop of ram save process

2022-09-20 Thread Peter Xu
When starting the ram saving procedure (especially at the completion
phase), always set last_seen_block to non-NULL to make sure we can always
correctly detect the case where "we've migrated all the dirty pages".

Then we'll guarantee that both last_seen_block and pss.block are always
valid before the loop starts.

See the comment in the code for some details.

Signed-off-by: Peter Xu 
---
 migration/ram.c | 16 
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index dc1de9ddbc..1d42414ecc 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2546,14 +2546,22 @@ static int ram_find_and_save_block(RAMState *rs)
 return pages;
 }
 
+/*
+ * Always keep last_seen_block/last_page valid during this procedure,
+ * because find_dirty_block() relies on these values (e.g., we compare
+ * last_seen_block with pss.block to see whether we searched all the
+ * ramblocks) to detect the completion of migration.  A NULL
+ * last_seen_block can cause the loop below to run forever.
+ */
+if (!rs->last_seen_block) {
+rs->last_seen_block = QLIST_FIRST_RCU(&ram_list.blocks);
+rs->last_page = 0;
+}
+
 pss.block = rs->last_seen_block;
 pss.page = rs->last_page;
 pss.complete_round = false;
 
-if (!pss.block) {
-pss.block = QLIST_FIRST_RCU(&ram_list.blocks);
-}
-
 do {
 again = true;
found = get_queued_page(rs, &pss);
-- 
2.32.0




[PULL 11/17] target/ppc: Zero second doubleword for VSX madd instructions

2022-09-20 Thread Daniel Henrique Barboza
From: Víctor Colombo 

In 205eb5a89e we updated most VSX instructions to zero the second
doubleword, as required by the Power ISA since v3.1.  However, the
VSX_MADD helper was left unchanged, even though it is also affected
and should be fixed as well.
This patch applies the fix for MADD instructions.

Fixes: 205eb5a89e ("target/ppc: Change VSX instructions behavior to fill with 
zeros")
Signed-off-by: Víctor Colombo 
Reviewed-by: Daniel Henrique Barboza 
Message-Id: <20220906125523.38765-6-victor.colo...@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza 
---
 target/ppc/fpu_helper.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index 32995179b5..f07330ffc1 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -2167,7 +2167,7 @@ VSX_TSQRT(xvtsqrtsp, 4, float32, VsrW(i), -126, 23)
 void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, \
  ppc_vsr_t *s1, ppc_vsr_t *s2, ppc_vsr_t *s3) \
 { \
-ppc_vsr_t t = *xt;\
+ppc_vsr_t t = { };\
 int i;\
   \
 helper_reset_fpstatus(env);   \
-- 
2.37.3




[PATCH 2/5] migration: Fix race on qemu_file_shutdown()

2022-09-20 Thread Peter Xu
In qemu_file_shutdown() there's a possible race with the current order of
operations.  There are two major things to do:

  (1) Do real shutdown() (e.g. shutdown() syscall on socket)
  (2) Update qemufile's last_error

We must do (2) before (1) otherwise there can be a race condition like:

  page receiver other thread
  - 
  qemu_get_buffer()
do shutdown()
returns 0 (buffer all zero)
(meanwhile we didn't check this retcode)
  try to detect IO error
last_error==NULL, IO okay
  install ALL-ZERO page
set last_error
  --> guest crash!

To fix this, we could also check the retval of qemu_get_buffer(), but not
all APIs can be properly checked and ultimately we still need to go back
to qemu_file_get_error().  E.g., qemu_get_byte() doesn't return an error.

Maybe some day a rework of the qemufile API is really needed, but for now
keep using qemu_file_get_error() and fix it by not allowing that race
condition to happen.  shutdown() is special here because its last_error is
emulated.  For real -EIO errors, last_error is always set when e.g. a
sendmsg() error triggers, so we won't miss those; only shutdown() is a bit
tricky.

Cc: Daniel P. Berrange 
Signed-off-by: Peter Xu 
---
 migration/qemu-file.c | 27 ---
 1 file changed, 24 insertions(+), 3 deletions(-)

diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index 4f400c2e52..2d5f74ffc2 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -79,6 +79,30 @@ int qemu_file_shutdown(QEMUFile *f)
 int ret = 0;
 
 f->shutdown = true;
+
+/*
+ * We must set qemufile error before the real shutdown(), otherwise
+ * there can be a race window where we thought IO all went though
+ * (because last_error==NULL) but actually IO has already stopped.
+ *
+ * If without correct ordering, the race can happen like this:
+ *
+ *  page receiver other thread
+ *  - 
+ *  qemu_get_buffer()
+ *do shutdown()
+ *returns 0 (buffer all zero)
+ *(we didn't check this retcode)
+ *  try to detect IO error
+ *last_error==NULL, IO okay
+ *  install ALL-ZERO page
+ *set last_error
+ *  --> guest crash!
+ */
+if (!f->last_error) {
+qemu_file_set_error(f, -EIO);
+}
+
 if (!qio_channel_has_feature(f->ioc,
  QIO_CHANNEL_FEATURE_SHUTDOWN)) {
 return -ENOSYS;
@@ -88,9 +112,6 @@ int qemu_file_shutdown(QEMUFile *f)
 ret = -EIO;
 }
 
-if (!f->last_error) {
-qemu_file_set_error(f, -EIO);
-}
 return ret;
 }
 
-- 
2.32.0




[PULL 03/17] target/ppc: Implement hashstp and hashchkp

2022-09-20 Thread Daniel Henrique Barboza
From: Víctor Colombo 

Implementation for instructions hashstp and hashchkp, the privileged
versions of hashst and hashchk, which were added in Power ISA 3.1B.

Signed-off-by: Víctor Colombo 
Reviewed-by: Lucas Mateus Castro 
Message-Id: <20220715205439.161110-4-victor.colo...@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza 
---
 target/ppc/excp_helper.c   | 2 ++
 target/ppc/helper.h| 2 ++
 target/ppc/insn32.decode   | 2 ++
 target/ppc/translate/fixedpoint-impl.c.inc | 2 ++
 4 files changed, 8 insertions(+)

diff --git a/target/ppc/excp_helper.c b/target/ppc/excp_helper.c
index 7a16991f3d..214acf5ac4 100644
--- a/target/ppc/excp_helper.c
+++ b/target/ppc/excp_helper.c
@@ -2253,6 +2253,8 @@ void helper_##op(CPUPPCState *env, target_ulong ea, 
target_ulong ra,  \
 
 HELPER_HASH(HASHST, env->spr[SPR_HASHKEYR], true)
 HELPER_HASH(HASHCHK, env->spr[SPR_HASHKEYR], false)
+HELPER_HASH(HASHSTP, env->spr[SPR_HASHPKEYR], true)
+HELPER_HASH(HASHCHKP, env->spr[SPR_HASHPKEYR], false)
 
 #if !defined(CONFIG_USER_ONLY)
 
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 5817af632b..122b2e9359 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -6,6 +6,8 @@ DEF_HELPER_FLAGS_4(td, TCG_CALL_NO_WG, void, env, tl, tl, i32)
 #endif
 DEF_HELPER_4(HASHST, void, env, tl, tl, tl)
 DEF_HELPER_4(HASHCHK, void, env, tl, tl, tl)
+DEF_HELPER_4(HASHSTP, void, env, tl, tl, tl)
+DEF_HELPER_4(HASHCHKP, void, env, tl, tl, tl)
 #if !defined(CONFIG_USER_ONLY)
 DEF_HELPER_2(store_msr, void, env, tl)
 DEF_HELPER_1(rfi, void, env)
diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 544514565c..da08960fca 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -330,6 +330,8 @@ PEXTD   01 . . . 001000 -   @X
 
 HASHST  011111 ..... ..... ..... 1011010010 .   @X_DW
 HASHCHK 011111 ..... ..... ..... 1011110010 .   @X_DW
+HASHSTP 011111 ..... ..... ..... 1010010010 .   @X_DW
+HASHCHKP    011111 ..... ..... ..... 1010110010 .   @X_DW
 
 ## BCD Assist
 
diff --git a/target/ppc/translate/fixedpoint-impl.c.inc 
b/target/ppc/translate/fixedpoint-impl.c.inc
index 41c06de8a2..1ba56cbed5 100644
--- a/target/ppc/translate/fixedpoint-impl.c.inc
+++ b/target/ppc/translate/fixedpoint-impl.c.inc
@@ -572,3 +572,5 @@ static bool do_hash(DisasContext *ctx, arg_X *a, bool priv,
 
 TRANS(HASHST, do_hash, false, gen_helper_HASHST)
 TRANS(HASHCHK, do_hash, false, gen_helper_HASHCHK)
+TRANS(HASHSTP, do_hash, true, gen_helper_HASHSTP)
+TRANS(HASHCHKP, do_hash, true, gen_helper_HASHCHKP)
-- 
2.37.3




[PATCH 14/14] migration: Drop rs->f

2022-09-20 Thread Peter Xu
Now with rs->pss we can already cache channels in pss->pss_channel.  That
pss_channel contains more information than rs->f because it's per-channel,
so rs->f can be replaced by rs->pss[RAM_CHANNEL_PRECOPY].pss_channel,
while rs->f itself is a bit vague now.

Note that vanilla postcopy still sends pages via pss[RAM_CHANNEL_PRECOPY];
that's slightly confusing, but it reflects reality.

Then, after the replacement we can safely drop rs->f.

Signed-off-by: Peter Xu 
---
 migration/ram.c | 12 
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index f42efe02fc..03bf2324ab 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -345,8 +345,6 @@ struct RAMSrcPageRequest {
 
 /* State of RAM for migration */
 struct RAMState {
-/* QEMUFile used for this migration */
-QEMUFile *f;
 /*
  * PageSearchStatus structures for the channels when send pages.
  * Protected by the bitmap_mutex.
@@ -2555,8 +2553,6 @@ static int ram_find_and_save_block(RAMState *rs)
 }
 
 if (found) {
-/* Cache rs->f in pss_channel (TODO: remove rs->f) */
-pss->pss_channel = rs->f;
 pages = ram_save_host_page(rs, pss);
 }
 } while (!pages && again);
@@ -3112,7 +3108,7 @@ static void ram_state_resume_prepare(RAMState *rs, 
QEMUFile *out)
 ram_state_reset(rs);
 
 /* Update RAMState cache of output QEMUFile */
-rs->f = out;
+rs->pss[RAM_CHANNEL_PRECOPY].pss_channel = out;
 
 trace_ram_state_resume_prepare(pages);
 }
@@ -3203,7 +3199,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
 return -1;
 }
 }
-(*rsp)->f = f;
+(*rsp)->pss[RAM_CHANNEL_PRECOPY].pss_channel = f;
 
 WITH_RCU_READ_LOCK_GUARD() {
 qemu_put_be64(f, ram_bytes_total_common(true) | 
RAM_SAVE_FLAG_MEM_SIZE);
@@ -3338,7 +3334,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
 out:
 if (ret >= 0
 && migration_is_setup_or_active(migrate_get_current()->state)) {
-ret = multifd_send_sync_main(rs->f);
+ret = multifd_send_sync_main(rs->pss[RAM_CHANNEL_PRECOPY].pss_channel);
 if (ret < 0) {
 return ret;
 }
@@ -3406,7 +3402,7 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
 return ret;
 }
 
-ret = multifd_send_sync_main(rs->f);
+ret = multifd_send_sync_main(rs->pss[RAM_CHANNEL_PRECOPY].pss_channel);
 if (ret < 0) {
 return ret;
 }
-- 
2.32.0




Re: [PATCH] qboot: update to latest submodule

2022-09-20 Thread Paolo Bonzini
It should have been automatic, there's mirroring set up.

Paolo

On Tue 20 Sep 2022, 23:00 Jason A. Donenfeld  wrote:

> On Mon, Sep 19, 2022 at 04:35:54PM +0200, Jason A. Donenfeld wrote:
> > FYI, that commit made it to:
> >
> > https://github.com/bonzini/qboot
> >
> > But wasn't pushed to:
> >
> > https://github.com/qemu/qboot
> > https://gitlab.com/qemu-project/qboot
> > https://git.qemu.org/?p=qboot.git;a=summary
> >
> > I have no idea what's canonical, except that the submodule in the qemu
> > checkout seems to point to the gitlab instance.
> >
>
> With my prior email being ignored, this played out exactly as I
> predicted it would:
>
> Fetching submodule roms/qboot
> fatal: remote error: upload-pack: not our ref
> 8ca302e86d685fa05b16e2b20243da319941
> Errors during submodule fetch:
> roms/qboot
>
> Can somebody push https://github.com/bonzini/qboot to
> https://gitlab.com/qemu-project/qboot please? It will only take a
> second.
>
> Thanks,
> Jason
>
>


[PULL 02/17] target/ppc: Implement hashst and hashchk

2022-09-20 Thread Daniel Henrique Barboza
From: Víctor Colombo 

Implementation for instructions hashst and hashchk, which were added
in Power ISA 3.1B.

It was decided to implement the hash algorithm from the ground up in this
patch, exactly as described in the Power ISA.

Signed-off-by: Víctor Colombo 
Reviewed-by: Lucas Mateus Castro 
Message-Id: <20220715205439.161110-3-victor.colo...@eldorado.org.br>
[danielhb: fix block comment in excp_helper.c]
Signed-off-by: Daniel Henrique Barboza 
---
 target/ppc/excp_helper.c   | 81 ++
 target/ppc/helper.h|  2 +
 target/ppc/insn32.decode   |  8 +++
 target/ppc/translate.c |  5 ++
 target/ppc/translate/fixedpoint-impl.c.inc | 32 +
 5 files changed, 128 insertions(+)

diff --git a/target/ppc/excp_helper.c b/target/ppc/excp_helper.c
index 7550aafed6..7a16991f3d 100644
--- a/target/ppc/excp_helper.c
+++ b/target/ppc/excp_helper.c
@@ -2173,6 +2173,87 @@ void helper_td(CPUPPCState *env, target_ulong arg1, 
target_ulong arg2,
 #endif
 #endif
 
+static uint32_t helper_SIMON_LIKE_32_64(uint32_t x, uint64_t key, uint32_t 
lane)
+{
+const uint16_t c = 0xfffc;
+const uint64_t z0 = 0xfa2561cdf44ac398ULL;
+uint16_t z = 0, temp;
+uint16_t k[32], eff_k[32], xleft[33], xright[33], fxleft[32];
+
+for (int i = 3; i >= 0; i--) {
+k[i] = key & 0xffff;
+key >>= 16;
+}
+xleft[0] = x & 0xffff;
+xright[0] = (x >> 16) & 0xffff;
+
+for (int i = 0; i < 28; i++) {
+z = (z0 >> (63 - i)) & 1;
+temp = ror16(k[i + 3], 3) ^ k[i + 1];
+k[i + 4] = c ^ z ^ k[i] ^ temp ^ ror16(temp, 1);
+}
+
+for (int i = 0; i < 8; i++) {
+eff_k[4 * i + 0] = k[4 * i + ((0 + lane) % 4)];
+eff_k[4 * i + 1] = k[4 * i + ((1 + lane) % 4)];
+eff_k[4 * i + 2] = k[4 * i + ((2 + lane) % 4)];
+eff_k[4 * i + 3] = k[4 * i + ((3 + lane) % 4)];
+}
+
+for (int i = 0; i < 32; i++) {
+fxleft[i] = (rol16(xleft[i], 1) &
+rol16(xleft[i], 8)) ^ rol16(xleft[i], 2);
+xleft[i + 1] = xright[i] ^ fxleft[i] ^ eff_k[i];
+xright[i + 1] = xleft[i];
+}
+
+return (((uint32_t)xright[32]) << 16) | xleft[32];
+}
+
+static uint64_t hash_digest(uint64_t ra, uint64_t rb, uint64_t key)
+{
+uint64_t stage0_h = 0ULL, stage0_l = 0ULL;
+uint64_t stage1_h, stage1_l;
+
+for (int i = 0; i < 4; i++) {
+stage0_h |= ror64(rb & 0xff, 8 * (2 * i + 1));
+stage0_h |= ((ra >> 32) & 0xff) << (8 * 2 * i);
+stage0_l |= ror64((rb >> 32) & 0xff, 8 * (2 * i + 1));
+stage0_l |= (ra & 0xff) << (8 * 2 * i);
+rb >>= 8;
+ra >>= 8;
+}
+
+stage1_h = (uint64_t)helper_SIMON_LIKE_32_64(stage0_h >> 32, key, 0) << 32;
+stage1_h |= helper_SIMON_LIKE_32_64(stage0_h, key, 1);
+stage1_l = (uint64_t)helper_SIMON_LIKE_32_64(stage0_l >> 32, key, 2) << 32;
+stage1_l |= helper_SIMON_LIKE_32_64(stage0_l, key, 3);
+
+return stage1_h ^ stage1_l;
+}
+
+#include "qemu/guest-random.h"
+
+#define HELPER_HASH(op, key, store)   \
+void helper_##op(CPUPPCState *env, target_ulong ea, target_ulong ra,  \
+ target_ulong rb) \
+{ \
+uint64_t calculated_hash = hash_digest(ra, rb, key), loaded_hash; \
+  \
+if (store) {  \
+cpu_stq_data_ra(env, ea, calculated_hash, GETPC());   \
+} else {  \
+loaded_hash = cpu_ldq_data_ra(env, ea, GETPC());  \
+if (loaded_hash != calculated_hash) { \
+raise_exception_err_ra(env, POWERPC_EXCP_PROGRAM, \
+POWERPC_EXCP_TRAP, GETPC());  \
+} \
+} \
+}
+
+HELPER_HASH(HASHST, env->spr[SPR_HASHKEYR], true)
+HELPER_HASH(HASHCHK, env->spr[SPR_HASHKEYR], false)
+
 #if !defined(CONFIG_USER_ONLY)
 
 #ifdef CONFIG_TCG
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 159b352f6e..5817af632b 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -4,6 +4,8 @@ DEF_HELPER_FLAGS_4(tw, TCG_CALL_NO_WG, void, env, tl, tl, i32)
 #if defined(TARGET_PPC64)
 DEF_HELPER_FLAGS_4(td, TCG_CALL_NO_WG, void, env, tl, tl, i32)
 #endif
+DEF_HELPER_4(HASHST, void, env, tl, tl, tl)
+DEF_HELPER_4(HASHCHK, void, env, tl, tl, tl)
 #if !defined(CONFIG_USER_ONLY)
 DEF_HELPER_2(store_msr, void, env, tl)
 DEF_HELPER_1(rfi, void, env)
diff --git a/target/ppc/insn32.decode 

[PATCH 08/14] migration: Introduce pss_channel

2022-09-20 Thread Peter Xu
Introduce pss_channel for PageSearchStatus, define it as "the migration
channel to be used to transfer this host page".

We used to have rs->f, which is a mirror to MigrationState.to_dst_file.

After the initial postcopy preempt version, rs->f can be dynamically
changed depending on which channel we want to use.

But that later work still doesn't grant full concurrency of sending pages
in e.g. different threads, because rs->f can either be the PRECOPY channel
or POSTCOPY channel.  This needs to be per-thread too.

PageSearchStatus is actually a good structure that we can leverage if we
want to have multiple threads sending pages.  Sending a single guest
page may not make sense, so we make the granule the "host page", and in
the PSS structure we allow specifying a QEMUFile* used to migrate a
specific host page.  That opens the possibility of specifying different
channels in different threads with different PSS structures.

The PSS prefix can be slightly misleading here, because e.g. for the
upcoming usage of the postcopy channel/thread it's not "searching" (or
scanning) at all but sending the explicit page that was requested.
However, since PSS has existed for some years, keep it as-is until
someone complains.

This patch mostly (simply) replaces rs->f with pss->pss_channel.  No
functional change is intended yet, but it does prepare us to finally
drop rs->f and make ram_save_guest_page() thread safe.

Signed-off-by: Peter Xu 
---
 migration/ram.c | 70 +++--
 1 file changed, 38 insertions(+), 32 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 3f720b6de2..40ff5dc49f 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -446,6 +446,8 @@ void dirty_sync_missed_zero_copy(void)
 
 /* used by the search for pages to send */
 struct PageSearchStatus {
+/* The migration channel used for a specific host page */
+QEMUFile*pss_channel;
 /* Current block being searched */
 RAMBlock*block;
 /* Current page to search from */
@@ -768,9 +770,9 @@ static void xbzrle_cache_zero_page(RAMState *rs, ram_addr_t 
current_addr)
  * @block: block that contains the page we want to send
  * @offset: offset inside the block for the page
  */
-static int save_xbzrle_page(RAMState *rs, uint8_t **current_data,
-ram_addr_t current_addr, RAMBlock *block,
-ram_addr_t offset)
+static int save_xbzrle_page(RAMState *rs, QEMUFile *file,
+uint8_t **current_data, ram_addr_t current_addr,
+RAMBlock *block, ram_addr_t offset)
 {
 int encoded_len = 0, bytes_xbzrle;
 uint8_t *prev_cached_page;
@@ -838,11 +840,11 @@ static int save_xbzrle_page(RAMState *rs, uint8_t 
**current_data,
 }
 
 /* Send XBZRLE based compressed page */
-bytes_xbzrle = save_page_header(rs, rs->f, block,
+bytes_xbzrle = save_page_header(rs, file, block,
 offset | RAM_SAVE_FLAG_XBZRLE);
-qemu_put_byte(rs->f, ENCODING_FLAG_XBZRLE);
-qemu_put_be16(rs->f, encoded_len);
-qemu_put_buffer(rs->f, XBZRLE.encoded_buf, encoded_len);
+qemu_put_byte(file, ENCODING_FLAG_XBZRLE);
+qemu_put_be16(file, encoded_len);
+qemu_put_buffer(file, XBZRLE.encoded_buf, encoded_len);
 bytes_xbzrle += encoded_len + 1 + 2;
 /*
  * Like compressed_size (please see update_compress_thread_counts),
@@ -1298,9 +1300,10 @@ static int save_zero_page_to_file(RAMState *rs, QEMUFile 
*file,
  * @block: block that contains the page we want to send
  * @offset: offset inside the block for the page
  */
-static int save_zero_page(RAMState *rs, RAMBlock *block, ram_addr_t offset)
+static int save_zero_page(RAMState *rs, QEMUFile *file, RAMBlock *block,
+  ram_addr_t offset)
 {
-int len = save_zero_page_to_file(rs, rs->f, block, offset);
+int len = save_zero_page_to_file(rs, file, block, offset);
 
 if (len) {
qatomic_inc(&ram_counters.duplicate);
@@ -1317,15 +1320,15 @@ static int save_zero_page(RAMState *rs, RAMBlock 
*block, ram_addr_t offset)
  *
  * Return true if the pages has been saved, otherwise false is returned.
  */
-static bool control_save_page(RAMState *rs, RAMBlock *block, ram_addr_t offset,
-  int *pages)
+static bool control_save_page(PageSearchStatus *pss, RAMBlock *block,
+  ram_addr_t offset, int *pages)
 {
 uint64_t bytes_xmit = 0;
 int ret;
 
 *pages = -1;
-ret = ram_control_save_page(rs->f, block->offset, offset, TARGET_PAGE_SIZE,
-&bytes_xmit);
+ret = ram_control_save_page(pss->pss_channel, block->offset, offset,
+TARGET_PAGE_SIZE, &bytes_xmit);
 if (ret == RAM_SAVE_CONTROL_NOT_SUPP) {
 return false;
 }
@@ -1359,17 +1362,17 @@ static bool control_save_page(RAMState *rs, RAMBlock 
*block, ram_addr_t offset,
  * @buf: the 

RE: [PATCH v1 0/3] ui/gtk: Add a new parameter to assign connectors/monitors to Guests' windows

2022-09-20 Thread Kasireddy, Vivek
Hi Markus,

> Any overlap with Dongwon Kim's "[PATCH v5 0/2] handling guest multiple
> displays"?
[Kasireddy, Vivek] Yes, there is some overlap but as I mentioned in the cover 
letter,
this series is intended to replace Dongwon's series dealing with multiple 
displays.

> 
> Message-Id: <20220718233009.18780-1-dongwon@intel.com>
> https://lists.nongnu.org/archive/html/qemu-devel/2022-07/msg03212.html
[Kasireddy, Vivek] We felt that using monitor numbers for display/VC assignment
would be cumbersome for users. And, given that his series does not take into 
account
monitor unplug/hotplug events, its effectiveness would be limited compared to
this one.

Thanks,
Vivek




[PATCH 03/14] migration: Trivial cleanup save_page_header() on same block check

2022-09-20 Thread Peter Xu
The 2nd check on RAM_SAVE_FLAG_CONTINUE is a bit redundant.  Use a boolean
to be clearer.

Signed-off-by: Peter Xu 
---
 migration/ram.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index fc59c052cf..62ff2c1469 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -661,14 +661,15 @@ static size_t save_page_header(RAMState *rs, QEMUFile *f, 
 RAMBlock *block,
ram_addr_t offset)
 {
 size_t size, len;
+bool same_block = (block == rs->last_sent_block);
 
-if (block == rs->last_sent_block) {
+if (same_block) {
 offset |= RAM_SAVE_FLAG_CONTINUE;
 }
 qemu_put_be64(f, offset);
 size = 8;
 
-if (!(offset & RAM_SAVE_FLAG_CONTINUE)) {
+if (!same_block) {
 len = strlen(block->idstr);
 qemu_put_byte(f, len);
 qemu_put_buffer(f, (uint8_t *)block->idstr, len);
-- 
2.32.0




Re: [Phishing Risk] [External] Re: [PATCH 0/3] Add a host power device

2022-09-20 Thread Philippe Mathieu-Daudé via

On 20/9/22 17:17, Zhang Jian wrote:

Hi Philippe,

Thanks for your reply.

On Tue, Sep 20, 2022 at 7:09 AM Philippe Mathieu-Daudé  wrote:


Hi Jian,

On 19/9/22 19:21, Jian Zhang wrote:

This patchset adds a host power device and adds it to the g220a
machine. An important job of the BMC is to control the power of the
host, which is usually necessary on a hardware platform.

The BMC (SoC) usually has an output pin to control the power of the
host, and an input pin to get the power status of the host.

The host power device is a generic device that simulates the host
power: it accepts power control commands from the BMC and reports the
power status.

Tested on the g220a machine; the host power control commands work.

Jian Zhang (3):
hw/gpio/aspeed_gpio: Add gpios in/out init
hw/misc/host_power: Add a simple host power device
hw/arm/aspeed: g220a: Add host-power device


"power-good" is just a TYPE_LED object, but it doesn't seem you are
really interested in using it.


Yeah, I'd like to just send an IRQ when the `switch` status changes.


You can do that by feeding the switch latch output to a 2-line
TYPE_SPLIT_IRQ object, then wiring one line to the SoC input and the
other to the TYPE_LED input.



[PATCH 02/14] migration: Cleanup xbzrle zero page cache update logic

2022-09-20 Thread Peter Xu
The major change is to replace "!save_page_use_compression()" with
"xbzrle_enabled" to make it clear.

Reasonings:

(1) When compression enabled, "!save_page_use_compression()" is exactly the
same as checking "xbzrle_enabled".

(2) When compression disabled, "!save_page_use_compression()" always return
true.  We used to try calling the xbzrle code, but after this change we
won't, and we shouldn't need to.

While at it, drop the xbzrle_enabled check in xbzrle_cache_zero_page(),
because with this change it's not needed anymore.

Signed-off-by: Peter Xu 
---
 migration/ram.c | 6 +-
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index d8cf7cc901..fc59c052cf 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -741,10 +741,6 @@ void mig_throttle_counter_reset(void)
  */
 static void xbzrle_cache_zero_page(RAMState *rs, ram_addr_t current_addr)
 {
-if (!rs->xbzrle_enabled) {
-return;
-}
-
 /* We don't care if this fails to allocate a new cache page
  * as long as it updated an old one */
 cache_insert(XBZRLE.cache, current_addr, XBZRLE.zero_target_page,
@@ -2301,7 +2297,7 @@ static int ram_save_target_page(RAMState *rs, 
PageSearchStatus *pss)
 /* Must let xbzrle know, otherwise a previous (now 0'd) cached
  * page would be stale
  */
-if (!save_page_use_compression(rs)) {
+if (rs->xbzrle_enabled) {
 XBZRLE_cache_lock();
 xbzrle_cache_zero_page(rs, block->offset + offset);
 XBZRLE_cache_unlock();
-- 
2.32.0




[PULL 17/17] hw/ppc/spapr: Fix code style problems reported by checkpatch

2022-09-20 Thread Daniel Henrique Barboza
From: Bernhard Beschow 

Reviewed-by: Daniel Henrique Barboza 
Signed-off-by: Bernhard Beschow 
Message-Id: <20220919231720.163121-5-shen...@gmail.com>
Signed-off-by: Daniel Henrique Barboza 
---
 include/hw/ppc/spapr.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 530d739b1d..04a95669ab 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -848,7 +848,8 @@ static inline uint64_t ppc64_phys_to_real(uint64_t addr)
 
 static inline uint32_t rtas_ld(target_ulong phys, int n)
 {
-return ldl_be_phys(&address_space_memory, ppc64_phys_to_real(phys + 4*n));
+return ldl_be_phys(&address_space_memory,
+   ppc64_phys_to_real(phys + 4 * n));
 }
 
 static inline uint64_t rtas_ldq(target_ulong phys, int n)
@@ -858,7 +859,7 @@ static inline uint64_t rtas_ldq(target_ulong phys, int n)
 
 static inline void rtas_st(target_ulong phys, int n, uint32_t val)
 {
-stl_be_phys(&address_space_memory, ppc64_phys_to_real(phys + 4*n), val);
+stl_be_phys(&address_space_memory, ppc64_phys_to_real(phys + 4 * n), val);
 }
 
 typedef void (*spapr_rtas_fn)(PowerPCCPU *cpu, SpaprMachineState *sm,
-- 
2.37.3




[PATCH 0/5] migration: Bug fixes (prepare for preempt-full)

2022-09-20 Thread Peter Xu
This patchset does bug fixes that I found when testing preempt-full.

Patch 1 should fix a possible deadloop that I hit when testing the
preempt-full code.  I didn't verify it because it's so hard to trigger,
but the logic is explained in the patch.

Patch 2 fixes a race condition I can easily trigger with the latest
preempt-full code when running with recovery+tls test.  The bug hides quite
deep and took time to debug.  Fundamentally it's about qemufile API, I hope
someday we can have something better than that but still so far there's no
strong enough reason to rework the whole thing.

Patch 3-4 are two patches to disable either postcopy or preempt mode only
for xbzrle/compression.

Patch 5 is something nice to have to optimize the bitmap ops.

The last two patches are actually part of my preempt-full RFC series.

I picked these patches out explicitly from preempt-full series, because at
least patches 1-4 fix real bugs in current code base, so they should get
more focus.

Thanks,

Peter Xu (5):
  migration: Fix possible deadloop of ram save process
  migration: Fix race on qemu_file_shutdown()
  migration: Disallow xbzrle with postcopy
  migration: Disallow postcopy preempt to be used with compress
  migration: Use non-atomic ops for clear log bitmap

 include/exec/ram_addr.h | 11 +-
 include/exec/ramblock.h |  3 +++
 include/qemu/bitmap.h   |  1 +
 migration/migration.c   | 16 +++
 migration/qemu-file.c   | 27 ++---
 migration/ram.c | 16 +++
 util/bitmap.c   | 45 +
 7 files changed, 107 insertions(+), 12 deletions(-)

-- 
2.32.0




Re: [PULL v2 0/9] loongarch-to-apply queue

2022-09-20 Thread Stefan Hajnoczi
Please remember to push your GPG key to the keyservers using
gpg --send-keys YOUR_KEY_ID.

Thanks,
Stefan



[PATCH 4/5] migration: Disallow postcopy preempt to be used with compress

2022-09-20 Thread Peter Xu
The preempt mode requires the capability to assign a channel to each of
the pages, while the compression logic currently assigns pages to
different compress threads/local channels, so potentially the two are
incompatible.
Signed-off-by: Peter Xu 
---
 migration/migration.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/migration/migration.c b/migration/migration.c
index fb4066dfb4..07c74a79a2 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1341,6 +1341,17 @@ static bool migrate_caps_check(bool *cap_list,
 error_setg(errp, "Postcopy preempt requires postcopy-ram");
 return false;
 }
+
+/*
+ * Preempt mode requires urgent pages to be sent in separate
+ * channel, OTOH compression logic will disorder all pages into
+ * different compression channels, which is not compatible with the
+ * preempt assumptions on channel assignments.
+ */
+if (cap_list[MIGRATION_CAPABILITY_COMPRESS]) {
+error_setg(errp, "Postcopy preempt not compatible with compress");
+return false;
+}
 }
 
 return true;
-- 
2.32.0




[PULL 10/17] target/ppc: Set result to QNaN for DENBCD when VXCVI occurs

2022-09-20 Thread Daniel Henrique Barboza
From: Víctor Colombo 

According to the ISA, for instruction DENBCD:
"If an invalid BCD digit or sign code is detected in the source
operand, an invalid-operation exception (VXCVI) occurs."

In the Invalid Operation Exception section, there is the situation:
"When Invalid Operation Exception is disabled (VE=0) and Invalid
Operation occurs (...) If the operation is an (...) or format the
target FPR is set to a Quiet NaN". This was not being done in
QEMU.

This patch sets the result to QNaN when the instruction DENBCD causes
an Invalid Operation Exception.

Signed-off-by: Víctor Colombo 
Reviewed-by: Daniel Henrique Barboza 
Message-Id: <20220906125523.38765-5-victor.colo...@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza 
---
 target/ppc/dfp_helper.c | 26 --
 1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/target/ppc/dfp_helper.c b/target/ppc/dfp_helper.c
index be7aa5357a..cc024316d5 100644
--- a/target/ppc/dfp_helper.c
+++ b/target/ppc/dfp_helper.c
@@ -1147,6 +1147,26 @@ static inline uint8_t dfp_get_bcd_digit_128(ppc_vsr_t 
*t, unsigned n)
 return t->VsrD((n & 0x10) ? 0 : 1) >> ((n << 2) & 63) & 15;
 }
 
+static inline void dfp_invalid_op_vxcvi_64(struct PPC_DFP *dfp)
+{
+/* TODO: fpscr is incorrectly not being saved to env */
+dfp_set_FPSCR_flag(dfp, FP_VX | FP_VXCVI, FPSCR_VE);
+if ((dfp->env->fpscr & FP_VE) == 0) {
+dfp->vt.VsrD(1) = 0x7c00000000000000; /* QNaN */
+}
+}
+
+
+static inline void dfp_invalid_op_vxcvi_128(struct PPC_DFP *dfp)
+{
+/* TODO: fpscr is incorrectly not being saved to env */
+dfp_set_FPSCR_flag(dfp, FP_VX | FP_VXCVI, FPSCR_VE);
+if ((dfp->env->fpscr & FP_VE) == 0) {
+dfp->vt.VsrD(0) = 0x7c00000000000000; /* QNaN */
+dfp->vt.VsrD(1) = 0x0;
+}
+}
+
 #define DFP_HELPER_ENBCD(op, size)   \
 void helper_##op(CPUPPCState *env, ppc_fprp_t *t, ppc_fprp_t *b, \
  uint32_t s) \
@@ -1173,7 +1193,8 @@ void helper_##op(CPUPPCState *env, ppc_fprp_t *t, 
ppc_fprp_t *b, \
 sgn = 0; \
 break;   \
 default: \
-dfp_set_FPSCR_flag(&dfp, FP_VX | FP_VXCVI, FPSCR_VE);\
+dfp_invalid_op_vxcvi_##size(&dfp);   \
+set_dfp##size(t, &dfp.vt);   \
 return;  \
 }\
 }\
@@ -1183,7 +1204,8 @@ void helper_##op(CPUPPCState *env, ppc_fprp_t *t, 
ppc_fprp_t *b, \
digits[(size) / 4 - n] = dfp_get_bcd_digit_##size(&dfp.vb,   \
   offset++); \
 if (digits[(size) / 4 - n] > 10) {   \
-dfp_set_FPSCR_flag(&dfp, FP_VX | FP_VXCVI, FPSCR_VE);\
+dfp_invalid_op_vxcvi_##size(&dfp);   \
+set_dfp##size(t, &dfp.vt);   \
 return;  \
 } else { \
 nonzero |= (digits[(size) / 4 - n] > 0); \
-- 
2.37.3




[PULL 13/17] target/ppc: Zero second doubleword of VSR registers for FPR insns

2022-09-20 Thread Daniel Henrique Barboza
From: Víctor Colombo 

FPR register are mapped to the first doubleword of the VSR registers.
Since PowerISA v3.1, the second doubleword of the target register
must be zeroed for FP instructions.

This patch does it by writing 0 to the second dw every time the
first dw is written using set_fpr.

Signed-off-by: Víctor Colombo 
Reviewed-by: Daniel Henrique Barboza 
Message-Id: <20220906125523.38765-8-victor.colo...@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza 
---
 target/ppc/translate.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index 29939bd923..e810842925 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -6443,6 +6443,14 @@ static inline void get_fpr(TCGv_i64 dst, int regno)
 static inline void set_fpr(int regno, TCGv_i64 src)
 {
 tcg_gen_st_i64(src, cpu_env, fpr_offset(regno));
+/*
+ * Before PowerISA v3.1 the result of doubleword 1 of the VSR
+ * corresponding to the target FPR was undefined. However,
+ * most (if not all) real hardware were setting the result to 0.
+ * Starting at ISA v3.1, the result for doubleword 1 is now defined
+ * to be 0.
+ */
+tcg_gen_st_i64(tcg_constant_i64(0), cpu_env, vsr64_offset(regno, false));
 }
 
 static inline void get_avr64(TCGv_i64 dst, int regno, bool high)
-- 
2.37.3




[PULL 12/17] target/ppc: Set OV32 when OV is set

2022-09-20 Thread Daniel Henrique Barboza
From: Víctor Colombo 

According to PowerISA: "OV32 is set whenever OV is implicitly set, and
is set to the same value that OV is defined to be set to in 32-bit
mode".

This patch changes helper_update_ov_legacy to set/clear ov32 when
applicable.

Signed-off-by: Víctor Colombo 
Reviewed-by: Daniel Henrique Barboza 
Message-Id: <20220906125523.38765-7-victor.colo...@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza 
---
 target/ppc/int_helper.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c
index d905f07d02..696096100b 100644
--- a/target/ppc/int_helper.c
+++ b/target/ppc/int_helper.c
@@ -37,9 +37,9 @@
 static inline void helper_update_ov_legacy(CPUPPCState *env, int ov)
 {
 if (unlikely(ov)) {
-env->so = env->ov = 1;
+env->so = env->ov = env->ov32 = 1;
 } else {
-env->ov = 0;
+env->ov = env->ov32 = 0;
 }
 }
 
-- 
2.37.3




[PULL 09/17] target/ppc: Zero second doubleword in DFP instructions

2022-09-20 Thread Daniel Henrique Barboza
From: Víctor Colombo 

Starting at PowerISA v3.1, the second doubleword of the registers
used to store results in DFP instructions is supposed to be zeroed.

From the ISA, chapter 7.2.1.1 Floating-Point Registers:
"""
Chapter 4. Floating-Point Facility provides 32 64-bit
FPRs. Chapter 5. Decimal Floating-Point also employs
FPRs in decimal floating-point (DFP) operations. When
VSX is implemented, the 32 FPRs are mapped to
doubleword 0 of VSRs 0-31. (...)
All instructions that operate on an FPR are redefined
to operate on doubleword element 0 of the
corresponding VSR. (...)
and the contents of doubleword element 1 of the
VSR corresponding to the target FPR or FPR pair for these
instructions are set to 0.
"""

Before, the result stored at doubleword 1 was said to be undefined.

With that, this patch changes the DFP facility to zero doubleword 1
when using set_dfp64 and set_dfp128. This fixes the behavior for ISA
3.1 while keeping the behavior correct for previous ones.

Signed-off-by: Víctor Colombo 
Reviewed-by: Daniel Henrique Barboza 
Message-Id: <20220906125523.38765-4-victor.colo...@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza 
---
 target/ppc/dfp_helper.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/target/ppc/dfp_helper.c b/target/ppc/dfp_helper.c
index 5ba74b2124..be7aa5357a 100644
--- a/target/ppc/dfp_helper.c
+++ b/target/ppc/dfp_helper.c
@@ -42,13 +42,16 @@ static void get_dfp128(ppc_vsr_t *dst, ppc_fprp_t *dfp)
 
 static void set_dfp64(ppc_fprp_t *dfp, ppc_vsr_t *src)
 {
-dfp->VsrD(0) = src->VsrD(1);
+dfp[0].VsrD(0) = src->VsrD(1);
+dfp[0].VsrD(1) = 0ULL;
 }
 
 static void set_dfp128(ppc_fprp_t *dfp, ppc_vsr_t *src)
 {
 dfp[0].VsrD(0) = src->VsrD(0);
 dfp[1].VsrD(0) = src->VsrD(1);
+dfp[0].VsrD(1) = 0ULL;
+dfp[1].VsrD(1) = 0ULL;
 }
 
 static void set_dfp128_to_avr(ppc_avr_t *dst, ppc_vsr_t *src)
-- 
2.37.3




[PATCH 5/5] migration: Use non-atomic ops for clear log bitmap

2022-09-20 Thread Peter Xu
Since we already have bitmap_mutex to protect either the dirty bitmap or
the clear log bitmap, we don't need atomic operations to set/clear/test
on the clear log bitmap.  Switch all ops from atomic to non-atomic
versions, and meanwhile touch up the comments to show which lock is in
charge.

Introduce a non-atomic version of bitmap_test_and_clear_atomic(), mostly
the same as the atomic version but simplified in a few places, e.g. the
"old_bits" variable and the explicit memory barriers are dropped.

Reviewed-by: Dr. David Alan Gilbert 
Signed-off-by: Peter Xu 
---
 include/exec/ram_addr.h | 11 +-
 include/exec/ramblock.h |  3 +++
 include/qemu/bitmap.h   |  1 +
 util/bitmap.c   | 45 +
 4 files changed, 55 insertions(+), 5 deletions(-)

diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index f3e0c78161..5092a2e0ff 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -42,7 +42,8 @@ static inline long clear_bmap_size(uint64_t pages, uint8_t 
shift)
 }
 
 /**
- * clear_bmap_set: set clear bitmap for the page range
+ * clear_bmap_set: set clear bitmap for the page range.  Must be with
+ * bitmap_mutex held.
  *
  * @rb: the ramblock to operate on
  * @start: the start page number
@@ -55,12 +56,12 @@ static inline void clear_bmap_set(RAMBlock *rb, uint64_t 
start,
 {
 uint8_t shift = rb->clear_bmap_shift;
 
-bitmap_set_atomic(rb->clear_bmap, start >> shift,
-  clear_bmap_size(npages, shift));
+bitmap_set(rb->clear_bmap, start >> shift, clear_bmap_size(npages, shift));
 }
 
 /**
- * clear_bmap_test_and_clear: test clear bitmap for the page, clear if set
+ * clear_bmap_test_and_clear: test clear bitmap for the page, clear if set.
+ * Must be with bitmap_mutex held.
  *
  * @rb: the ramblock to operate on
  * @page: the page number to check
@@ -71,7 +72,7 @@ static inline bool clear_bmap_test_and_clear(RAMBlock *rb, 
uint64_t page)
 {
 uint8_t shift = rb->clear_bmap_shift;
 
-return bitmap_test_and_clear_atomic(rb->clear_bmap, page >> shift, 1);
+return bitmap_test_and_clear(rb->clear_bmap, page >> shift, 1);
 }
 
 static inline bool offset_in_ramblock(RAMBlock *b, ram_addr_t offset)
diff --git a/include/exec/ramblock.h b/include/exec/ramblock.h
index 6cbedf9e0c..adc03df59c 100644
--- a/include/exec/ramblock.h
+++ b/include/exec/ramblock.h
@@ -53,6 +53,9 @@ struct RAMBlock {
  * and split clearing of dirty bitmap on the remote node (e.g.,
  * KVM).  The bitmap will be set only when doing global sync.
  *
+ * It is only used during src side of ram migration, and it is
+ * protected by the global ram_state.bitmap_mutex.
+ *
  * NOTE: this bitmap is different comparing to the other bitmaps
  * in that one bit can represent multiple guest pages (which is
  * decided by the `clear_bmap_shift' variable below).  On
diff --git a/include/qemu/bitmap.h b/include/qemu/bitmap.h
index 82a1d2f41f..3ccb00865f 100644
--- a/include/qemu/bitmap.h
+++ b/include/qemu/bitmap.h
@@ -253,6 +253,7 @@ void bitmap_set(unsigned long *map, long i, long len);
 void bitmap_set_atomic(unsigned long *map, long i, long len);
 void bitmap_clear(unsigned long *map, long start, long nr);
 bool bitmap_test_and_clear_atomic(unsigned long *map, long start, long nr);
+bool bitmap_test_and_clear(unsigned long *map, long start, long nr);
 void bitmap_copy_and_clear_atomic(unsigned long *dst, unsigned long *src,
   long nr);
 unsigned long bitmap_find_next_zero_area(unsigned long *map,
diff --git a/util/bitmap.c b/util/bitmap.c
index f81d8057a7..8d12e90a5a 100644
--- a/util/bitmap.c
+++ b/util/bitmap.c
@@ -240,6 +240,51 @@ void bitmap_clear(unsigned long *map, long start, long nr)
 }
 }
 
+bool bitmap_test_and_clear(unsigned long *map, long start, long nr)
+{
+unsigned long *p = map + BIT_WORD(start);
+const long size = start + nr;
+int bits_to_clear = BITS_PER_LONG - (start % BITS_PER_LONG);
+unsigned long mask_to_clear = BITMAP_FIRST_WORD_MASK(start);
+bool dirty = false;
+
+assert(start >= 0 && nr >= 0);
+
+/* First word */
+if (nr - bits_to_clear > 0) {
+if ((*p) & mask_to_clear) {
+dirty = true;
+}
+*p &= ~mask_to_clear;
+nr -= bits_to_clear;
+bits_to_clear = BITS_PER_LONG;
+p++;
+}
+
+/* Full words */
+if (bits_to_clear == BITS_PER_LONG) {
+while (nr >= BITS_PER_LONG) {
+if (*p) {
+dirty = true;
+*p = 0;
+}
+nr -= BITS_PER_LONG;
+p++;
+}
+}
+
+/* Last word */
+if (nr) {
+mask_to_clear &= BITMAP_LAST_WORD_MASK(size);
+if ((*p) & mask_to_clear) {
+dirty = true;
+}
+*p &= ~mask_to_clear;
+}
+
+return dirty;
+}
+
 bool bitmap_test_and_clear_atomic(unsigned long *map, long start, long nr)
 {
 

Re: [PATCH] checkpatch: ignore target/hexagon/imported/* files

2022-09-20 Thread Philippe Mathieu-Daudé via

On 20/9/22 15:42, Matheus Tavares Bernardino wrote:

These files come from an external project (the hexagon archlib), so they
deliberately do not follow QEMU's coding style. To avoid false positives
from checkpatch.pl, let's disable the checking for those.

Signed-off-by: Matheus Tavares Bernardino 
---
  scripts/checkpatch.pl | 1 +
  1 file changed, 1 insertion(+)


Reviewed-by: Philippe Mathieu-Daudé 




[PULL 05/17] target/ppc: Move fsqrts to decodetree

2022-09-20 Thread Daniel Henrique Barboza
From: Víctor Colombo 

Signed-off-by: Víctor Colombo 
Reviewed-by: Richard Henderson 
Message-Id: <20220905123746.54659-3-victor.colo...@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza 
---
 target/ppc/insn32.decode   |  1 +
 target/ppc/translate/fp-impl.c.inc | 23 +--
 target/ppc/translate/fp-ops.c.inc  |  1 -
 3 files changed, 2 insertions(+), 23 deletions(-)

diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index 33aa27bd4f..a5249ee32c 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
STFDUX  011111 ..... ..... ..... 1011110111 -   @X
 ### Floating-Point Arithmetic Instructions
 
FSQRT   111111 ..... ----- ..... ----- 10110 .  @A_tb
+FSQRTS  111011 ..... ----- ..... ----- 10110 .  @A_tb
 
 ### Floating-Point Select Instruction
 
diff --git a/target/ppc/translate/fp-impl.c.inc 
b/target/ppc/translate/fp-impl.c.inc
index e8359af005..7a90c0e350 100644
--- a/target/ppc/translate/fp-impl.c.inc
+++ b/target/ppc/translate/fp-impl.c.inc
@@ -281,28 +281,7 @@ static bool do_helper_fsqrt(DisasContext *ctx, arg_A_tb *a,
 }
 
 TRANS(FSQRT, do_helper_fsqrt, gen_helper_fsqrt);
-
-static void gen_fsqrts(DisasContext *ctx)
-{
-TCGv_i64 t0;
-TCGv_i64 t1;
-if (unlikely(!ctx->fpu_enabled)) {
-gen_exception(ctx, POWERPC_EXCP_FPU);
-return;
-}
-t0 = tcg_temp_new_i64();
-t1 = tcg_temp_new_i64();
-gen_reset_fpstatus();
-get_fpr(t0, rB(ctx->opcode));
-gen_helper_fsqrts(t1, cpu_env, t0);
-set_fpr(rD(ctx->opcode), t1);
-gen_compute_fprf_float64(t1);
-if (unlikely(Rc(ctx->opcode) != 0)) {
-gen_set_cr1_from_fpscr(ctx);
-}
-tcg_temp_free_i64(t0);
-tcg_temp_free_i64(t1);
-}
+TRANS(FSQRTS, do_helper_fsqrt, gen_helper_fsqrts);
 
 /*** Floating-Point multiply-and-add   ***/
 /* fmadd - fmadds */
diff --git a/target/ppc/translate/fp-ops.c.inc 
b/target/ppc/translate/fp-ops.c.inc
index 38759f5939..d4c6c4bed1 100644
--- a/target/ppc/translate/fp-ops.c.inc
+++ b/target/ppc/translate/fp-ops.c.inc
@@ -62,7 +62,6 @@ GEN_HANDLER_E(stfdepx, 0x1F, 0x1F, 0x16, 0x0001, 
PPC_NONE, PPC2_BOOKE206),
 GEN_HANDLER_E(stfdpx, 0x1F, 0x17, 0x1C, 0x0021, PPC_NONE, PPC2_ISA205),
 
 GEN_HANDLER(frsqrtes, 0x3B, 0x1A, 0xFF, 0x001F07C0, PPC_FLOAT_FRSQRTES),
-GEN_HANDLER(fsqrts, 0x3B, 0x16, 0xFF, 0x001F07C0, PPC_FLOAT_FSQRT),
 GEN_HANDLER(fcmpo, 0x3F, 0x00, 0x01, 0x0061, PPC_FLOAT),
 GEN_HANDLER(fcmpu, 0x3F, 0x00, 0x00, 0x0061, PPC_FLOAT),
 GEN_HANDLER(fabs, 0x3F, 0x08, 0x08, 0x001F, PPC_FLOAT),
-- 
2.37.3




Re: [PATCH] qboot: update to latest submodule

2022-09-20 Thread Jason A. Donenfeld
On Tue, Sep 20, 2022 at 11:57:09PM +0200, Paolo Bonzini wrote:
> It should have been automatic, there's mirroring set up.

Hm, something is weird. Gitlab says "This project is mirrored from
https://gitlab.com/bonzini/qboot.git. Pull mirroring updated 28 minutes
ago." yet the commit is much older than 28 minutes ago. Backend issue of
sorts?



[PULL 07/17] target/ppc: Remove extra space from s128 field in ppc_vsr_t

2022-09-20 Thread Daniel Henrique Barboza
From: Víctor Colombo 

Very trivial rogue-space removal. There are two spaces between Int128
and s128 in the ppc_vsr_t struct, where there should be only one.

Signed-off-by: Víctor Colombo 
Reviewed-by: Daniel Henrique Barboza 
Message-Id: <20220906125523.38765-2-victor.colo...@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza 
---
 target/ppc/cpu.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index 4551d81b5f..602ea77914 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -246,7 +246,7 @@ typedef union _ppc_vsr_t {
 #ifdef CONFIG_INT128
 __uint128_t u128;
 #endif
-Int128  s128;
+Int128 s128;
 } ppc_vsr_t;
 
 typedef ppc_vsr_t ppc_avr_t;
-- 
2.37.3




[PULL 01/17] target/ppc: Add HASHKEYR and HASHPKEYR SPRs

2022-09-20 Thread Daniel Henrique Barboza
From: Víctor Colombo 

Add the Special Purpose Registers HASHKEYR and HASHPKEYR, which were
introduced by the Power ISA 3.1B. They are used by the new instructions
hashchk(p) and hashst(p).

The ISA states that the Operating System should generate the value for
these registers when creating a process, so it is its responsibility to
do so. We initialize them with 0 for qemu-softmmu, and set a random
64-bit value for linux-user.

Signed-off-by: Víctor Colombo 
Reviewed-by: Lucas Mateus Castro 
Message-Id: <20220715205439.161110-2-victor.colo...@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza 
---
 target/ppc/cpu.h  |  2 ++
 target/ppc/cpu_init.c | 28 
 2 files changed, 30 insertions(+)

diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index a4c893cfad..4551d81b5f 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -1676,6 +1676,8 @@ void ppc_compat_add_property(Object *obj, const char 
*name,
 #define SPR_BOOKE_GIVOR14 (0x1BD)
 #define SPR_TIR   (0x1BE)
 #define SPR_PTCR  (0x1D0)
+#define SPR_HASHKEYR  (0x1D4)
+#define SPR_HASHPKEYR (0x1D5)
 #define SPR_BOOKE_SPEFSCR (0x200)
 #define SPR_Exxx_BBEAR(0x201)
 #define SPR_Exxx_BBTAR(0x202)
diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
index 899c4a586e..6e080ebda0 100644
--- a/target/ppc/cpu_init.c
+++ b/target/ppc/cpu_init.c
@@ -5700,6 +5700,33 @@ static void register_power9_mmu_sprs(CPUPPCState *env)
 #endif
 }
 
+static void register_power10_hash_sprs(CPUPPCState *env)
+{
+/*
+ * It is the OS's responsibility to generate a random value for these
+ * registers in each process's context, so initialize them with 0 here.
+ */
+uint64_t hashkeyr_initial_value = 0, hashpkeyr_initial_value = 0;
+#if defined(CONFIG_USER_ONLY)
+/* in linux-user, set up the hash registers with random values */
+GRand *rand = g_rand_new();
+hashkeyr_initial_value =
+((uint64_t)g_rand_int(rand) << 32) | (uint64_t)g_rand_int(rand);
+hashpkeyr_initial_value =
+((uint64_t)g_rand_int(rand) << 32) | (uint64_t)g_rand_int(rand);
+g_rand_free(rand);
+#endif
+spr_register(env, SPR_HASHKEYR, "HASHKEYR",
+SPR_NOACCESS, SPR_NOACCESS,
+&spr_read_generic, &spr_write_generic,
+hashkeyr_initial_value);
+spr_register_hv(env, SPR_HASHPKEYR, "HASHPKEYR",
+SPR_NOACCESS, SPR_NOACCESS,
+SPR_NOACCESS, SPR_NOACCESS,
+&spr_read_generic, &spr_write_generic,
+hashpkeyr_initial_value);
+}
+
 /*
  * Initialize PMU counter overflow timers for Power8 and
  * newer Power chips when using TCG.
@@ -6518,6 +6545,7 @@ static void init_proc_POWER10(CPUPPCState *env)
 register_power8_book4_sprs(env);
 register_power8_rpr_sprs(env);
 register_power9_mmu_sprs(env);
+register_power10_hash_sprs(env);
 
 /* FIXME: Filter fields properly based on privilege level */
 spr_register_kvm_hv(env, SPR_PSSCR, "PSSCR", NULL, NULL, NULL, NULL,
-- 
2.37.3




[PULL 04/17] target/ppc: Move fsqrt to decodetree

2022-09-20 Thread Daniel Henrique Barboza
From: Víctor Colombo 

Signed-off-by: Víctor Colombo 
Reviewed-by: Richard Henderson 
Message-Id: <20220905123746.54659-2-victor.colo...@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza 
---
 target/ppc/insn32.decode   |  7 +++
 target/ppc/translate/fp-impl.c.inc | 29 +
 target/ppc/translate/fp-ops.c.inc  |  1 -
 3 files changed, 24 insertions(+), 13 deletions(-)

diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode
index da08960fca..33aa27bd4f 100644
--- a/target/ppc/insn32.decode
+++ b/target/ppc/insn32.decode
@@ -20,6 +20,9 @@
&A  frt fra frb frc rc:bool
@A  .. frt:5 fra:5 frb:5 frc:5 . rc:1   &A
&A_tb   frt frb rc:bool
@A_tb   .. frt:5 . frb:5 . . rc:1   &A_tb
+
   rt ra si:int64_t
 @D  .. rt:5 ra:5 si:s16 
 
@@ -363,6 +366,10 @@ STFDU   110111 . .. ... @D
 STFDX   01 . ..  1011010111 -   @X
 STFDUX  01 . ..  100111 -   @X
 
+### Floating-Point Arithmetic Instructions
+
+FSQRT   11 . - . - 10110 .  @A_tb
+
 ### Floating-Point Select Instruction
 
 FSEL11 . . . . 10111 .  @A
diff --git a/target/ppc/translate/fp-impl.c.inc 
b/target/ppc/translate/fp-impl.c.inc
index 0e893eafa7..e8359af005 100644
--- a/target/ppc/translate/fp-impl.c.inc
+++ b/target/ppc/translate/fp-impl.c.inc
@@ -254,29 +254,34 @@ static bool trans_FSEL(DisasContext *ctx, arg_A *a)
 GEN_FLOAT_AB(sub, 0x14, 0x07C0, 1, PPC_FLOAT);
 /* Optional: */
 
-/* fsqrt */
-static void gen_fsqrt(DisasContext *ctx)
+static bool do_helper_fsqrt(DisasContext *ctx, arg_A_tb *a,
+void (*helper)(TCGv_i64, TCGv_ptr, TCGv_i64))
 {
-TCGv_i64 t0;
-TCGv_i64 t1;
-if (unlikely(!ctx->fpu_enabled)) {
-gen_exception(ctx, POWERPC_EXCP_FPU);
-return;
-}
+TCGv_i64 t0, t1;
+
+REQUIRE_INSNS_FLAGS(ctx, FLOAT_FSQRT);
+REQUIRE_FPU(ctx);
+
 t0 = tcg_temp_new_i64();
 t1 = tcg_temp_new_i64();
+
 gen_reset_fpstatus();
-get_fpr(t0, rB(ctx->opcode));
-gen_helper_fsqrt(t1, cpu_env, t0);
-set_fpr(rD(ctx->opcode), t1);
+get_fpr(t0, a->frb);
+helper(t1, cpu_env, t0);
+set_fpr(a->frt, t1);
 gen_compute_fprf_float64(t1);
-if (unlikely(Rc(ctx->opcode) != 0)) {
+if (unlikely(a->rc != 0)) {
 gen_set_cr1_from_fpscr(ctx);
 }
+
 tcg_temp_free_i64(t0);
 tcg_temp_free_i64(t1);
+
+return true;
 }
 
+TRANS(FSQRT, do_helper_fsqrt, gen_helper_fsqrt);
+
 static void gen_fsqrts(DisasContext *ctx)
 {
 TCGv_i64 t0;
diff --git a/target/ppc/translate/fp-ops.c.inc 
b/target/ppc/translate/fp-ops.c.inc
index 1b65f5ab73..38759f5939 100644
--- a/target/ppc/translate/fp-ops.c.inc
+++ b/target/ppc/translate/fp-ops.c.inc
@@ -62,7 +62,6 @@ GEN_HANDLER_E(stfdepx, 0x1F, 0x1F, 0x16, 0x0001, 
PPC_NONE, PPC2_BOOKE206),
 GEN_HANDLER_E(stfdpx, 0x1F, 0x17, 0x1C, 0x0021, PPC_NONE, PPC2_ISA205),
 
 GEN_HANDLER(frsqrtes, 0x3B, 0x1A, 0xFF, 0x001F07C0, PPC_FLOAT_FRSQRTES),
-GEN_HANDLER(fsqrt, 0x3F, 0x16, 0xFF, 0x001F07C0, PPC_FLOAT_FSQRT),
 GEN_HANDLER(fsqrts, 0x3B, 0x16, 0xFF, 0x001F07C0, PPC_FLOAT_FSQRT),
 GEN_HANDLER(fcmpo, 0x3F, 0x00, 0x01, 0x0061, PPC_FLOAT),
 GEN_HANDLER(fcmpu, 0x3F, 0x00, 0x00, 0x0061, PPC_FLOAT),
-- 
2.37.3




[PULL 01/12] linux-user: Add missing signals in strace output

2022-09-20 Thread Helge Deller
Some of the guest signal numbers are currently not converted to
their representative names in the strace output, e.g. SIGVTALRM.

This patch introduces a smart way to generate and keep in sync the
host-to-guest and guest-to-host signal conversion tables for use in
the qemu signal and strace code. This ensures that any signals
will now show up in both tables.

There is no functional change in this patch - with the exception that
previously missing signal names now show up in the strace output too.

Signed-off-by: Helge Deller 
---
 linux-user/signal-common.h | 46 ++
 linux-user/signal.c| 37 +++---
 linux-user/strace.c| 30 +
 3 files changed, 60 insertions(+), 53 deletions(-)

diff --git a/linux-user/signal-common.h b/linux-user/signal-common.h
index 6a7e4a93fc..3e2dc604c2 100644
--- a/linux-user/signal-common.h
+++ b/linux-user/signal-common.h
@@ -118,4 +118,50 @@ static inline void finish_sigsuspend_mask(int ret)
 }
 }

+#if defined(SIGSTKFLT) && defined(TARGET_SIGSTKFLT)
+#define MAKE_SIG_ENTRY_SIGSTKFLTMAKE_SIG_ENTRY(SIGSTKFLT)
+#else
+#define MAKE_SIG_ENTRY_SIGSTKFLT
+#endif
+
+#if defined(SIGIOT) && defined(TARGET_SIGIOT)
+#define MAKE_SIG_ENTRY_SIGIOT   MAKE_SIG_ENTRY(SIGIOT)
+#else
+#define MAKE_SIG_ENTRY_SIGIOT
+#endif
+
+#define MAKE_SIGNAL_LIST \
+MAKE_SIG_ENTRY(SIGHUP) \
+MAKE_SIG_ENTRY(SIGINT) \
+MAKE_SIG_ENTRY(SIGQUIT) \
+MAKE_SIG_ENTRY(SIGILL) \
+MAKE_SIG_ENTRY(SIGTRAP) \
+MAKE_SIG_ENTRY(SIGABRT) \
+MAKE_SIG_ENTRY(SIGBUS) \
+MAKE_SIG_ENTRY(SIGFPE) \
+MAKE_SIG_ENTRY(SIGKILL) \
+MAKE_SIG_ENTRY(SIGUSR1) \
+MAKE_SIG_ENTRY(SIGSEGV) \
+MAKE_SIG_ENTRY(SIGUSR2) \
+MAKE_SIG_ENTRY(SIGPIPE) \
+MAKE_SIG_ENTRY(SIGALRM) \
+MAKE_SIG_ENTRY(SIGTERM) \
+MAKE_SIG_ENTRY(SIGCHLD) \
+MAKE_SIG_ENTRY(SIGCONT) \
+MAKE_SIG_ENTRY(SIGSTOP) \
+MAKE_SIG_ENTRY(SIGTSTP) \
+MAKE_SIG_ENTRY(SIGTTIN) \
+MAKE_SIG_ENTRY(SIGTTOU) \
+MAKE_SIG_ENTRY(SIGURG) \
+MAKE_SIG_ENTRY(SIGXCPU) \
+MAKE_SIG_ENTRY(SIGXFSZ) \
+MAKE_SIG_ENTRY(SIGVTALRM) \
+MAKE_SIG_ENTRY(SIGPROF) \
+MAKE_SIG_ENTRY(SIGWINCH) \
+MAKE_SIG_ENTRY(SIGIO) \
+MAKE_SIG_ENTRY(SIGPWR) \
+MAKE_SIG_ENTRY(SIGSYS) \
+MAKE_SIG_ENTRY_SIGSTKFLT \
+MAKE_SIG_ENTRY_SIGIOT
+
 #endif
diff --git a/linux-user/signal.c b/linux-user/signal.c
index 8d29bfaa6b..61c6fa3fcf 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -53,40 +53,9 @@ abi_ulong default_rt_sigreturn;
 QEMU_BUILD_BUG_ON(__SIGRTMAX + 1 != _NSIG);
 #endif
 static uint8_t host_to_target_signal_table[_NSIG] = {
-[SIGHUP] = TARGET_SIGHUP,
-[SIGINT] = TARGET_SIGINT,
-[SIGQUIT] = TARGET_SIGQUIT,
-[SIGILL] = TARGET_SIGILL,
-[SIGTRAP] = TARGET_SIGTRAP,
-[SIGABRT] = TARGET_SIGABRT,
-/*[SIGIOT] = TARGET_SIGIOT,*/
-[SIGBUS] = TARGET_SIGBUS,
-[SIGFPE] = TARGET_SIGFPE,
-[SIGKILL] = TARGET_SIGKILL,
-[SIGUSR1] = TARGET_SIGUSR1,
-[SIGSEGV] = TARGET_SIGSEGV,
-[SIGUSR2] = TARGET_SIGUSR2,
-[SIGPIPE] = TARGET_SIGPIPE,
-[SIGALRM] = TARGET_SIGALRM,
-[SIGTERM] = TARGET_SIGTERM,
-#ifdef SIGSTKFLT
-[SIGSTKFLT] = TARGET_SIGSTKFLT,
-#endif
-[SIGCHLD] = TARGET_SIGCHLD,
-[SIGCONT] = TARGET_SIGCONT,
-[SIGSTOP] = TARGET_SIGSTOP,
-[SIGTSTP] = TARGET_SIGTSTP,
-[SIGTTIN] = TARGET_SIGTTIN,
-[SIGTTOU] = TARGET_SIGTTOU,
-[SIGURG] = TARGET_SIGURG,
-[SIGXCPU] = TARGET_SIGXCPU,
-[SIGXFSZ] = TARGET_SIGXFSZ,
-[SIGVTALRM] = TARGET_SIGVTALRM,
-[SIGPROF] = TARGET_SIGPROF,
-[SIGWINCH] = TARGET_SIGWINCH,
-[SIGIO] = TARGET_SIGIO,
-[SIGPWR] = TARGET_SIGPWR,
-[SIGSYS] = TARGET_SIGSYS,
+#define MAKE_SIG_ENTRY(sig) [sig] = TARGET_##sig,
+MAKE_SIGNAL_LIST
+#undef MAKE_SIG_ENTRY
 /* next signals stay the same */
 };

diff --git a/linux-user/strace.c b/linux-user/strace.c
index 7d882526da..a4eeef7ae1 100644
--- a/linux-user/strace.c
+++ b/linux-user/strace.c
@@ -17,6 +17,7 @@
 #include "qemu.h"
 #include "user-internals.h"
 #include "strace.h"
+#include "signal-common.h"

 struct syscallname {
 int nr;
@@ -141,30 +142,21 @@ if( cmd == val ) { \
 qemu_log("%d", cmd);
 }

+static const char * const target_signal_name[] = {
+#define MAKE_SIG_ENTRY(sig) [TARGET_##sig] = #sig,
+MAKE_SIGNAL_LIST
+#undef MAKE_SIG_ENTRY
+};
+
 static void
 print_signal(abi_ulong arg, int last)
 {
 const char *signal_name = NULL;
-switch(arg) {
-case TARGET_SIGHUP: signal_name = "SIGHUP"; break;
-case TARGET_SIGINT: signal_name = "SIGINT"; break;
-case TARGET_SIGQUIT: signal_name = "SIGQUIT"; break;
-case TARGET_SIGILL: signal_name = "SIGILL"; break;
-case TARGET_SIGABRT: signal_name = "SIGABRT"; break;
-case TARGET_SIGFPE: signal_name = 

[PULL 08/12] linux-user/hppa: Set TASK_UNMAPPED_BASE to 0xfa000000 for hppa arch

2022-09-20 Thread Helge Deller
On the parisc architecture the stack grows upwards.
Move TASK_UNMAPPED_BASE to the high memory area, as is done by the
kernel on physical machines.

Signed-off-by: Helge Deller 
---
 linux-user/mmap.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/linux-user/mmap.c b/linux-user/mmap.c
index 048c4135af..dba6823668 100644
--- a/linux-user/mmap.c
+++ b/linux-user/mmap.c
@@ -251,8 +251,12 @@ static int mmap_frag(abi_ulong real_start,
 # define TASK_UNMAPPED_BASE  (1ul << 38)
 #endif
 #else
+#ifdef TARGET_HPPA
+# define TASK_UNMAPPED_BASE  0xfa000000
+#else
# define TASK_UNMAPPED_BASE  0x40000000
 #endif
+#endif
 abi_ulong mmap_next_start = TASK_UNMAPPED_BASE;

 unsigned long last_brk;
--
2.37.3




Re: [PATCH v4 for 7.2 00/22] virtio-gpio and various virtio cleanups

2022-09-20 Thread Michael S. Tsirkin
On Tue, Sep 20, 2022 at 02:25:48PM -0400, Stefan Hajnoczi wrote:
> On Tue, 20 Sept 2022 at 10:18, Alex Bennée  wrote:
> >
> >
> > Stefan Hajnoczi  writes:
> >
> > > [[PGP Signed Part:Undecided]]
> > > On Fri, Sep 16, 2022 at 07:51:40AM +0100, Alex Bennée wrote:
> > >>
> > >> Alex Bennée  writes:
> > >>
> > >> > Hi,
> > >> >
> > >> > This is an update to the previous series which fixes the last few
> > >> > niggling CI failures I was seeing.
> > >> >
> > >> >Subject: [PATCH v3 for 7.2 00/21] virtio-gpio and various virtio 
> > >> > cleanups
> > >> >Date: Tue, 26 Jul 2022 20:21:29 +0100
> > >> >Message-Id: <20220726192150.2435175-1-alex.ben...@linaro.org>
> > >> >
> > >> > The CI failures were tricky to track down because they didn't occur
> > >> > locally but after patching to dump backtraces they all seem to involve
> > >> > updates to virtio_set_status() as the machine was torn down. I think
> > >> > patch that switches all users to use virtio_device_started() along
> > >> > with consistent checking of vhost_dev->started stops this from
> > >> > happening. The clean-up seems worthwhile in reducing boilerplate
> > >> > anyway.
> > >> >
> > >> > The following patches still need review:
> > >> >
> > >> >   - tests/qtest: enable tests for virtio-gpio
> > >> >   - tests/qtest: add a get_features op to vhost-user-test
> > >> >   - tests/qtest: implement stub for VHOST_USER_GET_CONFIG
> > >> >   - tests/qtest: add assert to catch bad features
> > >> >   - tests/qtest: plain g_assert for VHOST_USER_F_PROTOCOL_FEATURES
> > >> >   - tests/qtest: catch unhandled vhost-user messages
> > >> >   - tests/qtest: use qos_printf instead of g_test_message
> > >> >   - tests/qtest: pass stdout/stderr down to subtests
> > >> >   - hw/virtio: move vhd->started check into helper and add FIXME
> > >> >   - hw/virtio: move vm_running check to virtio_device_started
> > >> >   - hw/virtio: add some vhost-user trace events
> > >> >   - hw/virtio: log potentially buggy guest drivers
> > >> >   - hw/virtio: fix some coding style issues
> > >> >   - include/hw: document vhost_dev feature life-cycle
> > >> >   - include/hw/virtio: more comment for VIRTIO_F_BAD_FEATURE
> > >> >   - hw/virtio: fix vhost_user_read tracepoint
> > >> >   - hw/virtio: handle un-configured shutdown in virtio-pci
> > >> >   - hw/virtio: gracefully handle unset vhost_dev vdev
> > >> >   - hw/virtio: incorporate backend features in features
> > >> 
> > >>
> > >> Ping?
> > >
> > > Who are you pinging?
> > >
> > > Only qemu-devel is on To and there are a bunch of people on Cc.
> >
> > Well I guess MST is the maintainer for the sub-system but I was hoping
> > other virtio contributors had some sort of view. The process of
> > up-streaming a simple vhost-user stub has flushed out all sorts of
> > stuff.
> 
> Okay, moving MST to To in case it helps get his attention.
> 
> Thanks,
> Stefan

thanks, it's in my queue, just pulling in backlog that built up
during the forum.

-- 
MST




[PULL 10/12] linux-user: Show timespec on strace for futex()

2022-09-20 Thread Helge Deller
Signed-off-by: Helge Deller 
---
 linux-user/strace.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/linux-user/strace.c b/linux-user/strace.c
index 6f818212d5..b6b9abaea4 100644
--- a/linux-user/strace.c
+++ b/linux-user/strace.c
@@ -3714,11 +3714,20 @@ print_futex(CPUArchState *cpu_env, const struct 
syscallname *name,
 abi_long arg0, abi_long arg1, abi_long arg2,
 abi_long arg3, abi_long arg4, abi_long arg5)
 {
+abi_long op = arg1 & FUTEX_CMD_MASK;
 print_syscall_prologue(name);
 print_pointer(arg0, 0);
 print_futex_op(arg1, 0);
 print_raw_param(",%d", arg2, 0);
-print_pointer(arg3, 0); /* struct timespec */
+switch (op) {
+case FUTEX_WAIT:
+case FUTEX_WAIT_BITSET:
+print_timespec(arg3, 0);
+break;
+default:
+print_pointer(arg3, 0);
+break;
+}
 print_pointer(arg4, 0);
 print_raw_param("%d", arg4, 1);
 print_syscall_epilogue(name);
--
2.37.3




Re: [PATCH 2/4] target/m68k: increase size of m68k CPU features from uint32_t to uint64_t

2022-09-20 Thread BALATON Zoltan

On Tue, 20 Sep 2022, Mark Cave-Ayland wrote:

On 17/09/2022 23:27, Philippe Mathieu-Daudé via wrote:


On 17/9/22 14:09, BALATON Zoltan wrote:

On Sat, 17 Sep 2022, Mark Cave-Ayland wrote:

There are already 32 feature bits in use, so change the size of the m68k
CPU features to uint64_t (allong with the associated m68k_feature()
functions) to allow up to 64 feature bits to be used.

Signed-off-by: Mark Cave-Ayland 
---
target/m68k/cpu.c | 4 ++--
target/m68k/cpu.h | 6 +++---
2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/target/m68k/cpu.c b/target/m68k/cpu.c
index f681be3a2a..7b4797e2f1 100644
--- a/target/m68k/cpu.c
+++ b/target/m68k/cpu.c
@@ -38,12 +38,12 @@ static bool m68k_cpu_has_work(CPUState *cs)

static void m68k_set_feature(CPUM68KState *env, int feature)
{
-    env->features |= (1u << feature);
+    env->features |= (1ul << feature);


     env->features = deposit64(env->features, feature, 1, 1);


}

static void m68k_unset_feature(CPUM68KState *env, int feature)
{
-    env->features &= (-1u - (1u << feature));
+    env->features &= (-1ul - (1ul << feature));


     env->features = deposit64(env->features, feature, 1, 0);


Should these be ull instead of ul?


Yes. Not needed if using the  extract/deposit API.


I must admit I find the deposit64() variants not particularly easy to read:


I agree with that and also dislike the deposit/extract functions, which
would not bring much here. Maybe they are useful for multiple bits, but
for a single bit they just add overhead and obfuscation.


if we're considering alterations rather than changing the constant suffix 
then I'd much rather go for:


   env->features |= (1ULL << feature);

and:

   env->features &= ~(1ULL << feature);


There's also a BIT_ULL macro which could be used but it's up to you, 
shifting 1ULL is also simple enough to read.


Regards,
BALATON Zoltan


Laurent, what would be your preference?


}

static void m68k_cpu_reset(DeviceState *dev)
diff --git a/target/m68k/cpu.h b/target/m68k/cpu.h
index 67b6c12c28..d3384e5d98 100644
--- a/target/m68k/cpu.h
+++ b/target/m68k/cpu.h
@@ -154,7 +154,7 @@ typedef struct CPUArchState {
    struct {} end_reset_fields;

    /* Fields from here on are preserved across CPU reset. */
-    uint32_t features;
+    uint64_t features;
} CPUM68KState;

/*
@@ -539,9 +539,9 @@ enum m68k_features {
    M68K_FEATURE_TRAPCC,
};

-static inline int m68k_feature(CPUM68KState *env, int feature)
+static inline uint64_t m68k_feature(CPUM68KState *env, int feature)


Why uint64_t? Can we simplify using a boolean?


I don't really feel strongly either way here. Again I'm happy to go with 
whatever Laurent would prefer as maintainer.



{
-    return (env->features & (1u << feature)) != 0;
+    return (env->features & (1ul << feature)) != 0;


     return extract64(env->features, feature, 1) == 1;


}

void m68k_cpu_list(void);



ATB,

Mark.



Re: [PATCH v4 for 7.2 00/22] virtio-gpio and various virtio cleanups

2022-09-20 Thread Stefan Hajnoczi
On Tue, 20 Sept 2022 at 10:18, Alex Bennée  wrote:
>
>
> Stefan Hajnoczi  writes:
>
> > [[PGP Signed Part:Undecided]]
> > On Fri, Sep 16, 2022 at 07:51:40AM +0100, Alex Bennée wrote:
> >>
> >> Alex Bennée  writes:
> >>
> >> > Hi,
> >> >
> >> > This is an update to the previous series which fixes the last few
> >> > niggling CI failures I was seeing.
> >> >
> >> >Subject: [PATCH v3 for 7.2 00/21] virtio-gpio and various virtio 
> >> > cleanups
> >> >Date: Tue, 26 Jul 2022 20:21:29 +0100
> >> >Message-Id: <20220726192150.2435175-1-alex.ben...@linaro.org>
> >> >
> >> > The CI failures were tricky to track down because they didn't occur
> >> > locally but after patching to dump backtraces they all seem to involve
> >> > updates to virtio_set_status() as the machine was torn down. I think
> >> > patch that switches all users to use virtio_device_started() along
> >> > with consistent checking of vhost_dev->started stops this from
> >> > happening. The clean-up seems worthwhile in reducing boilerplate
> >> > anyway.
> >> >
> >> > The following patches still need review:
> >> >
> >> >   - tests/qtest: enable tests for virtio-gpio
> >> >   - tests/qtest: add a get_features op to vhost-user-test
> >> >   - tests/qtest: implement stub for VHOST_USER_GET_CONFIG
> >> >   - tests/qtest: add assert to catch bad features
> >> >   - tests/qtest: plain g_assert for VHOST_USER_F_PROTOCOL_FEATURES
> >> >   - tests/qtest: catch unhandled vhost-user messages
> >> >   - tests/qtest: use qos_printf instead of g_test_message
> >> >   - tests/qtest: pass stdout/stderr down to subtests
> >> >   - hw/virtio: move vhd->started check into helper and add FIXME
> >> >   - hw/virtio: move vm_running check to virtio_device_started
> >> >   - hw/virtio: add some vhost-user trace events
> >> >   - hw/virtio: log potentially buggy guest drivers
> >> >   - hw/virtio: fix some coding style issues
> >> >   - include/hw: document vhost_dev feature life-cycle
> >> >   - include/hw/virtio: more comment for VIRTIO_F_BAD_FEATURE
> >> >   - hw/virtio: fix vhost_user_read tracepoint
> >> >   - hw/virtio: handle un-configured shutdown in virtio-pci
> >> >   - hw/virtio: gracefully handle unset vhost_dev vdev
> >> >   - hw/virtio: incorporate backend features in features
> >> 
> >>
> >> Ping?
> >
> > Who are you pinging?
> >
> > Only qemu-devel is on To and there are a bunch of people on Cc.
>
> Well I guess MST is the maintainer for the sub-system but I was hoping
> other virtio contributors had some sort of view. The process of
> up-streaming a simple vhost-user stub has flushed out all sorts of
> stuff.

Okay, moving MST to To in case it helps get his attention.

Thanks,
Stefan



Re: QEMU's FreeBSD 13 CI job is failing

2022-09-20 Thread Warner Losh
On Tue, Sep 20, 2022 at 2:57 AM Daniel P. Berrangé 
wrote:

> On Tue, Sep 20, 2022 at 10:23:56AM +0200, Thomas Huth wrote:
> > On 20/09/2022 10.21, Daniel P. Berrangé wrote:
> > > On Tue, Sep 20, 2022 at 08:44:27AM +0200, Thomas Huth wrote:
> > > >
> > > > Seen here for example:
> > > >
> > > > https://gitlab.com/qemu-project/qemu/-/jobs/3050165356#L2543
> > > >
> > > > ld-elf.so.1: /lib/libc.so.7: version FBSD_1.7 required by
> > > > /usr/local/lib/libpython3.9.so.1.0 not found
> > > > ERROR: Cannot use '/usr/local/bin/python3', Python >= 3.6 is
> required.
> > > >
> > > > ... looks like the Python binary is not working anymore? Does
> anybody know
> > > > what happened here?
> > >
> > > FreeBSD ports is only guaranteed to work with latest minor release
> > > base image. The python binary recently started relying on symbols
> > > in the 13.1 base image, and we're using 13.0.
> > >
> > > I updated lcitool last week to pick 13.1, so we just need a refresh
> > > on the QEMU side to pick this up.
> >
> > OK ... Alex, IIRC you have a patch queued to update the files that are
> > refreshed by lcitool ... does that already contain the update for
> FreeBSD,
> > too?
>
> Oh actually, I'm forgetting that QEMU doesn't use the 'lcitool manifest'
> command for auto-generating the gitlab-ci.yml file. In QEMU's case just
> manually edit .gitlab-ci.d/cirrus.yml to change
>
> CIRRUS_VM_IMAGE_NAME: freebsd-13-0
>

FreeBSD's support policy is that we EOL minor dot releases a few months
after the next minor release is final. Part of that process involves
moving the package builds to that new minor version (which is what's not
guaranteed to work on older versions... only old binaries on new
versions are guaranteed)... And that's the problem that was hit here.

I'll try to submit changes in that 'few month' window after the next
minor release to update this in the future. In general, doing so would
be the best fit with FreeBSD's support model... It's one of those things
I didn't think of at the time, but is obvious in hindsight.

Warner


[PULL 03/12] linux-user: Add pidfd_open(), pidfd_send_signal() and pidfd_getfd() syscalls

2022-09-20 Thread Helge Deller
I noticed those were missing when running the glib2.0 testsuite.
Add the syscalls including the strace output.

Signed-off-by: Helge Deller 
---
 linux-user/strace.c| 28 
 linux-user/strace.list |  9 +
 linux-user/syscall.c   | 34 ++
 3 files changed, 71 insertions(+)

diff --git a/linux-user/strace.c b/linux-user/strace.c
index 816e679995..5ac64df02b 100644
--- a/linux-user/strace.c
+++ b/linux-user/strace.c
@@ -3317,6 +3317,34 @@ print_openat(CPUArchState *cpu_env, const struct 
syscallname *name,
 }
 #endif

+#ifdef TARGET_NR_pidfd_send_signal
+static void
+print_pidfd_send_signal(CPUArchState *cpu_env, const struct syscallname *name,
+abi_long arg0, abi_long arg1, abi_long arg2,
+abi_long arg3, abi_long arg4, abi_long arg5)
+{
+void *p;
+target_siginfo_t uinfo;
+
+print_syscall_prologue(name);
+print_raw_param("%d", arg0, 0);
+print_signal(arg1, 0);
+
+p = lock_user(VERIFY_READ, arg2, sizeof(target_siginfo_t), 1);
+if (p) {
+get_target_siginfo(&uinfo, p);
+print_siginfo(&uinfo);
+
+unlock_user(p, arg2, 0);
+} else {
+print_pointer(arg2, 1);
+}
+
+print_raw_param("%u", arg3, 0);
+print_syscall_epilogue(name);
+}
+#endif
+
 #ifdef TARGET_NR_mq_unlink
 static void
 print_mq_unlink(CPUArchState *cpu_env, const struct syscallname *name,
diff --git a/linux-user/strace.list b/linux-user/strace.list
index a78cdf3cdf..4d8b7f6a5e 100644
--- a/linux-user/strace.list
+++ b/linux-user/strace.list
@@ -1664,6 +1664,15 @@
 #ifdef TARGET_NR_pipe2
 { TARGET_NR_pipe2, "pipe2", NULL, NULL, NULL },
 #endif
+#ifdef TARGET_NR_pidfd_open
+{ TARGET_NR_pidfd_open, "pidfd_open", "%s(%d,%u)", NULL, NULL },
+#endif
+#ifdef TARGET_NR_pidfd_send_signal
+{ TARGET_NR_pidfd_send_signal, "pidfd_send_signal", NULL, 
print_pidfd_send_signal, NULL },
+#endif
+#ifdef TARGET_NR_pidfd_getfd
+{ TARGET_NR_pidfd_getfd, "pidfd_getfd", "%s(%d,%d,%u)", NULL, NULL },
+#endif
 #ifdef TARGET_NR_atomic_cmpxchg_32
 { TARGET_NR_atomic_cmpxchg_32, "atomic_cmpxchg_32", NULL, NULL, NULL },
 #endif
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index f409121202..ca39acfceb 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -346,6 +346,16 @@ _syscall6(int,sys_futex,int *,uaddr,int,op,int,val,
 _syscall6(int,sys_futex_time64,int *,uaddr,int,op,int,val,
   const struct timespec *,timeout,int *,uaddr2,int,val3)
 #endif
+#if defined(__NR_pidfd_open) && defined(TARGET_NR_pidfd_open)
+_syscall2(int, pidfd_open, pid_t, pid, unsigned int, flags);
+#endif
+#if defined(__NR_pidfd_send_signal) && defined(TARGET_NR_pidfd_send_signal)
+_syscall4(int, pidfd_send_signal, int, pidfd, int, sig, siginfo_t *, info,
+ unsigned int, flags);
+#endif
+#if defined(__NR_pidfd_getfd) && defined(TARGET_NR_pidfd_getfd)
+_syscall3(int, pidfd_getfd, int, pidfd, int, targetfd, unsigned int, flags);
+#endif
 #define __NR_sys_sched_getaffinity __NR_sched_getaffinity
 _syscall3(int, sys_sched_getaffinity, pid_t, pid, unsigned int, len,
   unsigned long *, user_mask_ptr);
@@ -8683,6 +8693,30 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int 
num, abi_long arg1,
 ret = do_open_by_handle_at(arg1, arg2, arg3);
 fd_trans_unregister(ret);
 return ret;
+#endif
+#if defined(__NR_pidfd_open) && defined(TARGET_NR_pidfd_open)
+case TARGET_NR_pidfd_open:
+return get_errno(pidfd_open(arg1, arg2));
+#endif
+#if defined(__NR_pidfd_send_signal) && defined(TARGET_NR_pidfd_send_signal)
+case TARGET_NR_pidfd_send_signal:
+{
+siginfo_t uinfo;
+
+p = lock_user(VERIFY_READ, arg3, sizeof(target_siginfo_t), 1);
+if (!p) {
+return -TARGET_EFAULT;
+}
+target_to_host_siginfo(&uinfo, p);
+unlock_user(p, arg3, 0);
+ret = get_errno(pidfd_send_signal(arg1, target_to_host_signal(arg2),
+  &uinfo, arg4));
+}
+return ret;
+#endif
+#if defined(__NR_pidfd_getfd) && defined(TARGET_NR_pidfd_getfd)
+case TARGET_NR_pidfd_getfd:
+return get_errno(pidfd_getfd(arg1, arg2, arg3));
 #endif
 case TARGET_NR_close:
 fd_trans_unregister(arg1);
--
2.37.3




[PULL 07/12] linux-user: Fix strace of chmod() if mode == 0

2022-09-20 Thread Helge Deller
If the mode parameter of chmod() is zero, this value isn't shown
when stracing a program:
chmod("filename",)
This patch fixes it up to show the zero-value as well:
chmod("filename",000)

Signed-off-by: Helge Deller 
---
 linux-user/strace.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/linux-user/strace.c b/linux-user/strace.c
index 5ac64df02b..2f539845bb 100644
--- a/linux-user/strace.c
+++ b/linux-user/strace.c
@@ -1505,6 +1505,11 @@ print_file_mode(abi_long mode, int last)
 const char *sep = "";
 const struct flags *m;

+if (mode == 0) {
+qemu_log("000%s", get_comma(last));
+return;
+}
+
for (m = &mode_flags[0]; m->f_string != NULL; m++) {
 if ((m->f_value & mode) == m->f_value) {
 qemu_log("%s%s", m->f_string, sep);
--
2.37.3




Re: [Virtio-fs] [PATCH] virtiofsd: use g_date_time_get_microsecond to get subsecond

2022-09-20 Thread Vivek Goyal
On Wed, Aug 24, 2022 at 01:41:29PM -0400, Stefan Hajnoczi wrote:
> On Thu, Aug 18, 2022 at 02:46:19PM -0400, Yusuke Okada wrote:
> > From: Yusuke Okada 
> > 
> > The "%f" specifier in g_date_time_format() is only available in glib
> > 2.65.2 or later. If combined with older glib, the function returns null
> > and the timestamp displayed as "(null)".
> > 
> > For backward compatibility, g_date_time_get_microsecond should be used
> > to retrieve subsecond.
> > 
> > In this patch the g_date_time_format() leaves subsecond field as "%06d"
> > and let next snprintf to format with g_date_time_get_microsecond.
> > 
> > Signed-off-by: Yusuke Okada 
> > ---
> >  tools/virtiofsd/passthrough_ll.c | 7 +--
> >  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> Thanks, applied to my block tree for QEMU 7.2:
> https://gitlab.com/stefanha/qemu/commits/block

Hi Stefan,

Wondering when do you plan to send it for merge. This seems like
a simple fix. Not sure why it does not qualify as a fix for
7.1 instead.

Thanks
Vivek




[PULL 14/17] target/ppc: Clear fpstatus flags on helpers missing it

2022-09-20 Thread Daniel Henrique Barboza
From: Víctor Colombo 

In ppc emulation, exception flags are not cleared at the end of an
instruction. Instead, the next instruction is responsible for clearing
them before its emulation. However, some helpers are not doing so,
causing an issue where previously set exception flags are reused,
leading to incorrect values being set in FPSCR.
Fix this by clearing fp_status before doing the instruction's 'real'
work for the following helpers that were missing this behavior:

- VSX_CVT_INT_TO_FP_VECTOR
- VSX_CVT_FP_TO_FP
- VSX_CVT_FP_TO_INT_VECTOR
- VSX_CVT_FP_TO_INT2
- VSX_CVT_FP_TO_INT
- VSX_CVT_FP_TO_FP_HP
- VSX_CVT_FP_TO_FP_VECTOR
- VSX_CMP
- VSX_ROUND
- xscvqpdp
- xscvdpsp[n]

Signed-off-by: Víctor Colombo 
Reviewed-by: Daniel Henrique Barboza 
Message-Id: <20220906125523.38765-9-victor.colo...@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza 
---
 target/ppc/fpu_helper.c | 37 ++---
 1 file changed, 26 insertions(+), 11 deletions(-)

diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index f07330ffc1..ae25f32d6e 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -2628,6 +2628,8 @@ uint32_t helper_##op(CPUPPCState *env, ppc_vsr_t *xt, 
\
 int all_true = 1; \
 int all_false = 1;\
   \
+helper_reset_fpstatus(env);   \
+  \
 for (i = 0; i < nels; i++) {  \
 if (unlikely(tp##_is_any_nan(xa->fld) ||  \
  tp##_is_any_nan(xb->fld))) { \
@@ -2681,6 +2683,8 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, 
ppc_vsr_t *xb)   \
 ppc_vsr_t t = { }; \
 int i; \
\
+helper_reset_fpstatus(env);\
+   \
 for (i = 0; i < nels; i++) {   \
 t.tfld = stp##_to_##ttp(xb->sfld, &env->fp_status);\
 if (unlikely(stp##_is_signaling_nan(xb->sfld,  \
@@ -2706,6 +2710,8 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, 
ppc_vsr_t *xb)  \
 ppc_vsr_t t = { };\
 int i;\
   \
+helper_reset_fpstatus(env);   \
+  \
 for (i = 0; i < nels; i++) {  \
 t.VsrW(2 * i) = stp##_to_##ttp(xb->VsrD(i), &env->fp_status); \
 if (unlikely(stp##_is_signaling_nan(xb->VsrD(i),  \
@@ -2743,6 +2749,8 @@ void helper_##op(CPUPPCState *env, uint32_t opcode,   
  \
 ppc_vsr_t t = *xt;  \
 int i;  \
 \
+helper_reset_fpstatus(env); \
+\
 for (i = 0; i < nels; i++) {\
 t.tfld = stp##_to_##ttp(xb->sfld, &env->fp_status); \
 if (unlikely(stp##_is_signaling_nan(xb->sfld,   \
@@ -2778,6 +2786,8 @@ void helper_##op(CPUPPCState *env, ppc_vsr_t *xt, 
ppc_vsr_t *xb)   \
 ppc_vsr_t t = { }; \
 int i; \
\
+helper_reset_fpstatus(env);\
+   \
 for (i = 0; i < nels; i++) {   \
 t.tfld = stp##_to_##ttp(xb->sfld, 1, &env->fp_status); \
 if (unlikely(stp##_is_signaling_nan(xb->sfld,  \
@@ -2825,6 +2835,8 @@ void helper_XSCVQPDP(CPUPPCState *env, uint32_t ro, 
ppc_vsr_t *xt,
 ppc_vsr_t t = { };
 float_status tstat;
 
+helper_reset_fpstatus(env);
+
 tstat = env->fp_status;
 if (ro != 0) {
 tstat.float_rounding_mode = float_round_to_odd;
@@ -2846,6 +2858,7 @@ uint64_t helper_xscvdpspn(CPUPPCState *env, uint64_t xb)
 {
 uint64_t result, sign, exp, frac;
 
+helper_reset_fpstatus(env);
 float_status tstat = 

[PATCH v2 37/37] target/i386: remove old SSE decoder

2022-09-20 Thread Paolo Bonzini
With all SSE (and AVX!) instructions now implemented in disas_insn_new,
it's possible to remove gen_sse, as well as the helpers for instructions
that now use gvec.

Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 target/i386/ops_sse.h|  124 ---
 target/i386/ops_sse_header.h |   61 --
 target/i386/tcg/decode-new.c.inc |3 -
 target/i386/tcg/emit.c.inc   |   17 +
 target/i386/tcg/translate.c  | 1722 +-
 5 files changed, 19 insertions(+), 1908 deletions(-)

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index 43b32edbfc..76bf20b878 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -297,17 +297,6 @@ static inline int satsw(int x)
 #define FMAXUB(a, b) ((a) > (b)) ? (a) : (b)
 #define FMAXSW(a, b) ((int16_t)(a) > (int16_t)(b)) ? (a) : (b)
 
-#define FAND(a, b) ((a) & (b))
-#define FANDN(a, b) ((~(a)) & (b))
-#define FOR(a, b) ((a) | (b))
-#define FXOR(a, b) ((a) ^ (b))
-
-#define FCMPGTB(a, b) ((int8_t)(a) > (int8_t)(b) ? -1 : 0)
-#define FCMPGTW(a, b) ((int16_t)(a) > (int16_t)(b) ? -1 : 0)
-#define FCMPGTL(a, b) ((int32_t)(a) > (int32_t)(b) ? -1 : 0)
-#define FCMPEQ(a, b) ((a) == (b) ? -1 : 0)
-
-#define FMULLW(a, b) ((a) * (b))
 #define FMULHRW(a, b) (((int16_t)(a) * (int16_t)(b) + 0x8000) >> 16)
 #define FMULHUW(a, b) ((a) * (b) >> 16)
 #define FMULHW(a, b) ((int16_t)(a) * (int16_t)(b) >> 16)
@@ -315,46 +304,6 @@ static inline int satsw(int x)
 #define FAVG(a, b) (((a) + (b) + 1) >> 1)
 #endif
 
-SSE_HELPER_B(helper_paddb, FADD)
-SSE_HELPER_W(helper_paddw, FADD)
-SSE_HELPER_L(helper_paddl, FADD)
-SSE_HELPER_Q(helper_paddq, FADD)
-
-SSE_HELPER_B(helper_psubb, FSUB)
-SSE_HELPER_W(helper_psubw, FSUB)
-SSE_HELPER_L(helper_psubl, FSUB)
-SSE_HELPER_Q(helper_psubq, FSUB)
-
-SSE_HELPER_B(helper_paddusb, FADDUB)
-SSE_HELPER_B(helper_paddsb, FADDSB)
-SSE_HELPER_B(helper_psubusb, FSUBUB)
-SSE_HELPER_B(helper_psubsb, FSUBSB)
-
-SSE_HELPER_W(helper_paddusw, FADDUW)
-SSE_HELPER_W(helper_paddsw, FADDSW)
-SSE_HELPER_W(helper_psubusw, FSUBUW)
-SSE_HELPER_W(helper_psubsw, FSUBSW)
-
-SSE_HELPER_B(helper_pminub, FMINUB)
-SSE_HELPER_B(helper_pmaxub, FMAXUB)
-
-SSE_HELPER_W(helper_pminsw, FMINSW)
-SSE_HELPER_W(helper_pmaxsw, FMAXSW)
-
-SSE_HELPER_Q(helper_pand, FAND)
-SSE_HELPER_Q(helper_pandn, FANDN)
-SSE_HELPER_Q(helper_por, FOR)
-SSE_HELPER_Q(helper_pxor, FXOR)
-
-SSE_HELPER_B(helper_pcmpgtb, FCMPGTB)
-SSE_HELPER_W(helper_pcmpgtw, FCMPGTW)
-SSE_HELPER_L(helper_pcmpgtl, FCMPGTL)
-
-SSE_HELPER_B(helper_pcmpeqb, FCMPEQ)
-SSE_HELPER_W(helper_pcmpeqw, FCMPEQ)
-SSE_HELPER_L(helper_pcmpeql, FCMPEQ)
-
-SSE_HELPER_W(helper_pmullw, FMULLW)
 SSE_HELPER_W(helper_pmulhuw, FMULHUW)
 SSE_HELPER_W(helper_pmulhw, FMULHW)
 
@@ -432,29 +381,6 @@ void glue(helper_maskmov, SUFFIX)(CPUX86State *env, Reg 
*d, Reg *s,
 }
 #endif
 
-void glue(helper_movl_mm_T0, SUFFIX)(Reg *d, uint32_t val)
-{
-int i;
-
-d->L(0) = val;
-d->L(1) = 0;
-for (i = 1; i < (1 << SHIFT); i++) {
-d->Q(i) = 0;
-}
-}
-
-#ifdef TARGET_X86_64
-void glue(helper_movq_mm_T0, SUFFIX)(Reg *d, uint64_t val)
-{
-int i;
-
-d->Q(0) = val;
-for (i = 1; i < (1 << SHIFT); i++) {
-d->Q(i) = 0;
-}
-}
-#endif
-
 #define SHUFFLE4(F, a, b, offset) do {  \
 r0 = a->F((order & 3) + offset);\
 r1 = a->F(((order >> 2) & 3) + offset); \
@@ -1216,27 +1142,6 @@ uint32_t glue(helper_movmskpd, SUFFIX)(CPUX86State *env, 
Reg *s)
 
 #endif
 
-uint32_t glue(helper_pmovmskb, SUFFIX)(CPUX86State *env, Reg *s)
-{
-uint32_t val;
-int i;
-
-val = 0;
-for (i = 0; i < (1 << SHIFT); i++) {
-uint8_t byte = 0;
-byte |= (s->B(8 * i + 0) >> 7);
-byte |= (s->B(8 * i + 1) >> 6) & 0x02;
-byte |= (s->B(8 * i + 2) >> 5) & 0x04;
-byte |= (s->B(8 * i + 3) >> 4) & 0x08;
-byte |= (s->B(8 * i + 4) >> 3) & 0x10;
-byte |= (s->B(8 * i + 5) >> 2) & 0x20;
-byte |= (s->B(8 * i + 6) >> 1) & 0x40;
-byte |= (s->B(8 * i + 7)) & 0x80;
-val |= byte << (8 * i);
-}
-return val;
-}
-
 #define PACK_HELPER_B(name, F) \
 void glue(helper_pack ## name, SUFFIX)(CPUX86State *env,  \
 Reg *d, Reg *v, Reg *s)   \
@@ -1587,13 +1492,6 @@ void glue(helper_pmaddubsw, SUFFIX)(CPUX86State *env, 
Reg *d, Reg *v, Reg *s)
 }
 }
 
-#define FABSB(x) (x > INT8_MAX  ? -(int8_t)x : x)
-#define FABSW(x) (x > INT16_MAX ? -(int16_t)x : x)
-#define FABSL(x) (x > INT32_MAX ? -(int32_t)x : x)
-SSE_HELPER_1(helper_pabsb, B, 8 << SHIFT, FABSB)
-SSE_HELPER_1(helper_pabsw, W, 4 << SHIFT, FABSW)
-SSE_HELPER_1(helper_pabsd, L, 2 << SHIFT, FABSL)
-
 #define FMULHRSW(d, s) (((int16_t) d * (int16_t)s + 0x4000) >> 15)
 SSE_HELPER_W(helper_pmulhrsw, FMULHRSW)
 
@@ -1723,9 +1621,6 @@ void glue(helper_pmuldq, SUFFIX)(CPUX86State *env, Reg 
*d, Reg *v, Reg *s)
 }
 }
 
-#define FCMPEQQ(d, s) (d == s ? -1 : 0)
-SSE_HELPER_Q(helper_pcmpeqq, FCMPEQQ)
-
 void 

[PULL 06/17] target/ppc: Merge fsqrt and fsqrts helpers

2022-09-20 Thread Daniel Henrique Barboza
From: Víctor Colombo 

These two helpers are almost identical, differing only in the softfloat
operation they call. Merge them into one using a macro.
Also, take this opportunity to capitalize the helper names, as the
instructions were moved to decodetree in a previous patch.

Signed-off-by: Víctor Colombo 
Reviewed-by: Richard Henderson 
Message-Id: <20220905123746.54659-4-victor.colo...@eldorado.org.br>
Signed-off-by: Daniel Henrique Barboza 
---
 target/ppc/fpu_helper.c| 35 +++---
 target/ppc/helper.h|  4 ++--
 target/ppc/translate/fp-impl.c.inc |  4 ++--
 3 files changed, 17 insertions(+), 26 deletions(-)

diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c
index 0f045b70f8..32995179b5 100644
--- a/target/ppc/fpu_helper.c
+++ b/target/ppc/fpu_helper.c
@@ -830,30 +830,21 @@ static void float_invalid_op_sqrt(CPUPPCState *env, int 
flags,
 }
 }
 
-/* fsqrt - fsqrt. */
-float64 helper_fsqrt(CPUPPCState *env, float64 arg)
-{
-float64 ret = float64_sqrt(arg, >fp_status);
-int flags = get_float_exception_flags(>fp_status);
-
-if (unlikely(flags & float_flag_invalid)) {
-float_invalid_op_sqrt(env, flags, 1, GETPC());
-}
-
-return ret;
+#define FPU_FSQRT(name, op)   \
+float64 helper_##name(CPUPPCState *env, float64 arg)  \
+{ \
+float64 ret = op(arg, >fp_status);   \
+int flags = get_float_exception_flags(>fp_status);   \
+  \
+if (unlikely(flags & float_flag_invalid)) {   \
+float_invalid_op_sqrt(env, flags, 1, GETPC());\
+} \
+  \
+return ret;   \
 }
 
-/* fsqrts - fsqrts. */
-float64 helper_fsqrts(CPUPPCState *env, float64 arg)
-{
-float64 ret = float64r32_sqrt(arg, >fp_status);
-int flags = get_float_exception_flags(>fp_status);
-
-if (unlikely(flags & float_flag_invalid)) {
-float_invalid_op_sqrt(env, flags, 1, GETPC());
-}
-return ret;
-}
+FPU_FSQRT(FSQRT, float64_sqrt)
+FPU_FSQRT(FSQRTS, float64r32_sqrt)
 
 /* fre - fre. */
 float64 helper_fre(CPUPPCState *env, float64 arg)
diff --git a/target/ppc/helper.h b/target/ppc/helper.h
index 122b2e9359..57eee07256 100644
--- a/target/ppc/helper.h
+++ b/target/ppc/helper.h
@@ -120,8 +120,8 @@ DEF_HELPER_4(fmadds, i64, env, i64, i64, i64)
 DEF_HELPER_4(fmsubs, i64, env, i64, i64, i64)
 DEF_HELPER_4(fnmadds, i64, env, i64, i64, i64)
 DEF_HELPER_4(fnmsubs, i64, env, i64, i64, i64)
-DEF_HELPER_2(fsqrt, f64, env, f64)
-DEF_HELPER_2(fsqrts, f64, env, f64)
+DEF_HELPER_2(FSQRT, f64, env, f64)
+DEF_HELPER_2(FSQRTS, f64, env, f64)
 DEF_HELPER_2(fre, i64, env, i64)
 DEF_HELPER_2(fres, i64, env, i64)
 DEF_HELPER_2(frsqrte, i64, env, i64)
diff --git a/target/ppc/translate/fp-impl.c.inc 
b/target/ppc/translate/fp-impl.c.inc
index 7a90c0e350..8d5cf0f982 100644
--- a/target/ppc/translate/fp-impl.c.inc
+++ b/target/ppc/translate/fp-impl.c.inc
@@ -280,8 +280,8 @@ static bool do_helper_fsqrt(DisasContext *ctx, arg_A_tb *a,
 return true;
 }
 
-TRANS(FSQRT, do_helper_fsqrt, gen_helper_fsqrt);
-TRANS(FSQRTS, do_helper_fsqrt, gen_helper_fsqrts);
+TRANS(FSQRT, do_helper_fsqrt, gen_helper_FSQRT);
+TRANS(FSQRTS, do_helper_fsqrt, gen_helper_FSQRTS);
 
 /*** Floating-Point multiply-and-add   ***/
 /* fmadd - fmadds */
-- 
2.37.3




[PULL 00/17] ppc queue

2022-09-20 Thread Daniel Henrique Barboza
The following changes since commit d29201ff34a135cdfc197f4413c1c5047e4f58bb:

  Merge tag 'pull-hmp-20220915a' of https://gitlab.com/dagrh/qemu into staging 
(2022-09-17 10:31:11 -0400)

are available in the Git repository at:

  https://gitlab.com/danielhb/qemu.git tags/pull-ppc-20220920

for you to fetch changes up to 6b5cf264ee76d24b357a60b69b0635a533c1f647:

  hw/ppc/spapr: Fix code style problems reported by checkpatch (2022-09-20 
12:31:53 -0300)


ppc patch queue for 2022-09-20:

This queue contains an implementation of the PowerISA 3.1B hash insns, ppc
TCG insn cleanups and fixes, and miscellaneous fixes in the spapr and
pnv_phb models.


Bernhard Beschow (1):
  hw/ppc/spapr: Fix code style problems reported by checkpatch

Víctor Colombo (14):
  target/ppc: Add HASHKEYR and HASHPKEYR SPRs
  target/ppc: Implement hashst and hashchk
  target/ppc: Implement hashstp and hashchkp
  target/ppc: Move fsqrt to decodetree
  target/ppc: Move fsqrts to decodetree
  target/ppc: Merge fsqrt and fsqrts helpers
  target/ppc: Remove extra space from s128 field in ppc_vsr_t
  target/ppc: Remove unused xer_* macros
  target/ppc: Zero second doubleword in DFP instructions
  target/ppc: Set result to QNaN for DENBCD when VXCVI occurs
  target/ppc: Zero second doubleword for VSX madd instructions
  target/ppc: Set OV32 when OV is set
  target/ppc: Zero second doubleword of VSR registers for FPR insns
  target/ppc: Clear fpstatus flags on helpers missing it

Xuzhou Cheng (2):
  hw/ppc: spapr: Use qemu_vfree() to free spapr->htab
  hw/pci-host: pnv_phb{3, 4}: Fix heap out-of-bound access failure

 hw/pci-host/pnv_phb3.c |  1 +
 hw/pci-host/pnv_phb4.c |  1 +
 hw/ppc/spapr.c |  2 +-
 include/hw/ppc/spapr.h |  5 +-
 target/ppc/cpu.h   |  8 ++-
 target/ppc/cpu_init.c  | 28 ++
 target/ppc/dfp_helper.c| 31 +--
 target/ppc/excp_helper.c   | 83 ++
 target/ppc/fpu_helper.c| 74 ++
 target/ppc/helper.h|  8 ++-
 target/ppc/insn32.decode   | 18 +++
 target/ppc/int_helper.c|  4 +-
 target/ppc/translate.c | 13 +
 target/ppc/translate/fixedpoint-impl.c.inc | 34 
 target/ppc/translate/fp-impl.c.inc | 50 ++
 target/ppc/translate/fp-ops.c.inc  |  2 -
 16 files changed, 278 insertions(+), 84 deletions(-)



[PULL 00/12] Publish1 patches

2022-09-20 Thread Helge Deller
The following changes since commit 621da7789083b80d6f1ff1c0fb499334007b4f51:

  Update version for v7.1.0 release (2022-08-30 09:40:11 -0700)

are available in the Git repository at:

  https://github.com/hdeller/qemu-hppa.git tags/publish1-pull-request

for you to fetch changes up to 7f8674a61a908592bb4e8e698f5bef84d0eeb8cc:

  linux-user: Add parameters of getrandom() syscall for strace (2022-09-18 
21:35:27 +0200)


linux-user: Add more syscalls, enhance tracing & logging enhancements

Here is a bunch of patches for linux-user.

Most of them add missing syscalls and enhance the tracing/logging.
Some of the patches are target-hppa specific.
I've tested those on production hppa Debian buildd servers (running qemu-user).

Thanks!
Helge

Changes to v2:
- Fix build of close_range() and pidfd_*() patches on older Linux
  distributions (noticed by Stefan Hajnoczi)

Changes to v1:
- Dropped the faccessat2() syscall patch in favour of Richard's patch
- Various changes to the "missing signals in strace output" patch based on
  Richard's feedback, e.g. static arrays, fixed usage of _NSIG, fix build when
  TARGET_SIGIOT does not exist
- Use FUTEX_CMD_MASK in "Show timespec on strace for futex" patch
  unconditionally and turn into a switch statement - as suggested by Richard



Helge Deller (12):
  linux-user: Add missing signals in strace output
  linux-user: Add missing clock_gettime64() syscall strace
  linux-user: Add pidfd_open(), pidfd_send_signal() and pidfd_getfd()
syscalls
  linux-user: Log failing executable in EXCP_DUMP()
  linux-user/hppa: Use EXCP_DUMP() to show enhanced debug info
  linux-user/hppa: Dump IIR on register dump
  linux-user: Fix strace of chmod() if mode == 0
  linux-user/hppa: Set TASK_UNMAPPED_BASE to 0xfa00 for hppa arch
  linux-user: Add strace for clock_nanosleep()
  linux-user: Show timespec on strace for futex()
  linux-user: Add close_range() syscall
  linux-user: Add parameters of getrandom() syscall for strace

 linux-user/cpu_loop-common.h |   2 +
 linux-user/hppa/cpu_loop.c   |   6 +-
 linux-user/mmap.c|   4 +
 linux-user/signal-common.h   |  46 
 linux-user/signal.c  |  37 +
 linux-user/strace.c  | 142 ++-
 linux-user/strace.list   |  21 +-
 linux-user/syscall.c |  50 
 target/hppa/helper.c |   6 +-
 9 files changed, 255 insertions(+), 59 deletions(-)

--
2.37.3




[PATCH v2 35/37] tests/tcg: extend SSE tests to AVX

2022-09-20 Thread Paolo Bonzini
Extracted from a patch by Paul Brook .

Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 tests/tcg/i386/Makefile.target |   2 +-
 tests/tcg/i386/test-avx.c  | 201 ++---
 tests/tcg/i386/test-avx.py |   5 +-
 3 files changed, 113 insertions(+), 95 deletions(-)

diff --git a/tests/tcg/i386/Makefile.target b/tests/tcg/i386/Makefile.target
index ae71e7f748..4139973255 100644
--- a/tests/tcg/i386/Makefile.target
+++ b/tests/tcg/i386/Makefile.target
@@ -98,5 +98,5 @@ test-3dnow: test-3dnow.h
 test-mmx: CFLAGS += -masm=intel -O -I.
 test-mmx: test-mmx.h
 
-test-avx: CFLAGS += -masm=intel -O -I.
+test-avx: CFLAGS += -mavx -masm=intel -O -I.
 test-avx: test-avx.h
diff --git a/tests/tcg/i386/test-avx.c b/tests/tcg/i386/test-avx.c
index 23c170dd79..953e2906fe 100644
--- a/tests/tcg/i386/test-avx.c
+++ b/tests/tcg/i386/test-avx.c
@@ -6,18 +6,18 @@
 typedef void (*testfn)(void);
 
 typedef struct {
-uint64_t q0, q1;
-} __attribute__((aligned(16))) v2di;
+uint64_t q0, q1, q2, q3;
+} __attribute__((aligned(32))) v4di;
 
 typedef struct {
 uint64_t mm[8];
-v2di xmm[16];
+v4di ymm[16];
 uint64_t r[16];
 uint64_t flags;
 uint32_t ff;
 uint64_t pad;
-v2di mem[4];
-v2di mem0[4];
+v4di mem[4];
+v4di mem0[4];
 } reg_state;
 
 typedef struct {
@@ -31,20 +31,20 @@ reg_state initI;
 reg_state initF32;
 reg_state initF64;
 
-static void dump_xmm(const char *name, int n, const v2di *r, int ff)
+static void dump_ymm(const char *name, int n, const v4di *r, int ff)
 {
-printf("%s%d = %016lx %016lx\n",
-   name, n, r->q1, r->q0);
+printf("%s%d = %016lx %016lx %016lx %016lx\n",
+   name, n, r->q3, r->q2, r->q1, r->q0);
 if (ff == 64) {
-double v[2];
+double v[4];
 memcpy(v, r, sizeof(v));
-printf("%16g %16g\n",
-v[1], v[0]);
-} else if (ff == 32) {
-float v[4];
-memcpy(v, r, sizeof(v));
-printf(" %8g %8g %8g %8g\n",
+printf("%16g %16g %16g %16g\n",
 v[3], v[2], v[1], v[0]);
+} else if (ff == 32) {
+float v[8];
+memcpy(v, r, sizeof(v));
+printf(" %8g %8g %8g %8g %8g %8g %8g %8g\n",
+v[7], v[6], v[5], v[4], v[3], v[2], v[1], v[0]);
 }
 }
 
@@ -53,10 +53,10 @@ static void dump_regs(reg_state *s)
 int i;
 
 for (i = 0; i < 16; i++) {
-dump_xmm("xmm", i, >xmm[i], 0);
+dump_ymm("ymm", i, >ymm[i], 0);
 }
 for (i = 0; i < 4; i++) {
-dump_xmm("mem", i, >mem0[i], 0);
+dump_ymm("mem", i, >mem0[i], 0);
 }
 }
 
@@ -74,13 +74,13 @@ static void compare_state(const reg_state *a, const 
reg_state *b)
 }
 }
 for (i = 0; i < 16; i++) {
-if (memcmp(>xmm[i], >xmm[i], 16)) {
-dump_xmm("xmm", i, >xmm[i], a->ff);
+if (memcmp(>ymm[i], >ymm[i], 32)) {
+dump_ymm("ymm", i, >ymm[i], a->ff);
 }
 }
 for (i = 0; i < 4; i++) {
-if (memcmp(>mem0[i], >mem[i], 16)) {
-dump_xmm("mem", i, >mem[i], a->ff);
+if (memcmp(>mem0[i], >mem[i], 32)) {
+dump_ymm("mem", i, >mem[i], a->ff);
 }
 }
 if (a->flags != b->flags) {
@@ -89,9 +89,9 @@ static void compare_state(const reg_state *a, const reg_state 
*b)
 }
 
 #define LOADMM(r, o) "movq " #r ", " #o "[%0]\n\t"
-#define LOADXMM(r, o) "movdqa " #r ", " #o "[%0]\n\t"
+#define LOADYMM(r, o) "vmovdqa " #r ", " #o "[%0]\n\t"
 #define STOREMM(r, o) "movq " #o "[%1], " #r "\n\t"
-#define STOREXMM(r, o) "movdqa " #o "[%1], " #r "\n\t"
+#define STOREYMM(r, o) "vmovdqa " #o "[%1], " #r "\n\t"
 #define MMREG(F) \
 F(mm0, 0x00) \
 F(mm1, 0x08) \
@@ -101,39 +101,39 @@ static void compare_state(const reg_state *a, const 
reg_state *b)
 F(mm5, 0x28) \
 F(mm6, 0x30) \
 F(mm7, 0x38)
-#define XMMREG(F) \
-F(xmm0, 0x040) \
-F(xmm1, 0x050) \
-F(xmm2, 0x060) \
-F(xmm3, 0x070) \
-F(xmm4, 0x080) \
-F(xmm5, 0x090) \
-F(xmm6, 0x0a0) \
-F(xmm7, 0x0b0) \
-F(xmm8, 0x0c0) \
-F(xmm9, 0x0d0) \
-F(xmm10, 0x0e0) \
-F(xmm11, 0x0f0) \
-F(xmm12, 0x100) \
-F(xmm13, 0x110) \
-F(xmm14, 0x120) \
-F(xmm15, 0x130)
+#define YMMREG(F) \
+F(ymm0, 0x040) \
+F(ymm1, 0x060) \
+F(ymm2, 0x080) \
+F(ymm3, 0x0a0) \
+F(ymm4, 0x0c0) \
+F(ymm5, 0x0e0) \
+F(ymm6, 0x100) \
+F(ymm7, 0x120) \
+F(ymm8, 0x140) \
+F(ymm9, 0x160) \
+F(ymm10, 0x180) \
+F(ymm11, 0x1a0) \
+F(ymm12, 0x1c0) \
+F(ymm13, 0x1e0) \
+F(ymm14, 0x200) \
+F(ymm15, 0x220)
 #define LOADREG(r, o) "mov " #r ", " #o "[rax]\n\t"
 #define STOREREG(r, o) "mov " #o "[rax], " #r "\n\t"
 #define REG(F) \
-F(rbx, 0x148) \
-F(rcx, 0x150) \
-F(rdx, 0x158) \
-F(rsi, 0x160) \
-F(rdi, 0x168) \
-F(r8, 0x180) \
-F(r9, 0x188) \
-F(r10, 0x190) \
-F(r11, 0x198) \
-F(r12, 0x1a0) \
-F(r13, 

[PULL 14/30] tests/docker: flatten debian-powerpc-test-cross

2022-09-20 Thread Alex Bennée
Flatten into a single dockerfile. We really don't need the rest of the
stuff from the QEMU base image just to compile test images.

Signed-off-by: Alex Bennée 
Reviewed-by: Thomas Huth 
Message-Id: <20220914155950.804707-15-alex.ben...@linaro.org>

diff --git a/.gitlab-ci.d/container-cross.yml b/.gitlab-ci.d/container-cross.yml
index db0ea15d0d..67bbf19a27 100644
--- a/.gitlab-ci.d/container-cross.yml
+++ b/.gitlab-ci.d/container-cross.yml
@@ -102,7 +102,6 @@ mipsel-debian-cross-container:
 powerpc-test-cross-container:
   extends: .container_job_template
   stage: containers
-  needs: ['amd64-debian11-container']
   variables:
 NAME: debian-powerpc-test-cross
 
diff --git a/tests/docker/Makefile.include b/tests/docker/Makefile.include
index 8828b6b8fa..e034eca3af 100644
--- a/tests/docker/Makefile.include
+++ b/tests/docker/Makefile.include
@@ -137,7 +137,6 @@ docker-image-debian-all-test-cross: docker-image-debian10
 docker-image-debian-loongarch-cross: docker-image-debian11
 docker-image-debian-microblaze-cross: docker-image-debian10
 docker-image-debian-nios2-cross: docker-image-debian10
-docker-image-debian-powerpc-test-cross: docker-image-debian11
 docker-image-debian-riscv64-test-cross: docker-image-debian11
 
 # These images may be good enough for building tests but not for test builds
diff --git a/tests/docker/dockerfiles/debian-powerpc-test-cross.docker 
b/tests/docker/dockerfiles/debian-powerpc-test-cross.docker
index 36b336f709..d6b2909cc4 100644
--- a/tests/docker/dockerfiles/debian-powerpc-test-cross.docker
+++ b/tests/docker/dockerfiles/debian-powerpc-test-cross.docker
@@ -1,13 +1,15 @@
 #
 # Docker powerpc/ppc64/ppc64le cross-compiler target
 #
-# This docker target builds on the debian Bullseye base image.
+# This docker target builds on the Debian Bullseye base image.
 #
-FROM qemu/debian11
+FROM docker.io/library/debian:11-slim
 
-RUN apt update && \
-DEBIAN_FRONTEND=noninteractive eatmydata \
-apt install -y --no-install-recommends \
+RUN export DEBIAN_FRONTEND=noninteractive && \
+apt-get update && \
+apt-get install -y eatmydata && \
+eatmydata apt-get dist-upgrade -y && \
+eatmydata apt-get install --no-install-recommends -y \
 gcc-powerpc-linux-gnu \
 libc6-dev-powerpc-cross \
 gcc-10-powerpc64-linux-gnu \
-- 
2.34.1




[PATCH v2 29/37] target/i386: reimplement 0x0f 0xc2, 0xc4-0xc6, add AVX

2022-09-20 Thread Paolo Bonzini
Nothing special going on here, for once.

Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/decode-new.c.inc |  5 +++
 target/i386/tcg/emit.c.inc   | 75 
 target/i386/tcg/translate.c  |  1 +
 3 files changed, 81 insertions(+)

diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index 798b423163..461921a98d 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -648,6 +648,11 @@ static const X86OpEntry opcodes_0F[256] = {
 [0x7e] = X86_OP_GROUP0(0F7E),
 [0x7f] = X86_OP_GROUP0(0F7F),
 
+[0xc2] = X86_OP_ENTRY4(VCMP,   V,x, H,x, W,x,   vex2_rep3 
p_00_66_f3_f2),
+[0xc4] = X86_OP_ENTRY4(PINSRW, V,dq,H,dq,E,w,   vex5 mmx p_00_66),
+[0xc5] = X86_OP_ENTRY3(PEXTRW, G,d, U,dq,I,b,   vex5 mmx p_00_66),
+[0xc6] = X86_OP_ENTRY4(VSHUF,  V,x, H,x, W,x,   vex4 p_00_66),
+
 [0xd0] = X86_OP_ENTRY3(VADDSUB,   V,x, H,x, W,x,vex2 cpuid(SSE3) 
p_66_f2),
 [0xd1] = X86_OP_ENTRY3(PSRLW_r,   V,x, H,x, W,x,vex4 mmx avx2_256 
p_00_66),
 [0xd2] = X86_OP_ENTRY3(PSRLD_r,   V,x, H,x, W,x,vex4 mmx avx2_256 
p_00_66),
diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc
index dd36a3544e..71b8fcbe24 100644
--- a/target/i386/tcg/emit.c.inc
+++ b/target/i386/tcg/emit.c.inc
@@ -1342,6 +1342,11 @@ static void gen_PINSRB(DisasContext *s, CPUX86State 
*env, X86DecodedInsn *decode
 gen_pinsr(s, env, decode, MO_8);
 }
 
+static void gen_PINSRW(DisasContext *s, CPUX86State *env, X86DecodedInsn 
*decode)
+{
+gen_pinsr(s, env, decode, MO_16);
+}
+
 static void gen_PINSR(DisasContext *s, CPUX86State *env, X86DecodedInsn 
*decode)
 {
 gen_pinsr(s, env, decode, decode->op[2].ot);
@@ -1648,6 +1653,66 @@ static void gen_VAESIMC(DisasContext *s, CPUX86State 
*env, X86DecodedInsn *decod
 gen_helper_aesimc_xmm(cpu_env, OP_PTR0, OP_PTR2);
 }
 
+/*
+ * 00 = v*ps Vps, Hps, Wpd
+ * 66 = v*pd Vpd, Hpd, Wps
+ * f3 = v*ss Vss, Hss, Wps
+ * f2 = v*sd Vsd, Hsd, Wps
+ */
+#define SSE_CMP(x) { \
+gen_helper_ ## x ## ps ## _xmm, gen_helper_ ## x ## pd ## _xmm, \
+gen_helper_ ## x ## ss, gen_helper_ ## x ## sd, \
+gen_helper_ ## x ## ps ## _ymm, gen_helper_ ## x ## pd ## _ymm}
+static const SSEFunc_0_eppp gen_helper_cmp_funcs[32][6] = {
+SSE_CMP(cmpeq),
+SSE_CMP(cmplt),
+SSE_CMP(cmple),
+SSE_CMP(cmpunord),
+SSE_CMP(cmpneq),
+SSE_CMP(cmpnlt),
+SSE_CMP(cmpnle),
+SSE_CMP(cmpord),
+
+SSE_CMP(cmpequ),
+SSE_CMP(cmpnge),
+SSE_CMP(cmpngt),
+SSE_CMP(cmpfalse),
+SSE_CMP(cmpnequ),
+SSE_CMP(cmpge),
+SSE_CMP(cmpgt),
+SSE_CMP(cmptrue),
+
+SSE_CMP(cmpeqs),
+SSE_CMP(cmpltq),
+SSE_CMP(cmpleq),
+SSE_CMP(cmpunords),
+SSE_CMP(cmpneqq),
+SSE_CMP(cmpnltq),
+SSE_CMP(cmpnleq),
+SSE_CMP(cmpords),
+
+SSE_CMP(cmpequs),
+SSE_CMP(cmpngeq),
+SSE_CMP(cmpngtq),
+SSE_CMP(cmpfalses),
+SSE_CMP(cmpnequs),
+SSE_CMP(cmpgeq),
+SSE_CMP(cmpgtq),
+SSE_CMP(cmptrues),
+};
+#undef SSE_CMP
+
+static void gen_VCMP(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode)
+{
+int index = decode->immediate & (s->prefix & PREFIX_VEX ? 31 : 7);
+int b =
+s->prefix & PREFIX_REPZ  ? 2 /* ss */ :
+s->prefix & PREFIX_REPNZ ? 3 /* sd */ :
+!!(s->prefix & PREFIX_DATA) /* pd */ + (s->vex_l << 2);
+
+gen_helper_cmp_funcs[index][b](cpu_env, OP_PTR0, OP_PTR1, OP_PTR2);
+}
+
 static void gen_VCVTfp2fp(DisasContext *s, CPUX86State *env, X86DecodedInsn 
*decode)
 {
 gen_unary_fp_sse(s, env, decode,
@@ -1793,6 +1858,16 @@ static void gen_VROUNDSS(DisasContext *s, CPUX86State 
*env, X86DecodedInsn *deco
 gen_helper_roundss_xmm(cpu_env, OP_PTR0, OP_PTR1, OP_PTR2, imm);
 }
 
+static void gen_VSHUF(DisasContext *s, CPUX86State *env, X86DecodedInsn 
*decode)
+{
+TCGv_i32 imm = tcg_constant_i32(decode->immediate);
+SSEFunc_0_pppi ps, pd, fn;
+ps = s->vex_l ? gen_helper_shufps_ymm : gen_helper_shufps_xmm;
+pd = s->vex_l ? gen_helper_shufpd_ymm : gen_helper_shufpd_xmm;
+fn = s->prefix & PREFIX_DATA ? pd : ps;
+fn(OP_PTR0, OP_PTR1, OP_PTR2, imm);
+}
+
 static void gen_VZEROALL(DisasContext *s, CPUX86State *env, X86DecodedInsn 
*decode)
 {
 TCGv_ptr ptr = tcg_temp_new_ptr();
diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 32f937013f..eb7a4d0e4d 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -4697,6 +4697,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState 
*cpu)
 if (use_new &&
 (b == 0x138 || b == 0x13a ||
  (b >= 0x150 && b <= 0x17f) ||
+ b == 0x1c2 || (b >= 0x1c4 && b <= 0x1c6) ||
  (b >= 0x1d0 && b <= 0x1ff))) {
 disas_insn_new(s, cpu, b + 0x100);
 return s->pc;
-- 
2.37.2




[PATCH v2 31/37] target/i386: reimplement 0x0f 0x28-0x2f, add AVX

2022-09-20 Thread Paolo Bonzini
Here the code is a bit uglier due to the truncation and extension
of registers to and from 32-bit.  There is also a mistake in the
manual with respect to the size of the memory operand of CVTPS2PI
and CVTTPS2PI, reported by Ricky Zhou.

Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/decode-new.c.inc |  56 +++
 target/i386/tcg/emit.c.inc   | 120 +++
 target/i386/tcg/translate.c  |   1 +
 3 files changed, 177 insertions(+)

diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index e0cd9e..63eb66ccc4 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -672,6 +672,53 @@ static void decode_0F16(DisasContext *s, CPUX86State *env, 
X86OpEntry *entry, ui
 }
 }
 
+static void decode_0F2A(DisasContext *s, CPUX86State *env, X86OpEntry *entry, 
uint8_t *b)
+{
+static const X86OpEntry opcodes_0F2A[4] = {
+X86_OP_ENTRY3(CVTPI2Px,  V,x,  None,None, Q,q),
+X86_OP_ENTRY3(CVTPI2Px,  V,x,  None,None, Q,q),
+X86_OP_ENTRY3(VCVTSI2Sx, V,x,  H,x, E,y,vex3),
+X86_OP_ENTRY3(VCVTSI2Sx, V,x,  H,x, E,y,vex3),
+};
+*entry = *decode_by_prefix(s, opcodes_0F2A);
+}
+
+static void decode_0F2B(DisasContext *s, CPUX86State *env, X86OpEntry *entry, 
uint8_t *b)
+{
+static const X86OpEntry opcodes_0F2B[4] = {
+X86_OP_ENTRY3(MOVDQ,  M,x,  None,None, V,x, vex4), /* MOVNTPS */
+X86_OP_ENTRY3(MOVDQ,  M,x,  None,None, V,x, vex4), /* MOVNTPD */
+X86_OP_ENTRY3(VMOVSS_st,  M,ss, None,None, V,x, vex4 cpuid(SSE4A)), /* 
MOVNTSS */
+X86_OP_ENTRY3(VMOVLPx_st, M,sd, None,None, V,x, vex4 cpuid(SSE4A)), /* 
MOVNTSD */
+};
+
+*entry = *decode_by_prefix(s, opcodes_0F2B);
+}
+
+static void decode_0F2C(DisasContext *s, CPUX86State *env, X86OpEntry *entry, 
uint8_t *b)
+{
+static const X86OpEntry opcodes_0F2C[4] = {
+/* Listed as ps/pd in the manual, but CVTTPS2PI only reads 64-bit.  */
+X86_OP_ENTRY3(CVTTPx2PI,  P,q,  None,None, W,q),
+X86_OP_ENTRY3(CVTTPx2PI,  P,q,  None,None, W,dq),
+X86_OP_ENTRY3(VCVTTSx2SI, G,y,  None,None, W,ss, vex3),
+X86_OP_ENTRY3(VCVTTSx2SI, G,y,  None,None, W,sd, vex3),
+};
+*entry = *decode_by_prefix(s, opcodes_0F2C);
+}
+
+static void decode_0F2D(DisasContext *s, CPUX86State *env, X86OpEntry *entry, 
uint8_t *b)
+{
+static const X86OpEntry opcodes_0F2D[4] = {
+/* Listed as ps/pd in the manual, but CVTPS2PI only reads 64-bit.  */
+X86_OP_ENTRY3(CVTPx2PI,  P,q,  None,None, W,q),
+X86_OP_ENTRY3(CVTPx2PI,  P,q,  None,None, W,dq),
+X86_OP_ENTRY3(VCVTSx2SI, G,y,  None,None, W,ss, vex3),
+X86_OP_ENTRY3(VCVTSx2SI, G,y,  None,None, W,sd, vex3),
+};
+*entry = *decode_by_prefix(s, opcodes_0F2D);
+}
+
 static void decode_sse_unary(DisasContext *s, CPUX86State *env, X86OpEntry 
*entry, uint8_t *b)
 {
 if (!(s->prefix & (PREFIX_REPZ | PREFIX_REPNZ))) {
@@ -746,6 +793,15 @@ static const X86OpEntry opcodes_0F[256] = {
 [0x76] = X86_OP_ENTRY3(PCMPEQD,V,x, H,x, W,x,  vex4 mmx avx2_256 
p_00_66),
 [0x77] = X86_OP_GROUP0(0F77),
 
+[0x28] = X86_OP_ENTRY3(MOVDQ,  V,x,  None,None, W,x, vex1 p_00_66), /* 
MOVAPS */
+[0x29] = X86_OP_ENTRY3(MOVDQ,  W,x,  None,None, V,x, vex1 p_00_66), /* 
MOVAPS */
+[0x2A] = X86_OP_GROUP0(0F2A),
+[0x2B] = X86_OP_GROUP0(0F2B),
+[0x2C] = X86_OP_GROUP0(0F2C),
+[0x2D] = X86_OP_GROUP0(0F2D),
+[0x2E] = X86_OP_ENTRY3(VUCOMI, None,None, V,x, W,x,  vex4 p_00_66),
+[0x2F] = X86_OP_ENTRY3(VCOMI,  None,None, V,x, W,x,  vex4 p_00_66),
+
 [0x38] = X86_OP_GROUP0(0F38),
 [0x3a] = X86_OP_GROUP0(0F3A),
 
diff --git a/target/i386/tcg/emit.c.inc b/target/i386/tcg/emit.c.inc
index 381fdf0ae6..6e391e3598 100644
--- a/target/i386/tcg/emit.c.inc
+++ b/target/i386/tcg/emit.c.inc
@@ -1038,6 +1038,36 @@ static void gen_CRC32(DisasContext *s, CPUX86State *env, 
X86DecodedInsn *decode)
 gen_helper_crc32(s->T0, s->tmp2_i32, s->T1, tcg_constant_i32(8 << ot));
 }
 
+static void gen_CVTPI2Px(DisasContext *s, CPUX86State *env, X86DecodedInsn 
*decode)
+{
+gen_helper_enter_mmx(cpu_env);
+if (s->prefix & PREFIX_DATA) {
+gen_helper_cvtpi2pd(cpu_env, OP_PTR0, OP_PTR2);
+} else {
+gen_helper_cvtpi2ps(cpu_env, OP_PTR0, OP_PTR2);
+}
+}
+
+static void gen_CVTPx2PI(DisasContext *s, CPUX86State *env, X86DecodedInsn 
*decode)
+{
+gen_helper_enter_mmx(cpu_env);
+if (s->prefix & PREFIX_DATA) {
+gen_helper_cvtpd2pi(cpu_env, OP_PTR0, OP_PTR2);
+} else {
+gen_helper_cvtps2pi(cpu_env, OP_PTR0, OP_PTR2);
+}
+}
+
+static void gen_CVTTPx2PI(DisasContext *s, CPUX86State *env, X86DecodedInsn 
*decode)
+{
+gen_helper_enter_mmx(cpu_env);
+if (s->prefix & PREFIX_DATA) {
+gen_helper_cvttpd2pi(cpu_env, OP_PTR0, OP_PTR2);
+} else {
+gen_helper_cvttps2pi(cpu_env, OP_PTR0, OP_PTR2);
+

[PULL 05/12] linux-user/hppa: Use EXCP_DUMP() to show enhanced debug info

2022-09-20 Thread Helge Deller
Enhance the hppa linux-user cpu_loop() to show more debugging info
on hard errors.

Signed-off-by: Helge Deller 
---
 linux-user/hppa/cpu_loop.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/linux-user/hppa/cpu_loop.c b/linux-user/hppa/cpu_loop.c
index 64263c3dc4..1ef3b46191 100644
--- a/linux-user/hppa/cpu_loop.c
+++ b/linux-user/hppa/cpu_loop.c
@@ -147,12 +147,15 @@ void cpu_loop(CPUHPPAState *env)
 force_sig_fault(TARGET_SIGSEGV, TARGET_SEGV_MAPERR, env->iaoq_f);
 break;
 case EXCP_ILL:
+EXCP_DUMP(env, "qemu: got CPU exception 0x%x - aborting\n", 
trapnr);
 force_sig_fault(TARGET_SIGILL, TARGET_ILL_ILLOPN, env->iaoq_f);
 break;
 case EXCP_PRIV_OPR:
+EXCP_DUMP(env, "qemu: got CPU exception 0x%x - aborting\n", 
trapnr);
 force_sig_fault(TARGET_SIGILL, TARGET_ILL_PRVOPC, env->iaoq_f);
 break;
 case EXCP_PRIV_REG:
+EXCP_DUMP(env, "qemu: got CPU exception 0x%x - aborting\n", 
trapnr);
 force_sig_fault(TARGET_SIGILL, TARGET_ILL_PRVREG, env->iaoq_f);
 break;
 case EXCP_OVERFLOW:
@@ -171,7 +174,8 @@ void cpu_loop(CPUHPPAState *env)
 /* just indicate that signals should be handled asap */
 break;
 default:
-g_assert_not_reached();
+EXCP_DUMP(env, "qemu: unhandled CPU exception 0x%x - aborting\n", 
trapnr);
+abort();
 }
 process_pending_signals(env);
 }
--
2.37.3




Re: [PATCH] i386: Fix KVM_CAP_ADJUST_CLOCK capability check

2022-09-20 Thread Marcelo Tosatti
On Tue, Sep 20, 2022 at 04:40:24PM +0200, Vitaly Kuznetsov wrote:
> KVM commit c68dc1b577ea ("KVM: x86: Report host tsc and realtime values in
> KVM_GET_CLOCK") broke migration of certain workloads, e.g. Win11 + WSL2
> guest reboots immediately after migration. KVM, however, is not to
> blame this time. When KVM_CAP_ADJUST_CLOCK capability is checked, the
> result is all supported flags (which the above mentioned KVM commit
> enhanced) but kvm_has_adjust_clock_stable() wants it to be
> KVM_CLOCK_TSC_STABLE precisely. The result is that 'clock_is_reliable'
> is not set in vmstate and the saved clock reading is discarded in
> kvmclock_vm_state_change().
> 
> Signed-off-by: Vitaly Kuznetsov 
> ---
>  target/i386/kvm/kvm.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
> index a1fd1f53791d..c33192a87dcb 100644
> --- a/target/i386/kvm/kvm.c
> +++ b/target/i386/kvm/kvm.c
> @@ -157,7 +157,7 @@ bool kvm_has_adjust_clock_stable(void)
>  {
>  int ret = kvm_check_extension(kvm_state, KVM_CAP_ADJUST_CLOCK);
>  
> -return (ret == KVM_CLOCK_TSC_STABLE);
> +return ret & KVM_CLOCK_TSC_STABLE;
>  }
>  
>  bool kvm_has_adjust_clock(void)
> -- 
> 2.37.3
> 
> 

ACK.




[PATCH v2 34/37] target/i386: Enable AVX cpuid bits when using TCG

2022-09-20 Thread Paolo Bonzini
From: Paul Brook 

Include AVX, AVX2 and VAES in the guest cpuid features supported by TCG.

Signed-off-by: Paul Brook 
Message-Id: <20220424220204.2493824-40-p...@nowt.org>
Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 target/i386/cpu.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 1db1278a59..ec0817a61d 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -625,12 +625,12 @@ void x86_cpu_vendor_words2str(char *dst, uint32_t vendor1,
   CPUID_EXT_SSE41 | CPUID_EXT_SSE42 | CPUID_EXT_POPCNT | \
   CPUID_EXT_XSAVE | /* CPUID_EXT_OSXSAVE is dynamic */   \
   CPUID_EXT_MOVBE | CPUID_EXT_AES | CPUID_EXT_HYPERVISOR | \
-  CPUID_EXT_RDRAND)
+  CPUID_EXT_RDRAND | CPUID_EXT_AVX)
   /* missing:
   CPUID_EXT_DTES64, CPUID_EXT_DSCPL, CPUID_EXT_VMX, CPUID_EXT_SMX,
   CPUID_EXT_EST, CPUID_EXT_TM2, CPUID_EXT_CID, CPUID_EXT_FMA,
   CPUID_EXT_XTPR, CPUID_EXT_PDCM, CPUID_EXT_PCID, CPUID_EXT_DCA,
-  CPUID_EXT_X2APIC, CPUID_EXT_TSC_DEADLINE_TIMER, CPUID_EXT_AVX,
+  CPUID_EXT_X2APIC, CPUID_EXT_TSC_DEADLINE_TIMER,
   CPUID_EXT_F16C */
 
 #ifdef TARGET_X86_64
@@ -653,14 +653,14 @@ void x86_cpu_vendor_words2str(char *dst, uint32_t vendor1,
   CPUID_7_0_EBX_BMI1 | CPUID_7_0_EBX_BMI2 | CPUID_7_0_EBX_ADX | \
   CPUID_7_0_EBX_PCOMMIT | CPUID_7_0_EBX_CLFLUSHOPT |\
   CPUID_7_0_EBX_CLWB | CPUID_7_0_EBX_MPX | CPUID_7_0_EBX_FSGSBASE | \
-  CPUID_7_0_EBX_ERMS)
+  CPUID_7_0_EBX_ERMS | CPUID_7_0_EBX_AVX2)
   /* missing:
-  CPUID_7_0_EBX_HLE, CPUID_7_0_EBX_AVX2,
+  CPUID_7_0_EBX_HLE
   CPUID_7_0_EBX_INVPCID, CPUID_7_0_EBX_RTM,
   CPUID_7_0_EBX_RDSEED */
 #define TCG_7_0_ECX_FEATURES (CPUID_7_0_ECX_UMIP | CPUID_7_0_ECX_PKU | \
   /* CPUID_7_0_ECX_OSPKE is dynamic */ \
-  CPUID_7_0_ECX_LA57 | CPUID_7_0_ECX_PKS)
+  CPUID_7_0_ECX_LA57 | CPUID_7_0_ECX_PKS | CPUID_7_0_ECX_VAES)
 #define TCG_7_0_EDX_FEATURES 0
 #define TCG_7_1_EAX_FEATURES 0
 #define TCG_APM_FEATURES 0
-- 
2.37.2




[PATCH v2 28/37] target/i386: reimplement 0x0f 0x38, add AVX

2022-09-20 Thread Paolo Bonzini
There are several special cases here:

1) extending moves have different widths for the helpers vs. for the
memory loads, and the width for memory loads depends on VEX.L too.
This is represented by X86_SPECIAL_AVXExtMov.

2) some instructions, such as variable-width shifts, select the vector element
size via REX.W.

3) VSIB instructions (VGATHERxPy, VPGATHERxy) are also part of this group,
and they have (among other things) two output operands.

4) the macros for 4-operand blends (which are under 0x0f 0x3a) have to be
extended to support 2-operand blends.  The 2-operand variant actually
came a few years earlier, but it is clearer to implement them in the
opposite order.

X86_TYPE_WM, introduced earlier for unaligned loads, is reused for helpers
that accept a Reg* but have a M argument.


These three-byte opcodes also include AVX new instructions, for which
the helpers were originally implemented by Paul Brook .

Signed-off-by: Paolo Bonzini 
---
 target/i386/ops_sse.h| 188 ++-
 target/i386/ops_sse_header.h |  19 +++
 target/i386/tcg/decode-new.c.inc | 112 -
 target/i386/tcg/decode-new.h |   6 +
 target/i386/tcg/emit.c.inc   | 210 ++-
 target/i386/tcg/translate.c  |   2 +-
 6 files changed, 529 insertions(+), 8 deletions(-)

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index cb8909adcf..104a53fda0 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
void glue(helper_aeskeygenassist, SUFFIX)(CPUX86State *env, Reg *d, Reg *s,
 #endif
 
 #if SHIFT >= 1
+void glue(helper_vpermilpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
+{
+uint64_t r0, r1;
+int i;
+
+for (i = 0; i < 1 << SHIFT; i += 2) {
+r0 = v->Q(i + ((s->Q(i) >> 1) & 1));
+r1 = v->Q(i + ((s->Q(i+1) >> 1) & 1));
+d->Q(i) = r0;
+d->Q(i+1) = r1;
+}
+}
+
+void glue(helper_vpermilps, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
+{
+uint32_t r0, r1, r2, r3;
+int i;
+
+for (i = 0; i < 2 << SHIFT; i += 4) {
+r0 = v->L(i + (s->L(i) & 3));
+r1 = v->L(i + (s->L(i+1) & 3));
+r2 = v->L(i + (s->L(i+2) & 3));
+r3 = v->L(i + (s->L(i+3) & 3));
+d->L(i) = r0;
+d->L(i+1) = r1;
+d->L(i+2) = r2;
+d->L(i+3) = r3;
+}
+}
+
 void glue(helper_vpermilpd_imm, SUFFIX)(Reg *d, Reg *s, uint32_t order)
 {
 uint64_t r0, r1;
@@ -2414,6 +2444,150 @@ void glue(helper_vpermilps_imm, SUFFIX)(Reg *d, Reg *s, uint32_t order)
 }
 }
 
+#if SHIFT == 1
+#define FPSRLVD(x, c) (c < 32 ? ((x) >> c) : 0)
+#define FPSRLVQ(x, c) (c < 64 ? ((x) >> c) : 0)
+#define FPSRAVD(x, c) ((int32_t)(x) >> (c < 32 ? c : 31))
+#define FPSRAVQ(x, c) ((int64_t)(x) >> (c < 64 ? c : 63))
+#define FPSLLVD(x, c) (c < 32 ? ((x) << c) : 0)
+#define FPSLLVQ(x, c) (c < 64 ? ((x) << c) : 0)
+#endif
+
+SSE_HELPER_L(helper_vpsrlvd, FPSRLVD)
+SSE_HELPER_L(helper_vpsravd, FPSRAVD)
+SSE_HELPER_L(helper_vpsllvd, FPSLLVD)
+
+SSE_HELPER_Q(helper_vpsrlvq, FPSRLVQ)
+SSE_HELPER_Q(helper_vpsravq, FPSRAVQ)
+SSE_HELPER_Q(helper_vpsllvq, FPSLLVQ)
+
+void glue(helper_vtestps, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+{
+uint32_t zf = 0, cf = 0;
+int i;
+
+for (i = 0; i < 2 << SHIFT; i++) {
+zf |= (s->L(i) &  d->L(i));
+cf |= (s->L(i) & ~d->L(i));
+}
+CC_SRC = ((zf >> 31) ? 0 : CC_Z) | ((cf >> 31) ? 0 : CC_C);
+}
+
+void glue(helper_vtestpd, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+{
+uint64_t zf = 0, cf = 0;
+int i;
+
+for (i = 0; i < 1 << SHIFT; i++) {
+zf |= (s->Q(i) &  d->Q(i));
+cf |= (s->Q(i) & ~d->Q(i));
+}
+CC_SRC = ((zf >> 63) ? 0 : CC_Z) | ((cf >> 63) ? 0 : CC_C);
+}
+
+void glue(helper_vpmaskmovd_st, SUFFIX)(CPUX86State *env,
+Reg *v, Reg *s, target_ulong a0)
+{
+int i;
+
+for (i = 0; i < (2 << SHIFT); i++) {
+if (v->L(i) >> 31) {
+cpu_stl_data_ra(env, a0 + i * 4, s->L(i), GETPC());
+}
+}
+}
+
+void glue(helper_vpmaskmovq_st, SUFFIX)(CPUX86State *env,
+Reg *v, Reg *s, target_ulong a0)
+{
+int i;
+
+for (i = 0; i < (1 << SHIFT); i++) {
+if (v->Q(i) >> 63) {
+cpu_stq_data_ra(env, a0 + i * 8, s->Q(i), GETPC());
+}
+}
+}
+
+void glue(helper_vpmaskmovd, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
+{
+int i;
+
+for (i = 0; i < (2 << SHIFT); i++) {
+d->L(i) = (v->L(i) >> 31) ? s->L(i) : 0;
+}
+}
+
+void glue(helper_vpmaskmovq, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)
+{
+int i;
+
+for (i = 0; i < (1 << SHIFT); i++) {
+d->Q(i) = (v->Q(i) >> 63) ? s->Q(i) : 0;
+}
+}
+
+void glue(helper_vpgatherdd, SUFFIX)(CPUX86State *env,
+Reg *d, Reg *v, Reg *s, target_ulong a0, unsigned scale)
+{
+int i;
+for (i = 0; i < (2 << SHIFT); i++) {
+if (v->L(i) >> 

[PULL 02/12] linux-user: Add missing clock_gettime64() syscall strace

2022-09-20 Thread Helge Deller
Allow linux-user to strace the clock_gettime64() syscall.
This syscall is used a lot on 32-bit guest architectures which use newer
glibc versions.

Signed-off-by: Helge Deller 
---
 linux-user/strace.c| 53 ++
 linux-user/strace.list |  4 
 2 files changed, 57 insertions(+)

diff --git a/linux-user/strace.c b/linux-user/strace.c
index a4eeef7ae1..816e679995 100644
--- a/linux-user/strace.c
+++ b/linux-user/strace.c
@@ -82,6 +82,7 @@ UNUSED static void print_buf(abi_long addr, abi_long len, int last);
 UNUSED static void print_raw_param(const char *, abi_long, int);
 UNUSED static void print_timeval(abi_ulong, int);
 UNUSED static void print_timespec(abi_ulong, int);
+UNUSED static void print_timespec64(abi_ulong, int);
 UNUSED static void print_timezone(abi_ulong, int);
 UNUSED static void print_itimerval(abi_ulong, int);
 UNUSED static void print_number(abi_long, int);
@@ -795,6 +796,24 @@ print_syscall_ret_clock_gettime(CPUArchState *cpu_env, const struct syscallname
 #define print_syscall_ret_clock_getres print_syscall_ret_clock_gettime
 #endif

+#if defined(TARGET_NR_clock_gettime64)
+static void
+print_syscall_ret_clock_gettime64(CPUArchState *cpu_env, const struct syscallname *name,
+abi_long ret, abi_long arg0, abi_long arg1,
+abi_long arg2, abi_long arg3, abi_long arg4,
+abi_long arg5)
+{
+if (!print_syscall_err(ret)) {
+qemu_log(TARGET_ABI_FMT_ld, ret);
+qemu_log(" (");
+print_timespec64(arg1, 1);
+qemu_log(")");
+}
+
+qemu_log("\n");
+}
+#endif
+
 #ifdef TARGET_NR_gettimeofday
 static void
print_syscall_ret_gettimeofday(CPUArchState *cpu_env, const struct syscallname *name,
@@ -1652,6 +1671,27 @@ print_timespec(abi_ulong ts_addr, int last)
 }
 }

+static void
+print_timespec64(abi_ulong ts_addr, int last)
+{
+if (ts_addr) {
+struct target__kernel_timespec *ts;
+
+ts = lock_user(VERIFY_READ, ts_addr, sizeof(*ts), 1);
+if (!ts) {
+print_pointer(ts_addr, last);
+return;
+}
+qemu_log("{tv_sec = %lld"
+ ",tv_nsec = %lld}%s",
+ (long long)tswap64(ts->tv_sec), (long long)tswap64(ts->tv_nsec),
+ get_comma(last));
+unlock_user(ts, ts_addr, 0);
+} else {
+qemu_log("NULL%s", get_comma(last));
+}
+}
+
 static void
 print_timezone(abi_ulong tz_addr, int last)
 {
@@ -2267,6 +2307,19 @@ print_clock_gettime(CPUArchState *cpu_env, const struct syscallname *name,
 #define print_clock_getres print_clock_gettime
 #endif

+#if defined(TARGET_NR_clock_gettime64)
+static void
+print_clock_gettime64(CPUArchState *cpu_env, const struct syscallname *name,
+abi_long arg0, abi_long arg1, abi_long arg2,
+abi_long arg3, abi_long arg4, abi_long arg5)
+{
+print_syscall_prologue(name);
+print_enums(clockids, arg0, 0);
+print_pointer(arg1, 1);
+print_syscall_epilogue(name);
+}
+#endif
+
 #ifdef TARGET_NR_clock_settime
 static void
 print_clock_settime(CPUArchState *cpu_env, const struct syscallname *name,
diff --git a/linux-user/strace.list b/linux-user/strace.list
index 72e17b1acf..a78cdf3cdf 100644
--- a/linux-user/strace.list
+++ b/linux-user/strace.list
@@ -1676,3 +1676,7 @@
 #ifdef TARGET_NR_copy_file_range
{ TARGET_NR_copy_file_range, "copy_file_range", "%s(%d,%p,%d,%p,"TARGET_ABI_FMT_lu",%u)", NULL, NULL },
 #endif
+#ifdef TARGET_NR_clock_gettime64
+{ TARGET_NR_clock_gettime64, "clock_gettime64" , NULL, print_clock_gettime64,
+   print_syscall_ret_clock_gettime64 },
+#endif
--
2.37.3




[PATCH v2 14/37] target/i386: extend helpers to support VEX.V 3- and 4- operand encodings

2022-09-20 Thread Paolo Bonzini
Add to the helpers all the operands that are needed to implement AVX.

Extracted from a patch by Paul Brook .

Message-Id: <20220424220204.2493824-26-p...@nowt.org>
Reviewed-by: Richard Henderson 
Signed-off-by: Paolo Bonzini 
---
 target/i386/ops_sse.h| 173 +
 target/i386/ops_sse_header.h | 149 ++--
 target/i386/tcg/translate.c  | 181 ---
 3 files changed, 265 insertions(+), 238 deletions(-)

diff --git a/target/i386/ops_sse.h b/target/i386/ops_sse.h
index 7bf8bb967d..5f0ee9db52 100644
--- a/target/i386/ops_sse.h
+++ b/target/i386/ops_sse.h
@@ -48,9 +48,8 @@
 #define FPSLL(x, c) ((x) << shift)
 #endif
 
-void glue(helper_psrlw, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
+void glue(helper_psrlw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c)
 {
-Reg *s = d;
 int shift;
 if (c->Q(0) > 15) {
 for (int i = 0; i < 1 << SHIFT; i++) {
@@ -64,9 +63,8 @@ void glue(helper_psrlw, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
 }
 }
 
-void glue(helper_psllw, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
+void glue(helper_psllw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c)
 {
-Reg *s = d;
 int shift;
 if (c->Q(0) > 15) {
 for (int i = 0; i < 1 << SHIFT; i++) {
@@ -80,9 +78,8 @@ void glue(helper_psllw, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
 }
 }
 
-void glue(helper_psraw, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
+void glue(helper_psraw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c)
 {
-Reg *s = d;
 int shift;
 if (c->Q(0) > 15) {
 shift = 15;
@@ -94,9 +91,8 @@ void glue(helper_psraw, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
 }
 }
 
-void glue(helper_psrld, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
+void glue(helper_psrld, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c)
 {
-Reg *s = d;
 int shift;
 if (c->Q(0) > 31) {
 for (int i = 0; i < 1 << SHIFT; i++) {
@@ -110,9 +106,8 @@ void glue(helper_psrld, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
 }
 }
 
-void glue(helper_pslld, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
+void glue(helper_pslld, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c)
 {
-Reg *s = d;
 int shift;
 if (c->Q(0) > 31) {
 for (int i = 0; i < 1 << SHIFT; i++) {
@@ -126,9 +121,8 @@ void glue(helper_pslld, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
 }
 }
 
-void glue(helper_psrad, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
+void glue(helper_psrad, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c)
 {
-Reg *s = d;
 int shift;
 if (c->Q(0) > 31) {
 shift = 31;
@@ -140,9 +134,8 @@ void glue(helper_psrad, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
 }
 }
 
-void glue(helper_psrlq, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
+void glue(helper_psrlq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c)
 {
-Reg *s = d;
 int shift;
 if (c->Q(0) > 63) {
 for (int i = 0; i < 1 << SHIFT; i++) {
@@ -156,9 +149,8 @@ void glue(helper_psrlq, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
 }
 }
 
-void glue(helper_psllq, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
+void glue(helper_psllq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c)
 {
-Reg *s = d;
 int shift;
 if (c->Q(0) > 63) {
 for (int i = 0; i < 1 << SHIFT; i++) {
@@ -173,9 +165,8 @@ void glue(helper_psllq, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
 }
 
 #if SHIFT >= 1
-void glue(helper_psrldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
+void glue(helper_psrldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c)
 {
-Reg *s = d;
 int shift, i, j;
 
 shift = c->L(0);
@@ -192,9 +183,8 @@ void glue(helper_psrldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
 }
 }
 
-void glue(helper_pslldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
+void glue(helper_pslldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *s, Reg *c)
 {
-Reg *s = d;
 int shift, i, j;
 
 shift = c->L(0);
@@ -222,9 +212,8 @@ void glue(helper_pslldq, SUFFIX)(CPUX86State *env, Reg *d, Reg *c)
 }
 
 #define SSE_HELPER_2(name, elem, num, F)\
-void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)   \
+void glue(name, SUFFIX)(CPUX86State *env, Reg *d, Reg *v, Reg *s)   \
 {   \
-Reg *v = d; \
 int n = num;\
 for (int i = 0; i < n; i++) {   \
 d->elem(i) = F(v->elem(i), s->elem(i)); \
@@ -362,18 +351,24 @@ SSE_HELPER_W(helper_pcmpeqw, FCMPEQ)
 SSE_HELPER_L(helper_pcmpeql, FCMPEQ)
 
 SSE_HELPER_W(helper_pmullw, FMULLW)
-#if SHIFT == 0
-SSE_HELPER_W(helper_pmulhrw, FMULHRW)
-#endif
 SSE_HELPER_W(helper_pmulhuw, FMULHUW)
 SSE_HELPER_W(helper_pmulhw, FMULHW)
 
+#if SHIFT == 0
+void glue(helper_pmulhrw, SUFFIX)(CPUX86State *env, Reg *d, Reg *s)
+{
+d->W(0) = 

[PULL 11/12] linux-user: Add close_range() syscall

2022-09-20 Thread Helge Deller
Signed-off-by: Helge Deller 
---
 linux-user/strace.list |  3 +++
 linux-user/syscall.c   | 16 
 2 files changed, 19 insertions(+)

diff --git a/linux-user/strace.list b/linux-user/strace.list
index 215d971b2a..ad9ef94689 100644
--- a/linux-user/strace.list
+++ b/linux-user/strace.list
@@ -103,6 +103,9 @@
 #ifdef TARGET_NR_close
 { TARGET_NR_close, "close" , "%s(%d)", NULL, NULL },
 #endif
+#ifdef TARGET_NR_close_range
+{ TARGET_NR_close_range, "close_range" , "%s(%d,%d,%d)", NULL, NULL },
+#endif
 #ifdef TARGET_NR_connect
 { TARGET_NR_connect, "connect" , "%s(%d,%#x,%d)", NULL, NULL },
 #endif
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index ca39acfceb..2e0e974562 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -338,6 +338,10 @@ _syscall3(int,sys_syslog,int,type,char*,bufp,int,len)
 #ifdef __NR_exit_group
 _syscall1(int,exit_group,int,error_code)
 #endif
+#if defined(__NR_close_range) && defined(TARGET_NR_close_range)
+#define __NR_sys_close_range __NR_close_range
+_syscall3(int,sys_close_range,int,first,int,last,int,flags)
+#endif
 #if defined(__NR_futex)
 _syscall6(int,sys_futex,int *,uaddr,int,op,int,val,
   const struct timespec *,timeout,int *,uaddr2,int,val3)
@@ -8721,6 +8725,18 @@ static abi_long do_syscall1(CPUArchState *cpu_env, int num, abi_long arg1,
 case TARGET_NR_close:
 fd_trans_unregister(arg1);
 return get_errno(close(arg1));
+#if defined(__NR_close_range) && defined(TARGET_NR_close_range)
+case TARGET_NR_close_range:
+{
+abi_long fd;
+abi_long maxfd = (arg2 == (abi_long)-1) ? target_fd_max : arg2;
+
+for (fd = arg1; fd <= maxfd; fd++) {
+fd_trans_unregister(fd);
+}
+}
+return get_errno(sys_close_range(arg1, arg2, arg3));
+#endif

 case TARGET_NR_brk:
 return do_brk(arg1);
--
2.37.3




[PATCH v2 3/3] hw/arm/aspeed: g220a: Add a latching switch device

2022-09-20 Thread Jian Zhang
Add a latching switch device connected to the g220a BMC machine's SoC
GPIO as host-power.

The latching switch device default state is off and trigger edge is
falling edge.

Tested:
In qemu, use g220a image

~# ipmitool power status
Chassis Power is off

~# ipmitool power on
Chassis Power Control: Up/On

~# ipmitool power status
Chassis Power is on

~# ipmitool power off
Chassis Power Control: Down/Off

~# ipmitool power status
Chassis Power is off

Signed-off-by: Jian Zhang 
---
 hw/arm/Kconfig  |  1 +
 hw/arm/aspeed.c | 20 
 2 files changed, 21 insertions(+)

diff --git a/hw/arm/Kconfig b/hw/arm/Kconfig
index 15fa79afd3..f2455db5a0 100644
--- a/hw/arm/Kconfig
+++ b/hw/arm/Kconfig
@@ -457,6 +457,7 @@ config ASPEED_SOC
 select LED
 select PMBUS
 select MAX31785
+select LATCHING_SWITCH
 
 config MPS2
 bool
diff --git a/hw/arm/aspeed.c b/hw/arm/aspeed.c
index bc3ecdb619..070de3aeff 100644
--- a/hw/arm/aspeed.c
+++ b/hw/arm/aspeed.c
@@ -27,6 +27,7 @@
 #include "qemu/units.h"
 #include "hw/qdev-clock.h"
 #include "sysemu/sysemu.h"
+#include "hw/misc/latching_switch.h"
 
 static struct arm_boot_info aspeed_board_binfo = {
 .board_id = -1, /* device-tree-only board */
@@ -666,6 +667,25 @@ static void g220a_bmc_i2c_init(AspeedMachineState *bmc)
 };
 smbus_eeprom_init_one(aspeed_i2c_get_bus(>i2c, 4), 0x57,
   eeprom_buf);
+
+/* Add a host-power device */
+LatchingSwitchState *power =
+latching_switch_create_simple(OBJECT(bmc),
+  false, TRIGGER_EDGE_FALLING);
+
+/*
+ * connect the input to soc(out, power button)
+ * the power button in g220a is 215
+ */
+qdev_connect_gpio_out(DEVICE(>soc.gpio), 215,
+  qdev_get_gpio_in(DEVICE(power), 0));
+
+/*
+ * connect the output to soc(in, power good signal)
+ * the power good in g220a is 209
+ */
+qdev_connect_gpio_out(DEVICE(power), 0,
+  qdev_get_gpio_in(DEVICE(>soc.gpio), 209));
 }
 
 static void aspeed_eeprom_init(I2CBus *bus, uint8_t addr, uint32_t rsize)
-- 
2.25.1




[PATCH v2 11/37] target/i386: validate SSE prefixes directly in the decoding table

2022-09-20 Thread Paolo Bonzini
Many SSE and AVX instructions are only valid with specific prefixes
(none, 66, F3, F2).  Introduce a direct way to encode this in the
decoding table to avoid using decode groups too much.

Signed-off-by: Paolo Bonzini 
---
 target/i386/tcg/decode-new.c.inc | 37 
 target/i386/tcg/decode-new.h |  1 +
 2 files changed, 38 insertions(+)

diff --git a/target/i386/tcg/decode-new.c.inc b/target/i386/tcg/decode-new.c.inc
index f56c654e08..4dc67e6d37 100644
--- a/target/i386/tcg/decode-new.c.inc
+++ b/target/i386/tcg/decode-new.c.inc
@@ -110,6 +110,22 @@
 
 #define avx2_256 .vex_special = X86_VEX_AVX2_256,
 
+#define P_00  1
+#define P_66  (1 << PREFIX_DATA)
+#define P_F3  (1 << PREFIX_REPZ)
+#define P_F2  (1 << PREFIX_REPNZ)
+
+#define p_00  .valid_prefix = P_00,
+#define p_66  .valid_prefix = P_66,
+#define p_f3  .valid_prefix = P_F3,
+#define p_f2  .valid_prefix = P_F2,
+#define p_00_66   .valid_prefix = P_00 | P_66,
+#define p_00_f3   .valid_prefix = P_00 | P_F3,
+#define p_66_f2   .valid_prefix = P_66 | P_F2,
+#define p_00_66_f3.valid_prefix = P_00 | P_66 | P_F3,
+#define p_66_f3_f2.valid_prefix = P_66 | P_F3 | P_F2,
+#define p_00_66_f3_f2 .valid_prefix = P_00 | P_66 | P_F3 | P_F2,
+
 static uint8_t get_modrm(DisasContext *s, CPUX86State *env)
 {
 if (!s->has_modrm) {
@@ -480,6 +496,23 @@ static bool decode_op(DisasContext *s, CPUX86State *env, X86DecodedInsn *decode,
 return true;
 }
 
+static bool validate_sse_prefix(DisasContext *s, X86OpEntry *e)
+{
+uint16_t sse_prefixes;
+
+if (!e->valid_prefix) {
+return true;
+}
+if (s->prefix & (PREFIX_REPZ | PREFIX_REPNZ)) {
+/* In SSE instructions, 0xF3 and 0xF2 cancel 0x66.  */
+s->prefix &= ~PREFIX_DATA;
+}
+
+/* Now, either zero or one bit is set in sse_prefixes.  */
+sse_prefixes = s->prefix & (PREFIX_REPZ | PREFIX_REPNZ | PREFIX_DATA);
+return e->valid_prefix & (1 << sse_prefixes);
+}
+
static bool decode_insn(DisasContext *s, CPUX86State *env, X86DecodeFunc decode_func,
 X86DecodedInsn *decode)
 {
@@ -491,6 +524,10 @@ static bool decode_insn(DisasContext *s, CPUX86State *env, X86DecodeFunc decode_
 e->decode(s, env, e, >b);
 }
 
+if (!validate_sse_prefix(s, e)) {
+return false;
+}
+
 /* First compute size of operands in order to initialize s->rip_offset.  */
 if (e->op0 != X86_TYPE_None) {
 if (!decode_op_size(s, e, e->s0, >op[0].ot)) {
diff --git a/target/i386/tcg/decode-new.h b/target/i386/tcg/decode-new.h
index 8431057769..5fb68a365c 100644
--- a/target/i386/tcg/decode-new.h
+++ b/target/i386/tcg/decode-new.h
@@ -212,6 +212,7 @@ struct X86OpEntry {
 X86CPUIDFeature cpuid:8;
 uint8_t  vex_class:8;
 X86VEXSpecial vex_special:8;
+uint16_t valid_prefix:16;
 bool is_decode:1;
 };
 
-- 
2.37.2



