date:20240517

Re: [PATCH v2 2/2] iotests: test NBD+TLS+iothread

2024-05-17 Thread Eric Blake

Adding a bit of self-review (in case you want to amend this before
pushing, instead of waiting for me to get back online),

On Fri, May 17, 2024 at 09:50:15PM GMT, Eric Blake wrote:
> Prevent regressions when using NBD with TLS in the presence of
> iothreads, adding coverage for the fix to qio channels made in the
> previous patch.
> 
> CC: qemu-sta...@nongnu.org
> Signed-off-by: Eric Blake 
> ---
>  tests/qemu-iotests/tests/nbd-tls-iothread | 170 ++
>  tests/qemu-iotests/tests/nbd-tls-iothread.out |  54 ++
>  2 files changed, 224 insertions(+)
>  create mode 100755 tests/qemu-iotests/tests/nbd-tls-iothread
>  create mode 100644 tests/qemu-iotests/tests/nbd-tls-iothread.out
> 
> diff --git a/tests/qemu-iotests/tests/nbd-tls-iothread b/tests/qemu-iotests/tests/nbd-tls-iothread
> new file mode 100755
> index 000..a737224a90e
> --- /dev/null
> +++ b/tests/qemu-iotests/tests/nbd-tls-iothread
> @@ -0,0 +1,170 @@
> +#!/usr/bin/env bash
> +# group: rw quick
> +#
> +# Test of NBD+TLS+iothread
> +#
> +# Copyright (C) 2024 Red Hat, Inc.
> +#
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 2 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with this program.  If not, see <http://www.gnu.org/licenses/>.
> +#
> +
> +# creator
> +owner=ebl...@redhat.com
> +
> +seq=`basename $0`
> +echo "QA output created by $seq"
> +
> +status=1 # failure is the default!
> +
> +_cleanup()
> +{
> +_cleanup_qemu
> +_cleanup_test_img
> +rm -f "$dst_image"
> +tls_x509_cleanup
> +}
> +trap "_cleanup; exit \$status" 0 1 2 3 15
> +
> +# get standard environment, filters and checks
> +cd ..
> +. ./common.rc
> +. ./common.filter
> +. ./common.qemu
> +. ./common.tls
> +. ./common.nbd
> +
> +_supported_fmt qcow2  # Hardcoded to qcow2 command line and QMP below
> +_supported_proto file
> +_require_command QEMU_NBD

This line can probably be dropped.  I originally included it thinking
I might reuse common.nbd's nbd_server_start_tcp_socket to pick an
unused port via a throwaway qemu-nbd, then kill the qemu-nbd process
before starting up the two qemu processes.  But in the end, using ss
to probe a port's use seems a bit more elegant than a throwaway
qemu-nbd process, although it may make CI testing harder by dragging
in another dependency that is less universal.
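
For anyone weighing the two approaches, here is a rough sketch of what the
ss-based probe boils down to.  It is an illustration only, not the code in
the patch: the helper names listing_has_port and next_free_port are
hypothetical, and the socket listing is passed in as text (standing in for
the output of `ss -ltn`) so the logic can be tried without ss installed.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not part of the patch.

# listing_has_port LISTING PORT
# Succeeds if some address in LISTING (text in the style of `ss -ltn`
# output) ends in ":PORT".
listing_has_port ()
{
    printf '%s\n' "$1" | grep -qE ":$2([^0-9]|\$)"
}

# next_free_port LISTING START LOW HIGH
# Prints the first port at or after START that is absent from LISTING,
# wrapping from HIGH back to LOW.  Inherently racy, like any probe.
next_free_port ()
{
    local listing=$1 port=$2 low=$3 high=$4
    while listing_has_port "$listing" "$port"; do
        port=$((port + 1))
        if [ "$port" -ge "$high" ]; then port=$low; fi
    done
    echo "$port"
}

# Canned listing standing in for real `ss -ltn` output.
sample='LISTEN 0 128 0.0.0.0:50000 0.0.0.0:*
LISTEN 0 128    [::]:50001    [::]:*'
next_free_port "$sample" 50000 50000 65000
```

The probe only says that nothing was listening at the moment of the check,
so the race noted in the function's comment remains either way.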

> +
> +# pick_unused_port
> +# Copied from nbdkit/tests/functions.sh.in with compatible 2-clause BSD license

I'm not sure if I have to include the license text verbatim in this
file, and/or have this function moved to a helper utility file.  The
original source file that I borrowed pick_unused_port from has:

# Copyright Red Hat
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are
# met:
#
# * Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
#
# * Redistributions in binary form must reproduce the above copyright
# notice, this list of conditions and the following disclaimer in the
# documentation and/or other materials provided with the distribution.
#
# * Neither the name of Red Hat nor the names of its contributors may be
# used to endorse or promote products derived from this software without
# specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY RED HAT AND CONTRIBUTORS ''AS IS'' AND
# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
# THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
# PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL RED HAT OR
# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
# USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
# ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
# OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
# OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
# SUCH DAMAGE.

> +#
> +# Picks and returns an "unused" port, setting the global variable
> +# $port.
> +#
> +# This is inherently racy, but we need it because qemu does not currently
> +# permit NBD+TLS over a Unix domain socket
> +pick_unused_port ()
> +{
> +if ! (ss --version) >/dev/null 2>&1; then
> +_notrun "ss utility required, skipped this test"
> +fi
> +
> +# Start at a random port to make it

Re: [PATCH v2 0/2] Fix NBD+TLS regression in presence of iothread

2024-05-17 Thread Eric Blake

On Fri, May 17, 2024 at 09:50:13PM GMT, Eric Blake wrote:
> In v2:
> - correct list email address
> - add iotest
> - add R-b
> 
> I'm offline next week, and have been communicating with Stefan who may
> want to push this through his block tree instead of waiting for me to
> get back.

I also meant to add that I did test that the iotest 2/2 fails unless
1/2 is applied.

> 
> Eric Blake (2):
>   qio: Inherit follow_coroutine_ctx across TLS
>   iotests: test NBD+TLS+iothread
> 
>  io/channel-tls.c  |  26 +--
>  io/channel-websock.c  |   1 +
>  tests/qemu-iotests/tests/nbd-tls-iothread | 170 ++
>  tests/qemu-iotests/tests/nbd-tls-iothread.out |  54 ++
>  4 files changed, 240 insertions(+), 11 deletions(-)
>  create mode 100755 tests/qemu-iotests/tests/nbd-tls-iothread
>  create mode 100644 tests/qemu-iotests/tests/nbd-tls-iothread.out
> 
> -- 
> 2.45.0
> 
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org

[PATCH v2 1/2] qio: Inherit follow_coroutine_ctx across TLS

2024-05-17 Thread Eric Blake

Since qemu 8.2, the combination of NBD + TLS + iothread crashes on an
assertion failure:

qemu-kvm: ../io/channel.c:534: void qio_channel_restart_read(void *): Assertion `qemu_get_current_aio_context() == qemu_coroutine_get_aio_context(co)' failed.

It turns out that when we removed AioContext locking, we did so by
having NBD tell its qio channels that it wanted to opt in to
qio_channel_set_follow_coroutine_ctx(); but while we opted in on the
main channel, we did not opt in on the TLS wrapper channel.
qemu-iotests has coverage of NBD+iothread and NBD+TLS, but apparently
no coverage of NBD+TLS+iothread, or we would have noticed this
regression sooner.  (I'll add that in the next patch.)

But while we could manually opt in to the TLS channel in nbd/server.c
(a one-line change), it is more generic if all qio channels that wrap
other channels inherit the follow status, in the same way that they
inherit feature bits.

CC: Stefan Hajnoczi 
CC: Daniel P. Berrangé 
CC: qemu-sta...@nongnu.org
Fixes: https://issues.redhat.com/browse/RHEL-34786
Fixes: 06e0f098 ("io: follow coroutine AioContext in qio_channel_yield()", v8.2.0)
Signed-off-by: Eric Blake 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Daniel P. Berrangé 

---
 io/channel-tls.c | 26 +++---
 io/channel-websock.c |  1 +
 2 files changed, 16 insertions(+), 11 deletions(-)

diff --git a/io/channel-tls.c b/io/channel-tls.c
index 1d9c9c72bfb..67b9760 100644
--- a/io/channel-tls.c
+++ b/io/channel-tls.c
@@ -69,37 +69,40 @@ qio_channel_tls_new_server(QIOChannel *master,
const char *aclname,
Error **errp)
 {
-QIOChannelTLS *ioc;
+QIOChannelTLS *tioc;
+QIOChannel *ioc;

-ioc = QIO_CHANNEL_TLS(object_new(TYPE_QIO_CHANNEL_TLS));
+tioc = QIO_CHANNEL_TLS(object_new(TYPE_QIO_CHANNEL_TLS));
+ioc = QIO_CHANNEL(tioc);

-ioc->master = master;
+tioc->master = master;
+ioc->follow_coroutine_ctx = master->follow_coroutine_ctx;
 if (qio_channel_has_feature(master, QIO_CHANNEL_FEATURE_SHUTDOWN)) {
-qio_channel_set_feature(QIO_CHANNEL(ioc), QIO_CHANNEL_FEATURE_SHUTDOWN);
+qio_channel_set_feature(ioc, QIO_CHANNEL_FEATURE_SHUTDOWN);
 }
 object_ref(OBJECT(master));

-ioc->session = qcrypto_tls_session_new(
+tioc->session = qcrypto_tls_session_new(
 creds,
 NULL,
 aclname,
 QCRYPTO_TLS_CREDS_ENDPOINT_SERVER,
 errp);
-if (!ioc->session) {
+if (!tioc->session) {
 goto error;
 }

 qcrypto_tls_session_set_callbacks(
-ioc->session,
+tioc->session,
 qio_channel_tls_write_handler,
 qio_channel_tls_read_handler,
-ioc);
+tioc);

-trace_qio_channel_tls_new_server(ioc, master, creds, aclname);
-return ioc;
+trace_qio_channel_tls_new_server(tioc, master, creds, aclname);
+return tioc;

  error:
-object_unref(OBJECT(ioc));
+object_unref(OBJECT(tioc));
 return NULL;
 }

@@ -116,6 +119,7 @@ qio_channel_tls_new_client(QIOChannel *master,
 ioc = QIO_CHANNEL(tioc);

 tioc->master = master;
+ioc->follow_coroutine_ctx = master->follow_coroutine_ctx;
 if (qio_channel_has_feature(master, QIO_CHANNEL_FEATURE_SHUTDOWN)) {
 qio_channel_set_feature(ioc, QIO_CHANNEL_FEATURE_SHUTDOWN);
 }
diff --git a/io/channel-websock.c b/io/channel-websock.c
index a12acc27cf2..de39f0d182d 100644
--- a/io/channel-websock.c
+++ b/io/channel-websock.c
@@ -883,6 +883,7 @@ qio_channel_websock_new_server(QIOChannel *master)
 ioc = QIO_CHANNEL(wioc);

 wioc->master = master;
+ioc->follow_coroutine_ctx = master->follow_coroutine_ctx;
 if (qio_channel_has_feature(master, QIO_CHANNEL_FEATURE_SHUTDOWN)) {
 qio_channel_set_feature(ioc, QIO_CHANNEL_FEATURE_SHUTDOWN);
 }
-- 
2.45.0

[PATCH v2 2/2] iotests: test NBD+TLS+iothread

2024-05-17 Thread Eric Blake

Prevent regressions when using NBD with TLS in the presence of
iothreads, adding coverage for the fix to qio channels made in the
previous patch.

CC: qemu-sta...@nongnu.org
Signed-off-by: Eric Blake 
---
 tests/qemu-iotests/tests/nbd-tls-iothread | 170 ++
 tests/qemu-iotests/tests/nbd-tls-iothread.out |  54 ++
 2 files changed, 224 insertions(+)
 create mode 100755 tests/qemu-iotests/tests/nbd-tls-iothread
 create mode 100644 tests/qemu-iotests/tests/nbd-tls-iothread.out

diff --git a/tests/qemu-iotests/tests/nbd-tls-iothread b/tests/qemu-iotests/tests/nbd-tls-iothread
new file mode 100755
index 000..a737224a90e
--- /dev/null
+++ b/tests/qemu-iotests/tests/nbd-tls-iothread
@@ -0,0 +1,170 @@
+#!/usr/bin/env bash
+# group: rw quick
+#
+# Test of NBD+TLS+iothread
+#
+# Copyright (C) 2024 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+# creator
+owner=ebl...@redhat.com
+
+seq=`basename $0`
+echo "QA output created by $seq"
+
+status=1 # failure is the default!
+
+_cleanup()
+{
+_cleanup_qemu
+_cleanup_test_img
+rm -f "$dst_image"
+tls_x509_cleanup
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+cd ..
+. ./common.rc
+. ./common.filter
+. ./common.qemu
+. ./common.tls
+. ./common.nbd
+
+_supported_fmt qcow2  # Hardcoded to qcow2 command line and QMP below
+_supported_proto file
+_require_command QEMU_NBD
+
+# pick_unused_port
+# Copied from nbdkit/tests/functions.sh.in with compatible 2-clause BSD license
+#
+# Picks and returns an "unused" port, setting the global variable
+# $port.
+#
+# This is inherently racy, but we need it because qemu does not currently
+# permit NBD+TLS over a Unix domain socket
+pick_unused_port ()
+{
+if ! (ss --version) >/dev/null 2>&1; then
+_notrun "ss utility required, skipped this test"
+fi
+
+# Start at a random port to make it less likely that two parallel
+# tests will conflict.
+port=$(( 50000 + (RANDOM%15000) ))
+while ss -ltn | grep -sqE ":$port\b"; do
+((port++))
+if [ $port -eq 65000 ]; then port=50000; fi
+done
+echo picked unused port
+}
+
+tls_x509_init
+
+size=1G
+DST_IMG="$TEST_DIR/dst.qcow2"
+
+echo
+echo "== preparing TLS creds and spare port =="
+
+pick_unused_port
+tls_x509_create_root_ca "ca1"
+tls_x509_create_server "ca1" "server1"
+tls_x509_create_client "ca1" "client1"
+tls_obj_base=tls-creds-x509,id=tls0,verify-peer=true,dir="${tls_dir}"
+
+echo
+echo "== preparing image =="
+
+_make_test_img $size
+$QEMU_IMG create -f qcow2 "$DST_IMG" $size
+
+echo
+echo === Starting Src QEMU ===
+echo
+
+_launch_qemu -machine q35 \
+-object iothread,id=iothread0 \
+-object "${tls_obj_base}"/client1,endpoint=client \
+-device '{"driver":"pcie-root-port", "id":"root0", "multifunction":true,
+  "bus":"pcie.0"}' \
+-device '{"driver":"virtio-scsi-pci", "id":"virtio_scsi_pci0",
+  "bus":"root0", "iothread":"iothread0"}' \
+-device '{"driver":"scsi-hd", "id":"image1", "drive":"drive_image1",
+  "bus":"virtio_scsi_pci0.0"}' \
+-blockdev '{"driver":"file", "cache":{"direct":true, "no-flush":false},
+"filename":"'"$TEST_IMG"'", "node-name":"drive_sys1"}' \
+-blockdev '{"driver":"qcow2", "node-name":"drive_image1",
+"file":"drive_sys1"}'
+h1=$QEMU_HANDLE
+_send_qemu_cmd $h1 '{"execute": "qmp_capabilities"}' 'return'
+
+echo
+echo === Starting Dst VM2 ===
+echo
+
+_launch_qemu -machine q35 \
+-object iothread,id=iothread0 \
+-object "${tls_obj_base}"/server1,endpoint=server \
+-device '{"driver":"pcie-root-port", "id":"root0", "multifunction":true,
+  "bus":"pcie.0"}' \
+-device '{"driver":"virtio-scsi-pci", "id":"virtio_scsi_pci0",
+  "bus":"root0", "iothread":"iothread0"}' \
+-device '{"driver":"scsi-hd", "id":"image1", "drive":"drive_image1",
+  "bus":"virtio_scsi_pci0.0"}' \
+-blockdev '{"driver":"file", "cache":{"direct":true, "no-flush":false},
+"filename":"'"$DST_IMG"'", "node-name":"drive_sys1"}' \
+-blockdev '{"driver":"qcow2", "node-name":"drive_image1",
+"file":"drive_sys1"}' \
+-incoming defer
+h2=$QEMU_HANDLE
+_send_qemu_cmd $h2 '{"execute": "qmp_capabilities"}' 'return'
+
+echo
+echo === Dst VM: Enable NBD server for

[PATCH v2 0/2] Fix NBD+TLS regression in presence of iothread

2024-05-17 Thread Eric Blake

In v2:
- correct list email address
- add iotest
- add R-b

I'm offline next week, and have been communicating with Stefan who may
want to push this through his block tree instead of waiting for me to
get back.

Eric Blake (2):
  qio: Inherit follow_coroutine_ctx across TLS
  iotests: test NBD+TLS+iothread

 io/channel-tls.c  |  26 +--
 io/channel-websock.c  |   1 +
 tests/qemu-iotests/tests/nbd-tls-iothread | 170 ++
 tests/qemu-iotests/tests/nbd-tls-iothread.out |  54 ++
 4 files changed, 240 insertions(+), 11 deletions(-)
 create mode 100755 tests/qemu-iotests/tests/nbd-tls-iothread
 create mode 100644 tests/qemu-iotests/tests/nbd-tls-iothread.out

-- 
2.45.0

Re: [PATCH 8/9] migration: Add support for fdset with multifd + file

2024-05-17 Thread Fabiano Rosas

Daniel P. Berrangé  writes:

> On Wed, May 08, 2024 at 05:39:53PM -0300, Fabiano Rosas wrote:
>> Peter Xu  writes:
>> 
>> > On Wed, May 08, 2024 at 09:53:48AM +0100, Daniel P. Berrangé wrote:
>> >> On Fri, Apr 26, 2024 at 11:20:41AM -0300, Fabiano Rosas wrote:
>> >> > Allow multifd to use an fdset when migrating to a file. This is useful
>> >> > for the scenario where the management layer wants to have control over
>> >> > the migration file.
>> >> > 
>> >> > By receiving the file descriptors directly, QEMU can delegate some
>> >> > high level operating system operations to the management layer (such
>> >> > as mandatory access control). The management layer might also want to
>> >> > add its own headers before the migration stream.
>> >> > 
>> >> > Enable the "file:/dev/fdset/#" syntax for the multifd migration with
>> >> > mapped-ram. The requirements for the fdset mechanism are:
>> >> > 
>> >> > On the migration source side:
>> >> > 
>> >> > - the fdset must contain two fds that are not duplicates between
>> >> >   themselves;
>> >> > - if direct-io is to be used, exactly one of the fds must have the
>> >> >   O_DIRECT flag set;
>> >> > - the file must be opened with WRONLY both times.
>> >> > 
>> >> > On the migration destination side:
>> >> > 
>> >> > - the fdset must contain one fd;
>> >> > - the file must be opened with RDONLY.
>> >> > 
>> >> > Signed-off-by: Fabiano Rosas 
>> >> > ---
>> >> >  docs/devel/migration/main.rst   | 18 ++
>> >> >  docs/devel/migration/mapped-ram.rst |  6 -
>> >> >  migration/file.c| 38 -
>> >> >  3 files changed, 60 insertions(+), 2 deletions(-)
>> >> > 
>> >> > diff --git a/docs/devel/migration/main.rst b/docs/devel/migration/main.rst
>> >> > index 54385a23e5..50f6096470 100644
>> >> > --- a/docs/devel/migration/main.rst
>> >> > +++ b/docs/devel/migration/main.rst
>> >> > @@ -47,6 +47,24 @@ over any transport.
>> >> >QEMU interference. Note that QEMU does not flush cached file
>> >> >data/metadata at the end of migration.
>> >> >  
>> >> > +  The file migration also supports using a file that has already been
>> >> > +  opened. A set of file descriptors is passed to QEMU via an "fdset"
>> >> > +  (see add-fd QMP command documentation). This method allows a
>> >> > +  management application to have control over the migration file
>> >> > +  opening operation. There are, however, strict requirements to this
>> >> > +  interface:
>> >> > +
>> >> > +  On the migration source side:
>> >> > +- if the multifd capability is to be used, the fdset must contain
>> >> > +  two file descriptors that are not duplicates between themselves;
>> >> > +- if the direct-io capability is to be used, exactly one of the
>> >> > +  file descriptors must have the O_DIRECT flag set;
>> >> > +- the file must be opened with WRONLY.
>> >> > +
>> >> > +  On the migration destination side:
>> >> > +- the fdset must contain one file descriptor;
>> >> > +- the file must be opened with RDONLY.
>> >> > +
>> >> >  In addition, support is included for migration using RDMA, which
>> >> >  transports the page data using ``RDMA``, where the hardware takes care of
>> >> >  transporting the pages, and the load on the CPU is much lower.  While the
>> >> > diff --git a/docs/devel/migration/mapped-ram.rst b/docs/devel/migration/mapped-ram.rst
>> >> > index fa4cefd9fc..e6505511f0 100644
>> >> > --- a/docs/devel/migration/mapped-ram.rst
>> >> > +++ b/docs/devel/migration/mapped-ram.rst
>> >> > @@ -16,7 +16,7 @@ location in the file, rather than constantly being added to a
>> >> >  sequential stream. Having the pages at fixed offsets also allows the
>> >> >  usage of O_DIRECT for save/restore of the migration stream as the
>> >> >  pages are ensured to be written respecting O_DIRECT alignment
>> >> > -restrictions (direct-io support not yet implemented).
>> >> > +restrictions.
>> >> >  
>> >> >  Usage
>> >> >  -
>> >> > @@ -35,6 +35,10 @@ Use a ``file:`` URL for migration:
>> >> >  Mapped-ram migration is best done non-live, i.e. by stopping the VM on
>> >> >  the source side before migrating.
>> >> >  
>> >> > +For best performance enable the ``direct-io`` capability as well:
>> >> > +
>> >> > +``migrate_set_capability direct-io on``
>> >> > +
>> >> >  Use-cases
>> >> >  -
>> >> >  
>> >> > diff --git a/migration/file.c b/migration/file.c
>> >> > index b9265b14dd..3bc8bc7463 100644
>> >> > --- a/migration/file.c
>> >> > +++ b/migration/file.c
>> >> > @@ -17,6 +17,7 @@
>> >> >  #include "io/channel-file.h"
>> >> >  #include "io/channel-socket.h"
>> >> >  #include "io/channel-util.h"
>> >> > +#include "monitor/monitor.h"
>> >> >  #include "options.h"
>> >> >  #include "trace.h"
>> >> >  
>> >> > @@ -54,10 +55,18 @@ static void file_remove_fdset(void)
>> >> >  }
>> >> >  }
>> >> >  
>> >> > +/*
>> >> > + * With multifd, due to the behavior of the dup()

[PATCH v2 1/1] riscv, gdbstub.c: fix reg_width in ricsv_gen_dynamic_vector_feature()

2024-05-17 Thread Daniel Henrique Barboza

Commit 33a24910ae changed 'reg_width' to use 'vlenb', i.e. vector length
in bytes, when in this context we want 'reg_width' as the length in
bits.

Restore 'reg_width' to a value in bits, as commit 7cb59921c05a
("target/riscv/gdbstub.c: use 'vlenb' instead of shifting 'vlen'") had
set it.

While we're at it, rename 'reg_width' to 'bitsize' to provide a bit more
clarity about what the variable represents. 'bitsize' is also used in
riscv_gen_dynamic_csr_feature() with the same purpose, i.e. as an input to
gdb_feature_builder_append_reg().

Cc: Akihiko Odaki 
Cc: Alex Bennée 
Reported-by: Robin Dapp 
Fixes: 33a24910ae ("target/riscv: Use GDBFeature for dynamic XML")
Signed-off-by: Daniel Henrique Barboza 
Reviewed-by: LIU Zhiwei 
Acked-by: Alex Bennée 
---
 target/riscv/gdbstub.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/target/riscv/gdbstub.c b/target/riscv/gdbstub.c
index d0cc5762c2..c07df972f1 100644
--- a/target/riscv/gdbstub.c
+++ b/target/riscv/gdbstub.c
@@ -288,7 +288,7 @@ static GDBFeature *riscv_gen_dynamic_csr_feature(CPUState *cs, int base_reg)
 static GDBFeature *ricsv_gen_dynamic_vector_feature(CPUState *cs, int base_reg)
 {
 RISCVCPU *cpu = RISCV_CPU(cs);
-int reg_width = cpu->cfg.vlenb;
+int bitsize = cpu->cfg.vlenb << 3;
 GDBFeatureBuilder builder;
 int i;
 
@@ -298,7 +298,7 @@ static GDBFeature *ricsv_gen_dynamic_vector_feature(CPUState *cs, int base_reg)
 
 /* First define types and totals in a whole VL */
 for (i = 0; i < ARRAY_SIZE(vec_lanes); i++) {
-int count = reg_width / vec_lanes[i].size;
+int count = bitsize / vec_lanes[i].size;
 gdb_feature_builder_append_tag(
 &builder, "",
 vec_lanes[i].id, vec_lanes[i].gdb_type, count);
@@ -316,7 +316,7 @@ static GDBFeature *ricsv_gen_dynamic_vector_feature(CPUState *cs, int base_reg)
 /* Define vector registers */
 for (i = 0; i < 32; i++) {
 gdb_feature_builder_append_reg(&builder, g_strdup_printf("v%d", i),
-   reg_width, i, "riscv_vector", "vector");
+   bitsize, i, "riscv_vector", "vector");
 }
 
 gdb_feature_builder_end(&builder);
-- 
2.44.0

[PATCH v2 0/1] riscv, gdbstub.c: fix reg_width in ricsv_gen_dynamic_vector_feature()

2024-05-17 Thread Daniel Henrique Barboza

Hi,

In this v2, 'reg_width' is renamed to 'bitsize' to make it clearer that
the variable holds a size in bits. It is the same name used by
riscv_gen_dynamic_csr_feature() for a variable that serves the same
purpose. The rename was suggested by Alex in v1.
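
To make the bytes-versus-bits mix-up concrete, here is a small shell sketch
of the arithmetic. It is an illustration only, not QEMU code: the function
names are hypothetical, and the vlenb value of 16 and the lane widths are
assumed example inputs.

```shell
#!/usr/bin/env bash
# Sketch only: function names and sample values are hypothetical.

# vlenb is the vector register length in bytes; the register width
# handed to gdb must be in bits, so multiply by 8 (shift left by 3).
bitsize_from_vlenb ()
{
    echo $(( $1 << 3 ))
}

# Number of lanes of LANE_BITS bits that fit in a register of
# BITSIZE bits.
lane_count ()
{
    echo $(( $1 / $2 ))
}

vlenb=16                                  # assumed example: VLEN = 128 bits
bitsize=$(bitsize_from_vlenb "$vlenb")
for lane_bits in 8 16 32 64; do           # assumed example lane widths
    echo "lane=$lane_bits correct=$(lane_count "$bitsize" "$lane_bits")" \
         "bytes-as-bits=$(lane_count "$vlenb" "$lane_bits")"
done
```

With the byte count used where a bit count is expected, the 32-bit and
64-bit lane counts in this example collapse to zero, which is the kind of
wrong value the fix avoids.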

Changes from v1:
- rename 'reg_width' to 'bitsize'
- v1 link: 
https://lore.kernel.org/qemu-riscv/20240516171010.639591-1-dbarb...@ventanamicro.com/

Daniel Henrique Barboza (1):
  riscv, gdbstub.c: fix reg_width in ricsv_gen_dynamic_vector_feature()

 target/riscv/gdbstub.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

-- 
2.44.0

Re: [PATCH v3 2/5] ppc/pnv: Extend SPI model

2024-05-17 Thread Miles Glenn

Chalapathi,

I'm having trouble seeing the benefit of breaking this commit out from
patch 1/5.  It seems like the two should be merged into a single commit
responsible for adding the PNV SPI Controller model.

-Glenn


On Thu, 2024-05-16 at 11:33 -0500, Chalapathi V wrote:
> In this commit SPI shift engine and sequencer logic is implemented.
> Shift engine performs serialization and de-serialization according to
> the
> control by the sequencer and according to the setup defined in the
> configuration registers. Sequencer implements the main control logic
> and
> FSM to handle data transmit and data receive control of the shift
> engine.
> 
> Signed-off-by: Chalapathi V 
> ---
>  include/hw/ssi/pnv_spi.h|   28 +
>  hw/ppc/pnv_spi_controller.c | 1074
> +++
>  hw/ppc/trace-events |   15 +
>  3 files changed, 1117 insertions(+)
> 
> diff --git a/include/hw/ssi/pnv_spi.h b/include/hw/ssi/pnv_spi.h
> index 244ee1cfc0..6e2bceab3b 100644
> --- a/include/hw/ssi/pnv_spi.h
> +++ b/include/hw/ssi/pnv_spi.h
> @@ -8,6 +8,14 @@
>   * This model Supports a connection to a single SPI responder.
>   * Introduced for P10 to provide access to SPI seeproms, TPM, flash
> device
>   * and an ADC controller.
> + *
> + * All SPI function control is mapped into the SPI register space to
> enable
> + * full control by firmware.
> + *
> + * SPI Controller has sequencer and shift engine. The SPI shift
> engine
> + * performs serialization and de-serialization according to the
> control by
> + * the sequencer and according to the setup defined in the
> configuration
> + * registers and the SPI sequencer implements the main control
> logic.
>   */
>  #include "hw/ssi/ssi.h"
>  
> @@ -29,6 +37,25 @@ typedef struct PnvSpiController {
>  MemoryRegion xscom_spic_regs;
>  /* SPI controller object number */
>  uint32_t spic_num;
> +uint8_t transfer_len;
> +uint8_t responder_select;
> +/* To verify if shift_n1 happens prior to shift_n2 */
> +bool shift_n1_done;
> +/* Loop counter for branch operation opcode Ex/Fx */
> +uint8_t loop_counter_1;
> +uint8_t loop_counter_2;
> +/* N1/N2_bits specifies the size of the N1/N2 segment of a frame
> in bits.*/
> +uint8_t N1_bits;
> +uint8_t N2_bits;
> +/* Number of bytes in a payload for the N1/N2 frame segment.*/
> +uint8_t N1_bytes;
> +uint8_t N2_bytes;
> +/* Number of N1/N2 bytes marked for transmit */
> +uint8_t N1_tx;
> +uint8_t N2_tx;
> +/* Number of N1/N2 bytes marked for receive */
> +uint8_t N1_rx;
> +uint8_t N2_rx;
>  
>  /* SPI Controller registers */
>  uint64_t error_reg;
> @@ -40,5 +67,6 @@ typedef struct PnvSpiController {
>  uint64_t receive_data_reg;
>  uint8_t sequencer_operation_reg[SPI_CONTROLLER_REG_SIZE];
>  uint64_t status_reg;
> +
>  } PnvSpiController;
>  #endif /* PPC_PNV_SPI_CONTROLLER_H */
> diff --git a/hw/ppc/pnv_spi_controller.c
> b/hw/ppc/pnv_spi_controller.c
> index 11b119cf0f..e87f583074 100644
> --- a/hw/ppc/pnv_spi_controller.c
> +++ b/hw/ppc/pnv_spi_controller.c
> @@ -19,6 +19,1072 @@
>  #include "hw/irq.h"
>  #include "trace.h"
>  
> +/* PnvXferBuffer */
> +typedef struct PnvXferBuffer {
> +
> +uint32_t len;
> +uint8_t *data;
> +
> +} PnvXferBuffer;
> +
> +/* pnv_spi_xfer_buffer_methods */
> +static PnvXferBuffer *pnv_spi_xfer_buffer_new(void)
> +{
> +PnvXferBuffer *payload = g_malloc0(sizeof(*payload));
> +
> +return payload;
> +}
> +
> +static void pnv_spi_xfer_buffer_free(PnvXferBuffer *payload)
> +{
> +free(payload->data);
> +free(payload);
> +}
> +
> +static uint8_t *pnv_spi_xfer_buffer_write_ptr(PnvXferBuffer
> *payload,
> +uint32_t offset, uint32_t length)
> +{
> +if (payload->len < (offset + length)) {
> +payload->len = offset + length;
> +payload->data = g_realloc(payload->data, payload->len);
> +}
> +return &payload->data[offset];
> +}
> +
> +static bool does_rdr_match(PnvSpiController *s)
> +{
> +/*
> + * According to spec, the mask bits that are 0 are compared and
> the
> + * bits that are 1 are ignored.
> + */
> +uint16_t rdr_match_mask =
> GETFIELD(MEMORY_MAPPING_REG_RDR_MATCH_MASK,
> +s->memory_mapping_reg);
> +uint16_t rdr_match_val =
> GETFIELD(MEMORY_MAPPING_REG_RDR_MATCH_VAL,
> +s->memory_mapping_reg);
> +
> +if ((~rdr_match_mask & rdr_match_val) == ((~rdr_match_mask) &
> +GETFIELD(PPC_BITMASK(48, 63), s->receive_data_reg))) {
> +return true;
> +}
> +return false;
> +}
> +
> +static uint8_t get_from_offset(PnvSpiController *s, uint8_t offset)
> +{
> +uint8_t byte;
> +
> +/*
> + * Offset is an index between 0 and SPI_CONTROLLER_REG_SIZE - 1
> +

Re: [PATCH v3 4/5] hw/ppc: SPI controller wiring to P10 chip

2024-05-17 Thread Miles Glenn

Reviewed-by: Glenn Miles 

-Glenn

On Thu, 2024-05-16 at 11:33 -0500, Chalapathi V wrote:
> In this commit, create SPI controller on p10 chip and connect cs irq.
> 
> The QOM trees of the SPI controller and seeprom are:
> /machine (powernv10-machine)
>   /chip[0] (power10_v2.0-pnv-chip)
> /pib_spic[2] (pnv-spi-controller)
>   /pnv-spi-bus.2 (SSI)
>   /xscom-spi-controller-regs[0] (memory-region)
> 
> /machine (powernv10-machine)
>   /peripheral-anon (container)
> /device[0] (25csm04)
>   /WP#[0] (irq)
>   /ssi-gpio-cs[0] (irq)
> 
> (qemu) qom-get /machine/peripheral-anon /device[76] "parent_bus"
> "/machine/chip[0]/pib_spic[2]/pnv-spi-bus.2"
> 
> Signed-off-by: Chalapathi V 
> ---
>  include/hw/ppc/pnv_chip.h   |  3 +++
>  hw/ppc/pnv.c| 21 -
>  hw/ppc/pnv_spi_controller.c |  8 
>  3 files changed, 31 insertions(+), 1 deletion(-)
> 
> diff --git a/include/hw/ppc/pnv_chip.h b/include/hw/ppc/pnv_chip.h
> index 8589f3291e..d464858f79 100644
> --- a/include/hw/ppc/pnv_chip.h
> +++ b/include/hw/ppc/pnv_chip.h
> @@ -6,6 +6,7 @@
>  #include "hw/ppc/pnv_core.h"
>  #include "hw/ppc/pnv_homer.h"
>  #include "hw/ppc/pnv_n1_chiplet.h"
> +#include "hw/ssi/pnv_spi.h"
>  #include "hw/ppc/pnv_lpc.h"
>  #include "hw/ppc/pnv_occ.h"
>  #include "hw/ppc/pnv_psi.h"
> @@ -118,6 +119,8 @@ struct Pnv10Chip {
>  PnvSBE   sbe;
>  PnvHomer homer;
>  PnvN1Chiplet n1_chiplet;
> +#define PNV10_CHIP_MAX_PIB_SPIC 6
> +PnvSpiController pib_spic[PNV10_CHIP_MAX_PIB_SPIC];
>  
>  uint32_t nr_quads;
>  PnvQuad  *quads;
> diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c
> index 6e3a5ccdec..6850592a85 100644
> --- a/hw/ppc/pnv.c
> +++ b/hw/ppc/pnv.c
> @@ -1829,6 +1829,11 @@ static void
> pnv_chip_power10_instance_init(Object *obj)
>  for (i = 0; i < pcc->i2c_num_engines; i++) {
>  object_initialize_child(obj, "i2c[*]", >i2c[i],
> TYPE_PNV_I2C);
>  }
> +
> +for (i = 0; i < PNV10_CHIP_MAX_PIB_SPIC ; i++) {
> +object_initialize_child(obj, "pib_spic[*]", 
> >pib_spic[i],
> +TYPE_PNV_SPI_CONTROLLER);
> +}
>  }
>  
>  static void pnv_chip_power10_quad_realize(Pnv10Chip *chip10, Error
> **errp)
> @@ -2043,7 +2048,21 @@ static void
> pnv_chip_power10_realize(DeviceState *dev, Error **errp)
>qdev_get_gpio_in(DEVICE(>psi),
> PSIHB9_IRQ_SBE_I2C));
>  }
> -
> +/* PIB SPI Controller */
> +for (i = 0; i < PNV10_CHIP_MAX_PIB_SPIC; i++) {
> +object_property_set_int(OBJECT(>pib_spic[i]),
> "spic_num",
> +i, _fatal);
> +/* pib_spic[2] connected to 25csm04 which implements 1 byte
> transfer */
> +object_property_set_int(OBJECT(>pib_spic[i]),
> "transfer_len",
> +(i == 2) ? 1 : 4, _fatal);
> +if (!sysbus_realize(SYS_BUS_DEVICE(OBJECT
> +(>pib_spic[i])),
> errp)) {
> +return;
> +}
> +pnv_xscom_add_subregion(chip, PNV10_XSCOM_PIB_SPIC_BASE +
> +i * PNV10_XSCOM_PIB_SPIC_SIZE,
> +
> >pib_spic[i].xscom_spic_regs);
> +}
>  }
>  
>  static void pnv_rainier_i2c_init(PnvMachineState *pnv)
> diff --git a/hw/ppc/pnv_spi_controller.c
> b/hw/ppc/pnv_spi_controller.c
> index e87f583074..3d47e932de 100644
> --- a/hw/ppc/pnv_spi_controller.c
> +++ b/hw/ppc/pnv_spi_controller.c
> @@ -1067,9 +1067,17 @@ static void
> operation_sequencer(PnvSpiController *s)
>  static void do_reset(DeviceState *dev)
>  {
>  PnvSpiController *s = PNV_SPICONTROLLER(dev);
> +DeviceState *ssi_dev;
>  
>  trace_pnv_spi_reset();
>  
> +/* Connect cs irq */
> +ssi_dev = ssi_get_cs(s->ssi_bus, 0);
> +if (ssi_dev) {
> +qemu_irq cs_line = qdev_get_gpio_in_named(ssi_dev,
> SSI_GPIO_CS, 0);
> +qdev_connect_gpio_out_named(DEVICE(s), "cs", 0, cs_line);
> +}
> +
>  /* Reset all N1 and N2 counters, and other constants */
>  s->N2_bits = 0;
>  s->N2_bytes = 0;

Re: [PATCH v2 2/3] docs: define policy limiting the inclusion of generated files

2024-05-17 Thread Alex Bennée

Daniel P. Berrangé  writes:


> +
> +IOW, using coccinelle to convert code from one pattern to another pattern, or
> +fixing docs typos with a spell checker, or transforming code using sed / awk /
> +etc, are not considered to be acts of code generation. Where an automated
> +manipulation is performed on code, however, this should be declared in the
> +commit message.

Let's avoid IRC speak in documents (s/IOW/In other words/), otherwise:

Reviewed-by: Alex Bennée 


> +
> +At times contributors may use or create scripts/tools to generate an initial
> +boilerplate code template which is then filled in to produce the final patch.
> +The output of such a tool would still be considered the "preferred format",
> +since it is intended to be a foundation for further human authored changes.
> +Such tools are acceptable to use, provided they follow a deterministic process
> +and there is clearly defined copyright and licensing for their output.

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro

Re: [PATCH v2 1/3] docs: introduce dedicated page about code provenance / sign-off

2024-05-17 Thread Alex Bennée

Daniel P. Berrangé  writes:

> Currently we have a short paragraph saying that patches must include
> a Signed-off-by line, and merely link to the kernel documentation.
> The linked kernel docs have a lot of content beyond the part about
> sign-off and thus are misleading/distracting to QEMU contributors.
>
> This introduces a dedicated 'code-provenance' page in QEMU talking
> about why we require sign-off, explaining the other tags we commonly
> use, and what to do in some edge cases.
>

> +
> +Other commit tags
> +~~~~~~~~~~~~~~~~~
> +
> +While the ``Signed-off-by`` tag is mandatory, there are a number of other tags
> +that are commonly used during QEMU development:
> +
> + * **``Reviewed-by``**: when a QEMU community member reviews a patch on the
> +   mailing list, if they consider the patch acceptable, they should send an
> +   email reply containing a ``Reviewed-by`` tag. Subsystem maintainers who
> +   review a patch should add this even if they are also adding their
> +   ``Signed-off-by`` to the same commit.
> +
> + * **``Acked-by``**: when a QEMU subsystem maintainer approves a patch that
> +   touches their subsystem, but intends to allow a different maintainer to
> +   queue it and send a pull request, they would send a mail containing a
> +   ``Acked-by`` tag. Where a patch touches multiple subsystems, ``Acked-by``
> +   only implies review of the maintainers' own areas of responsibility. If a
> +   maintainer wants to indicate they have done a full review they should use
> +   a ``Reviewed-by`` tag.
> +
> + * **``Tested-by``**: when a QEMU community member has functionally tested the
> +   behaviour of the patch in some manner, they should send an email reply
> +   containing a ``Tested-by`` tag.
> +
> + * **``Reported-by``**: when a QEMU community member reports a problem via the
> +   mailing list, or some other informal channel that is not the issue tracker,
> +   it is good practice to credit them by including a ``Reported-by`` tag on
> +   any patch fixing the issue. When the problem is reported via the GitLab
> +   issue tracker, however, it is sufficient to just include a link to the
> +   issue.
> +
> + * **``Suggested-by``**: when a reviewer or other 3rd party makes non-trivial
> +   suggestions for how to change a patch, it is good practice to credit them
> +   by including a ``Suggested-by`` tag.

Should we mention our use of Message-Id, insofar as the informal good
practice is that we keep the Message-Id of the last time a patch was
posted, and potentially the Message-Ids of previous posters?

But this is definitely an improvement on what we had before so:

Reviewed-by: Alex Bennée 
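
To make the tag discussion above concrete, here is a minimal Python sketch that pulls the trailer lines (including a Message-Id) out of a commit message. The helper name `parse_trailers`, the sample message and the addresses in it are assumptions made up for illustration; none of this comes from the QEMU tree or the patch under review.

```python
import re

# Hypothetical helper for illustration only; not part of QEMU or git.
TRAILER_RE = re.compile(r"^([A-Za-z][A-Za-z-]*):\s*(.+)$")


def parse_trailers(message):
    """Return (tag, value) pairs found in the last paragraph of a message."""
    paragraphs = [p for p in message.strip().split("\n\n") if p.strip()]
    if not paragraphs:
        return []
    trailers = []
    for line in paragraphs[-1].splitlines():
        match = TRAILER_RE.match(line.strip())
        if match:
            trailers.append((match.group(1), match.group(2)))
    return trailers


# Made-up commit message used only to exercise the helper.
SAMPLE = """docs: fix typo

Fix a typo in the build docs.

Reported-by: A Reporter <reporter@example.org>
Signed-off-by: A Contributor <contrib@example.org>
Message-Id: <20240101000000.1-1-contrib@example.org>
Reviewed-by: A Reviewer <reviewer@example.org>
"""

trailers = parse_trailers(SAMPLE)
```

The sketch only looks at the final paragraph, which is where these tags conventionally sit; real tooling would need to handle more corner cases.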


> +
> +Subsystem maintainer requirements
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +When a subsystem maintainer accepts a patch from a contributor, in addition to
> +the normal code review points, they are expected to validate the presence of
> +suitable ``Signed-off-by`` tags.
> +
> +At the time they queue the patch in their subsystem tree, the maintainer
> +**must** also then add their own ``Signed-off-by`` to indicate that they have
> +done the aforementioned validation. This is in addition to any of their own
> +``Reviewed-by`` tags the subsystem maintainer may wish to include.
> +
> +Tools for adding ``Signed-off-by``
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +There are a variety of ways tools can support adding ``Signed-off-by`` tags
> +for patches, avoiding the need for contributors to manually type in this
> +repetitive text each time.
> +
> +git commands
> +^^^^^^^^^^^^
> +
> +When creating, or amending, a commit the ``-s`` flag to ``git commit`` will
> +append a suitable line matching the configured git author details.
> +
> +If preparing patches using the ``git format-patch`` tool, the ``-s`` flag can
> +be used to append a suitable line in the emails it creates, without modifying
> +the local commits. Alternatively to modify all the local commits on a branch::
> +
> +  git rebase master -x 'git commit --amend --no-edit -s'
> +
> +emacs
> +^^^^^
> +
> +In the file ``$HOME/.emacs.d/abbrev_defs`` add::
> +
> +  (define-abbrev-table 'global-abbrev-table
> +'(
> +  ("8rev" "Reviewed-by: YOUR NAME " nil 1)
> +  ("8ack" "Acked-by: YOUR NAME " nil 1)
> +  ("8test" "Tested-by: YOUR NAME " nil 1)
> +  ("8sob" "Signed-off-by: YOUR NAME " nil 1)
> + ))
> +
> +with this change, if you type (for example) ``8rev`` followed by 
> +or  it will expand to the whole phrase.
> +
> +vim
> +^^^
> +
> +In the file ``$HOME/.vimrc`` add::
> +
> +  iabbrev 8rev Reviewed-by: YOUR NAME 
> +  iabbrev 8ack Acked-by: YOUR NAME 
> +  iabbrev 8test Tested-by: YOUR NAME 
> +  iabbrev 8sob Signed-off-by: YOUR NAME 
> +
> +with this change, if you type (for example) ``8rev`` followed by 
> +or  it will expand to the whole phrase.
> +
> +Re-starting abandoned work
> +~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +For a variety of reasons there are some patches that get submitted

Re: CXL numa error on arm64 qemu virt machine

2024-05-17 Thread Jonathan Cameron via

On Fri, 17 May 2024 11:14:41 +0100
Jonathan Cameron  wrote:

> On Fri, 17 May 2024 18:07:07 +0800
> Yuquan Wang  wrote:
> 
> > On Fri, May 10, 2024 at 06:16:46PM +0100, Jonathan Cameron wrote:  
> > > 
> > > https://git.kernel.org/pub/scm/linux/kernel/git/jic23/cxl-staging.git/log/?h=arm-numa-fixes
> > > 
> > Thank you :)  
> > > > I've run out of time to sort out cover letters and things + just before the merge
> > > > window is never a good time to get anyone to pay attention to potentially controversial
> > > > patches.  So for now I've thrown up a branch on kernel.org with Robert's
> > > > series of fixes of related code (that's queued in the ACPI tree for the merge window)
> > > > and Dan Williams (from several years ago) + my additions that 'work' (lightly tested)
> > > > on qemu/arm64 with the generic port patches etc.
> > > 
> > > I'll send out an RFC in a couple of weeks.  In meantime let me know if you
> > > run into any problems or have suggestions to improve them.
> > > 
> > > Jonathan
> > >
> > With the latest commit(d077bf9) in the 'arm-numa-fixes', the qemu virt
> > could create a cxl region with a new numa node (node 2) just like x86.
> > At this stage(the first time to create cxl region), everything works
> > fine.
> > 
> > However, if I use below commands to delete the created cxl region:
> > 
> > `daxctl offline-memory dax0.0`
> > `cxl disable-region region0`
> > `cxl destroy-region region0`
> > 
> > and then recreate it by `cxl create-region -d decoder0.0 -t ram`, the
> > kernel could not create the numa node2 again, and the kernel will print:
> > 
> > [  589.458971] Fallback order for Node 0: 0 1
> > [  589.459136] Fallback order for Node 1: 1 0
> > [  589.459175] Fallback order for Node 2: 0 1
> > [  589.459213] Built 2 zonelists, mobility grouping on.  Total pages: 1009890
> > [  589.459284] Policy zone: Normal  
> 
> I'll see if I can figure out what is happening there.

So I know what is happening, but I'm not sure of the solution yet.
The issue is that on unbind of the region there is a call to try_remove_memory(),
and that calls memblock_phys_free(). That removes the reserved memblocks being used
for tracking the numa node, so when you bind a region at that HPA again, there
is no tracking information.
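
The failure mode described above can be pictured with a toy model. This is only a sketch in Python under loud assumptions: the class `ReservedRanges`, its methods and the addresses are invented for illustration and do not mirror the kernel's memblock code. It merely shows why a lookup succeeds on the first bind and returns nothing after the range has been freed.

```python
# Toy model only: names and addresses are made up, not kernel interfaces.
class ReservedRanges:
    """Maps physical address ranges to a NUMA node id."""

    def __init__(self):
        self._ranges = []  # list of (start, end, node)

    def reserve(self, start, size, node):
        self._ranges.append((start, start + size, node))

    def free(self, start, size):
        end = start + size
        # Drop every tracked range fully covered by [start, end).
        self._ranges = [
            r for r in self._ranges if not (start <= r[0] and r[1] <= end)
        ]

    def node_for(self, addr):
        for start, end, node in self._ranges:
            if start <= addr < end:
                return node
        return None  # no tracking information left


HPA_BASE, HPA_SIZE = 0x1_0000_0000, 0x1000_0000  # arbitrary example window

tracking = ReservedRanges()
tracking.reserve(HPA_BASE, HPA_SIZE, node=2)

first_bind = tracking.node_for(HPA_BASE)   # node is found on first creation
tracking.free(HPA_BASE, HPA_SIZE)          # stands in for the free on unbind
second_bind = tracking.node_for(HPA_BASE)  # nothing left when re-creating
```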

So far I haven't figured out why that call is there in the first place
which isn't helping me solve this.

https://elixir.bootlin.com/linux/v6.9.1/source/mm/memory_hotplug.c#L2286

Until I get this code out there, kind of hard to ask the mm folk
- for now I may just have to say it only works once and point at that
line as the problem in an RFC.

Long shot, but Dan, did you run into this when you were doing your
[PATCH v2 08/22] memblock: Introduce a generic phys_addr_to_target_node()
stuff?  I assume that ultimately called try_remove_memory() in a remove
path somewhere and, similarly to this, if you tried putting it back it
would be missing.  Or alternatively, any idea what that memblock_phys_free()
is balancing with?

Jonathan




> > 
> > Meanwhile, the qemu reports that: 
> > 
> > "qemu-system-aarch64: virtio: bogus descriptor or out of resources"  
> 
> That sounds like another TCG issue, or possibly the DMA bounce buffer
> problem resurfacing.  It's not directly related to this NUMA aspect unless
> something very odd is going on.  I'm even more confused because I think
> you are not using kmem with the above commands, so we shouldn't be using
> the CXL memory for virtio.
> 
> Just to check, you aren't running with KVM I hope?  That opens a much
> bigger problem set. :(
> 
> Jonathan
> 
> 
> 
> > 
> > Many thanks
> > Yuquan
> >   
> 
>

Re: [PATCH 1/1] riscv, gdbstub.c: fix reg_width in ricsv_gen_dynamic_vector_feature()

2024-05-17 Thread Alex Bennée

Daniel Henrique Barboza  writes:

> Commit 33a24910ae changed 'reg_width' to use 'vlenb', i.e. vector length
> in bytes, when in this context we want 'reg_width' as the length in
> bits.
>
> Fix 'reg_width' back to the value in bits like 7cb59921c05a
> ("target/riscv/gdbstub.c: use 'vlenb' instead of shifting 'vlen'") set
> beforehand.
>
> Cc: Akihiko Odaki 
> Cc: Alex Bennée 
> Reported-by: Robin Dapp 
> Fixes: 33a24910ae ("target/riscv: Use GDBFeature for dynamic XML")
> Signed-off-by: Daniel Henrique Barboza 
> ---
>  target/riscv/gdbstub.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/target/riscv/gdbstub.c b/target/riscv/gdbstub.c
> index d0cc5762c2..358158c42a 100644
> --- a/target/riscv/gdbstub.c
> +++ b/target/riscv/gdbstub.c
> @@ -288,7 +288,7 @@ static GDBFeature *riscv_gen_dynamic_csr_feature(CPUState *cs, int base_reg)
>  static GDBFeature *ricsv_gen_dynamic_vector_feature(CPUState *cs, int base_reg)
>  {
>  RISCVCPU *cpu = RISCV_CPU(cs);
> -int reg_width = cpu->cfg.vlenb;
> +int reg_width = cpu->cfg.vlenb << 3;

You could consider renaming the var to reg_bits for clarity but
otherwise:

Acked-by: Alex Bennée 


>  GDBFeatureBuilder builder;
>  int i;

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro
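
For readers skimming the fix, the byte-to-bit conversion in the hunk above amounts to a shift by three. A minimal Python sketch follows; the function name and the example value of 16 bytes are assumptions chosen for illustration, not taken from the patch.

```python
def vector_reg_width_bits(vlenb):
    """Convert a vector register length in bytes to bits (vlenb << 3)."""
    return vlenb << 3


# Example value chosen for illustration: 16 bytes per vector register.
width = vector_reg_width_bits(16)
```

Shifting left by three is the same as multiplying by eight, which is why a variable named for its unit (for example reg_bits) would make the intent harder to get wrong.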

Re: CPR/liveupdate: test results using prior bug fix

2024-05-17 Thread Michael Galaxy


OK, acknowledged. Thanks, All.

- Michael

On 5/16/24 13:07, Steven Sistare wrote:

On 5/16/2024 1:24 PM, Michael Galaxy wrote:

On 5/14/24 08:54, Michael Tokarev wrote:

On 5/14/24 16:39, Michael Galaxy wrote:

Steve,

OK, so it does not look like this bugfix you wrote was included in 
8.2.4 (which was released yesterday). Unfortunately, that means 
that anyone using CPR in that release will still (eventually) 
encounter the bug like I did.


8.2.4 is basically a "bugfix" release for 8.2.3 which I somewhat
screwed up (in a minor way), plus a few currently (at the time)
queued up changes.   8.2.3 was a big release though.

I would recommend that y'all consider cherry-picking, perhaps, the 
relevant commits for a possible 8.2.5 ?


Please Cc changes which are relevant for -stable to, well,
qemu-sta...@nongnu.org :)

Which changes need to be picked up?

Steve, can you comment here, please? At a minimum, we have this one: 
[PULL 20/25] migration: stop vm for cpr


But that pull came with a handful of other changes that are also not 
in QEMU v8, so I suspect I'm missing some other important changes 
that might be important for a stable release?


- Michael


Hi Michael, I sent the full list of commits to this distribution yesterday, and
I see it in my Sent email folder.  Copying verbatim:


Michael Galaxy, I'm afraid you are out of luck with respect to qemu 8.2.
It has some of the cpr reboot commits, but is missing the following:

87a2848 migration: massage cpr-reboot documentation
cbdafc1 migration: options incompatible with cpr
ce5db1c migration: update cpr-reboot description
9867d4d migration: stop vm for cpr
4af667f migration: notifier error checking
bf78a04 migration: refactor migrate_fd_connect failures
6835f5a migration: per-mode notifiers
5663dd3 migration: MigrationNotifyFunc
c763a23e migration: remove postcopy_after_devices
9d9babf migration: MigrationEvent for notifiers
3e77573 migration: convert to NotifierWithReturn
d91f33c migration: remove error from notifier data
be19d83 notify: pass error to notifier with return
b12635f migration: fix coverity migrate_mode finding
2b58a8b tests/qtest: postcopy migration with suspend
b1fdd21 tests/qtest: precopy migration with suspend
5014478 tests/qtest: option to suspend during migration
f064975 tests/qtest: migration events
49a5020 migration: preserve suspended for bg_migration
58b1057 migration: preserve suspended for snapshot
b4e9ddc migration: preserve suspended runstate
d3c86c99 migration: propagate suspended runstate
9ff5e79 cpus: vm_resume
0f1db06 cpus: check running not RUN_STATE_RUNNING
b9ae473 cpus: stop vm in suspended runstate
f06f316 cpus: vm_was_suspended

All of those landed in qemu 9.0.
---

- Steve

Re: [PATCH v3 3/5] hw/block: Add Microchip's 25CSM04 to m25p80

2024-05-17 Thread Miles Glenn



Reviewed-by: Glenn Miles 

-Glenn

On Thu, 2024-05-16 at 11:33 -0500, Chalapathi V wrote:
> Add Microchip's 25CSM04 Serial EEPROM to m25p80.  25CSM04 provides 4 Mbits
> of Serial EEPROM utilizing the Serial Peripheral Interface (SPI) compatible
> bus. The device is organized as 524288 bytes of 8 bits each (512Kbyte) and
> is optimized for use in consumer and industrial applications where reliable
> and dependable nonvolatile memory storage is essential.
> 
> Signed-off-by: Chalapathi V 
> ---
>  hw/block/m25p80.c | 3 +++
>  hw/ppc/Kconfig| 1 +
>  2 files changed, 4 insertions(+)
> 
> diff --git a/hw/block/m25p80.c b/hw/block/m25p80.c
> index 8dec134832..824a6c5c60 100644
> --- a/hw/block/m25p80.c
> +++ b/hw/block/m25p80.c
> @@ -357,6 +357,9 @@ static const FlashPartInfo known_devices[] = {
>.sfdp_read = m25p80_sfdp_w25q512jv },
>  { INFO("w25q01jvq",   0xef4021,  0,  64 << 10, 2048, ER_4K),
>.sfdp_read = m25p80_sfdp_w25q01jvq },
> +
> +/* Microchip */
> +{ INFO("25csm04",  0x29cc00,  0x100,  64 << 10,  8, 0) },
>  };
>  
>  typedef enum {
> diff --git a/hw/ppc/Kconfig b/hw/ppc/Kconfig
> index 6f9670b377..a93430b734 100644
> --- a/hw/ppc/Kconfig
> +++ b/hw/ppc/Kconfig
> @@ -40,6 +40,7 @@ config POWERNV
>  select PCA9552
>  select PCA9554
>  select SSI
> +select SSI_M25P80
>  
>  config PPC405
>  bool
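
As a quick sanity check of the sizes quoted in the commit message, the arithmetic can be worked through in a few lines of Python. This assumes the `64 << 10` and `8` arguments of the INFO() entry are the sector size in bytes and the sector count; that reading is an assumption on my part, and the variable names are made up.

```python
# Assumed meaning of the INFO() arguments: sector size and sector count.
sector_size = 64 << 10   # 64 KiB
n_sectors = 8

total_bytes = sector_size * n_sectors
total_mbits = total_bytes * 8 // (1 << 20)
```

Under that assumption the entry matches the description: 524288 bytes, which is 512 KiB or 4 Mbit.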

RE: [PULL 1/5] ui/console: Only declare variable fence_fd when CONFIG_GBM is defined

2024-05-17 Thread Kim, Dongwon

Thanks and sorry for missing this in the original commit.

Acked-by: Dongwon Kim 

> -Original Message-
> From: Philippe Mathieu-Daudé 
> Sent: Friday, May 17, 2024 8:02 AM
> To: qemu-devel@nongnu.org
> Cc: Cédric Le Goater ; Kim, Dongwon
> ; Marc-André Lureau
> ; Philippe Mathieu-Daudé
> 
> Subject: [PULL 1/5] ui/console: Only declare variable fence_fd when
> CONFIG_GBM is defined
> 
> From: Cédric Le Goater 
> 
> This is to avoid a build breakage:
> 
> ../ui/gtk-egl.c: In function ‘gd_egl_draw’:
> ../ui/gtk-egl.c:73:9: error: unused variable ‘fence_fd’ [-Werror=unused-variable]
>73 | int fence_fd;
>   | ^~~~
> 
> Fixes: fa6426805b12 ("ui/console: Use qemu_dmabuf_set_..() helpers instead")
> Cc: Dongwon Kim 
> Cc: Marc-André Lureau 
> Signed-off-by: Cédric Le Goater 
> Reviewed-by: Philippe Mathieu-Daudé 
> Message-ID: <20240515100520.574383-1-...@redhat.com>
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  ui/gtk-egl.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/ui/gtk-egl.c b/ui/gtk-egl.c
> index 0473f689c9..9831c10e1b 100644
> --- a/ui/gtk-egl.c
> +++ b/ui/gtk-egl.c
> @@ -68,9 +68,9 @@ void gd_egl_draw(VirtualConsole *vc)
>  GdkWindow *window;
>  #ifdef CONFIG_GBM
>  QemuDmaBuf *dmabuf = vc->gfx.guest_fb.dmabuf;
> +int fence_fd;
>  #endif
>  int ww, wh, ws;
> -int fence_fd;
> 
>  if (!vc->gfx.gls) {
>  return;
> --
> 2.41.0

Re: [PATCH v3 5/5] tests/qtest: Add pnv-spi-seeprom qtest

2024-05-17 Thread Miles Glenn

Hi Chalapathi,

Looks good.  I think I would just shorten the names of the xscom
read/write functions to make things more readable inside the
transaction function.

-Glenn

Reviewed-by: Glenn Miles 

> +static uint64_t pnv_spi_seeprom_xscom_addr(uint32_t reg)
> +{
> +return pnv_xscom_addr(SPIC2_XSCOM_BASE + reg);
> +}
> +
> +static void pnv_spi_controller_xscom_write(QTestState *qts, uint32_t reg,
> +uint64_t val)
> +{
> +qtest_writeq(qts, pnv_spi_seeprom_xscom_addr(reg), val);
> +}
> +
> +static uint64_t pnv_spi_controller_xscom_read(QTestState *qts, uint32_t reg)
> +{
> +return qtest_readq(qts, pnv_spi_seeprom_xscom_addr(reg));
> +}
> +
> +static void spi_seeprom_transaction(QTestState *qts)
> +{
> +/* SPI transactions to SEEPROM to read from SEEPROM image */
> +pnv_spi_controller_xscom_write(qts, COUNTER_CONFIG_REG,
> +READ_OP_COUNTER_CONFIG);
> +pnv_spi_controller_xscom_write(qts, SEQUENCER_OPERATION_REG,
> +READ_OP_SEQUENCER);
> +pnv_spi_controller_xscom_write(qts, TRANSMIT_DATA_REG, READ_OP_TDR_DATA);
> +pnv_spi_controller_xscom_write(qts, TRANSMIT_DATA_REG, 0);
> +/* Read 5*8 bytes from SEEPROM at 0x100 */
> +uint64_t rdr_val = pnv_spi_controller_xscom_read(qts, RECEIVE_DATA_REG);
> +printf("RDR READ = 0x%lx\n", rdr_val);
> +rdr_val = pnv_spi_controller_xscom_read(qts, RECEIVE_DATA_REG);
> +rdr_val = pnv_spi_controller_xscom_read(qts, RECEIVE_DATA_REG);
> +rdr_val = pnv_spi_controller_xscom_read(qts, RECEIVE_DATA_REG);
> +rdr_val = pnv_spi_controller_xscom_read(qts, RECEIVE_DATA_REG);
> +printf("RDR READ = 0x%lx\n", rdr_val);
> +
> +/* SPI transactions to SEEPROM to write to SEEPROM image */
> +pnv_spi_controller_xscom_write(qts, COUNTER_CONFIG_REG,
> +WRITE_OP_COUNTER_CONFIG);
> +/* Set Write Enable Latch bit of status0 register */
> +pnv_spi_controller_xscom_write(qts, SEQUENCER_OPERATION_REG,
> +WRITE_OP_SEQUENCER);
> +pnv_spi_controller_xscom_write(qts, TRANSMIT_DATA_REG, WRITE_OP_WREN);
> +/* write 8 bytes to SEEPROM at 0x100 */
> +pnv_spi_controller_xscom_write(qts, SEQUENCER_OPERATION_REG,
> +WRITE_OP_SEQUENCER);
> +pnv_spi_controller_xscom_write(qts, TRANSMIT_DATA_REG, WRITE_OP_TDR_DATA);
> +}
> +
> +/* Find complete path of in_file in the current working directory */
> +static void find_file(const char *in_file, char *in_path)
> +{
> +g_autofree char *cwd = g_get_current_dir();
> +char *filepath = g_build_filename(cwd, in_file, NULL);
> +if (!access(filepath, F_OK)) {
> +strcpy(in_path, filepath);
> +} else {
> +strcpy(in_path, "");
> +printf("File %s not found within %s\n", in_file, cwd);
> +}
> +}
> +
> +static void test_spi_seeprom(void)
> +{
> +QTestState *qts = NULL;
> +char seepromfile[500];
> +find_file("sbe_measurement_seeprom.bin.ecc", seepromfile);
> +if (strcmp(seepromfile, "")) {
> +printf("Starting QEMU with seeprom file.\n");
> +qts = qtest_initf("-m 2G -machine powernv10 -smp 2,cores=2,"
> +  "threads=1 -accel tcg,thread=single -nographic "
> +  "-blockdev node-name=pib_spic2,driver=file,"
> +   "filename=sbe_measurement_seeprom.bin.ecc "
> +   "-device 25csm04,bus=pnv-spi-bus.2,cs=0,"
> +   "drive=pib_spic2");
> +} else {
> +printf("Starting QEMU without seeprom file.\n");
> +qts = qtest_initf("-m 2G -machine powernv10 -smp 2,cores=2,"
> +  "threads=1 -accel tcg,thread=single -nographic"
> +   " -device 25csm04,bus=pnv-spi-bus.2,cs=0");
> +}
> +spi_seeprom_transaction(qts);
> +qtest_quit(qts);
> +}
> +
> +int main(int argc, char **argv)
> +{
> +g_test_init(&argc, &argv, NULL);
> +qtest_add_func("spi_seeprom", test_spi_seeprom);
> +return g_test_run();
> +}
> diff --git a/tests/qtest/meson.build b/tests/qtest/meson.build
> index 86293051dc..2fa98b2430 100644
> --- a/tests/qtest/meson.build
> +++ b/tests/qtest/meson.build
> @@ -171,6 +171,7 @@ qtests_ppc64 = \
>qtests_ppc + \
>(config_all_devices.has_key('CONFIG_PSERIES') ? ['device-plug-test'] : []) +   \
>(config_all_devices.has_key('CONFIG_POWERNV') ? ['pnv-xscom-test'] : []) + \
> +  (config_all_devices.has_key('CONFIG_POWERNV') ? ['pnv-spi-seeprom-test'] : []) +   \
>(config_all_devices.has_key('CONFIG_POWERNV') ? ['pnv-host-i2c-test'] : []) +  \
>(config_all_devices.has_key('CONFIG_PSERIES') ? ['rtas-test'] : []) +  \
>(slirp.found() ? ['pxe-test'] : []) +  \

Re: [PATCH 1/1] block: drop force_dup parameter of raw_reconfigure_getfd()

2024-05-17 Thread Denis V. Lunev


On 4/30/24 19:02, Denis V. Lunev wrote:

This parameter is always passed as 'false' from the caller.

Signed-off-by: Denis V. Lunev 
CC: Andrey Zhadchenko 
CC: Kevin Wolf 
CC: Hanna Reitz 
---
  block/file-posix.c | 8 +++-
  1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index 35684f7e21..5c46938936 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -1039,8 +1039,7 @@ static int fcntl_setfl(int fd, int flag)
  }
  
  static int raw_reconfigure_getfd(BlockDriverState *bs, int flags,

- int *open_flags, uint64_t perm, bool force_dup,
- Error **errp)
+ int *open_flags, uint64_t perm, Error **errp)
  {
  BDRVRawState *s = bs->opaque;
  int fd = -1;
@@ -1068,7 +1067,7 @@ static int raw_reconfigure_getfd(BlockDriverState *bs, int flags,
  assert((s->open_flags & O_ASYNC) == 0);
  #endif
  
-if (!force_dup && *open_flags == s->open_flags) {

+if (*open_flags == s->open_flags) {
  /* We're lucky, the existing fd is fine */
  return s->fd;
  }
@@ -3748,8 +3747,7 @@ static int raw_check_perm(BlockDriverState *bs, uint64_t perm, uint64_t shared,
  int ret;
  
  /* We may need a new fd if auto-read-only switches the mode */

-ret = raw_reconfigure_getfd(bs, input_flags, &open_flags, perm,
-false, errp);
+ret = raw_reconfigure_getfd(bs, input_flags, &open_flags, perm, errp);
  if (ret < 0) {
  return ret;
  } else if (ret != s->fd) {

ping

Re: [PATCH 1/1] prealloc: add truncate mode for prealloc filter

2024-05-17 Thread Denis V. Lunev


On 4/30/24 19:05, Denis V. Lunev wrote:

The preallocate filter makes some really interesting setups possible.

Assume that we have
* shared block device, f.e. iSCSI LUN, implemented with some HW device
* clustered LVM on top of it
* QCOW2 image stored inside LVM volume

This allows very cheap clustered setups with all QCOW2 features intact.
Currently supported setups using QCOW2 with data_file option are not
so cool as snapshots are not allowed, QCOW2 should be placed into some
additional distributed storage and so on.

QCOW2 inside an LVM volume has a drawback, though. The image keeps growing,
and in order to accommodate it the LVM volume has to be resized. This
could be done externally using the ENOSPACE event/condition, but this is
cumbersome.

This patch introduces a native implementation for such a setup. We should
just put the prealloc filter in between the QCOW2 format and file nodes. In that
case LVM will be resized at the proper moment, and that is done efficiently
as resizing is done in chunks.

The patch adds allocation mode for this purpose in order to distinguish
'fallocate' for ordinary file system and 'truncate'.

Signed-off-by: Denis V. Lunev 
CC: Alexander Ivanov 
CC: Kevin Wolf 
CC: Hanna Reitz 
CC: Vladimir Sementsov-Ogievskiy 
---
  block/preallocate.c | 50 +++--
  1 file changed, 48 insertions(+), 2 deletions(-)

diff --git a/block/preallocate.c b/block/preallocate.c
index 4d82125036..6d31627325 100644
--- a/block/preallocate.c
+++ b/block/preallocate.c
@@ -33,10 +33,24 @@
  #include "block/block-io.h"
  #include "block/block_int.h"
  
+typedef enum PreallocateMode {

+PREALLOCATE_MODE_FALLOCATE = 0,
+PREALLOCATE_MODE_TRUNCATE = 1,
+PREALLOCATE_MODE__MAX = 2,
+} PreallocateMode;
+
+static QEnumLookup prealloc_mode_lookup = {
+.array = (const char *const[]) {
+"falloc",
+"truncate",
+},
+.size = PREALLOCATE_MODE__MAX,
+};
  
  typedef struct PreallocateOpts {

  int64_t prealloc_size;
  int64_t prealloc_align;
+PreallocateMode prealloc_mode;
  } PreallocateOpts;
  
  typedef struct BDRVPreallocateState {

@@ -79,6 +93,7 @@ typedef struct BDRVPreallocateState {
  
  #define PREALLOCATE_OPT_PREALLOC_ALIGN "prealloc-align"

  #define PREALLOCATE_OPT_PREALLOC_SIZE "prealloc-size"
+#define PREALLOCATE_OPT_MODE "mode"
  static QemuOptsList runtime_opts = {
  .name = "preallocate",
  .head = QTAILQ_HEAD_INITIALIZER(runtime_opts.head),
@@ -94,7 +109,14 @@ static QemuOptsList runtime_opts = {
  .type = QEMU_OPT_SIZE,
  .help = "how much to preallocate, default 128M",
  },
-{ /* end of list */ }
+{
+.name = PREALLOCATE_OPT_MODE,
+.type = QEMU_OPT_STRING,
+.help = "Preallocation mode on image expansion "
+"(allowed values: falloc, truncate)",
+.def_value_str = "falloc",
+},
+{ /* end of list */ },
  },
  };
  
@@ -102,6 +124,8 @@ static bool preallocate_absorb_opts(PreallocateOpts *dest, QDict *options,

  BlockDriverState *child_bs, Error **errp)
  {
  QemuOpts *opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
+Error *local_err = NULL;
+char *buf;
  
  if (!qemu_opts_absorb_qdict(opts, options, errp)) {

  return false;
@@ -112,6 +136,17 @@ static bool preallocate_absorb_opts(PreallocateOpts *dest, QDict *options,
  dest->prealloc_size =
  qemu_opt_get_size(opts, PREALLOCATE_OPT_PREALLOC_SIZE, 128 * MiB);
  
+buf = qemu_opt_get_del(opts, PREALLOCATE_OPT_MODE);

+/* prealloc_mode can be downgraded later during allocate_clusters */
+dest->prealloc_mode = qapi_enum_parse(&prealloc_mode_lookup, buf,
+  PREALLOCATE_MODE_FALLOCATE,
+  &local_err);
+g_free(buf);
+if (local_err != NULL) {
+error_propagate(errp, local_err);
+return false;
+}
+
  qemu_opts_del(opts);
  
  if (!QEMU_IS_ALIGNED(dest->prealloc_align, BDRV_SECTOR_SIZE)) {

@@ -335,9 +370,20 @@ handle_write(BlockDriverState *bs, int64_t offset, int64_t bytes,
  
  want_merge_zero = want_merge_zero && (prealloc_start <= offset);
  
-ret = bdrv_co_pwrite_zeroes(

+switch (s->opts.prealloc_mode) {
+case PREALLOCATE_MODE_FALLOCATE:
+ret = bdrv_co_pwrite_zeroes(
  bs->file, prealloc_start, prealloc_end - prealloc_start,
  BDRV_REQ_NO_FALLBACK | BDRV_REQ_SERIALISING | BDRV_REQ_NO_WAIT);
+break;
+case PREALLOCATE_MODE_TRUNCATE:
+ret = bdrv_co_truncate(bs->file, prealloc_end, false,
+   PREALLOC_MODE_OFF, 0, NULL);
+break;
+default:
+return false;
+}
+
  if (ret < 0) {
  s->file_end = ret;
  return false;

ping
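
To summarise the shape of the change for anyone picking this up later, here is a minimal Python sketch of the two steps the patch adds: parsing the mode string with a default, then dispatching on the mode when the file needs to grow. It is a simplified model, not QEMU code; `parse_mode`, `grow` and the returned tuples are hypothetical names invented for illustration, and only the option strings "falloc" and "truncate" come from the patch.

```python
from enum import IntEnum


class PreallocateMode(IntEnum):
    FALLOCATE = 0
    TRUNCATE = 1


# Option strings as they appear in the patch; the dict itself is illustrative.
MODE_LOOKUP = {
    "falloc": PreallocateMode.FALLOCATE,
    "truncate": PreallocateMode.TRUNCATE,
}


def parse_mode(value, default=PreallocateMode.FALLOCATE):
    """Map an option string to a mode, falling back to the default if unset."""
    if value is None:
        return default
    try:
        return MODE_LOOKUP[value]
    except KeyError:
        raise ValueError(f"invalid preallocation mode: {value!r}") from None


def grow(mode, prealloc_start, prealloc_end):
    """Describe the operation each mode would issue for the given range."""
    if mode is PreallocateMode.FALLOCATE:
        # Zero-write the new range: (operation, offset, length).
        return ("write_zeroes", prealloc_start, prealloc_end - prealloc_start)
    if mode is PreallocateMode.TRUNCATE:
        # Resize the underlying node to the new end: (operation, new size).
        return ("truncate", prealloc_end)
    raise ValueError(mode)
```

For example, `grow(parse_mode("truncate"), 100, 228)` describes a resize to 228, while the default mode describes a zero-write of 128 bytes starting at offset 100.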

Re: [PATCH v3 1/5] ppc/pnv: Add SPI controller model

2024-05-17 Thread Miles Glenn

Hi Chalapathi,

Looks good.  Just some suggestions on readability and some
simplifications (see below).

Thanks,

Glenn

On Thu, 2024-05-16 at 11:33 -0500, Chalapathi V wrote:
> SPI controller device model supports a connection to a single SPI responder.
> This provides access to SPI seeproms, TPM, flash device and an ADC controller.
> 
> All SPI function control is mapped into the SPI register space to enable full
> control by firmware. In this commit SPI configuration component is modelled
> which contains all SPI configuration and status registers as well as the hold
> registers for data to be sent or having been received.
> 
> An existing QEMU SSI framework is used and SSI_BUS is created.
> 
> Signed-off-by: Chalapathi V 
> ---
>  include/hw/ppc/pnv_xscom.h|   3 +
>  include/hw/ssi/pnv_spi.h  |  44 +++
>  include/hw/ssi/pnv_spi_regs.h | 114 +
>  hw/ppc/pnv_spi_controller.c   | 228
> ++
>  hw/ppc/Kconfig|   1 +
>  hw/ppc/meson.build|   1 +
>  hw/ppc/trace-events   |   6 +
>  7 files changed, 397 insertions(+)
>  create mode 100644 include/hw/ssi/pnv_spi.h
>  create mode 100644 include/hw/ssi/pnv_spi_regs.h
>  create mode 100644 hw/ppc/pnv_spi_controller.c
> 
> diff --git a/include/hw/ppc/pnv_xscom.h b/include/hw/ppc/pnv_xscom.h
> index 6209e18492..a77b97f9b1 100644
> --- a/include/hw/ppc/pnv_xscom.h
> +++ b/include/hw/ppc/pnv_xscom.h
> @@ -194,6 +194,9 @@ struct PnvXScomInterfaceClass {
>  #define PNV10_XSCOM_PEC_PCI_BASE   0x8010800 /* index goes upwards ... */
>  #define PNV10_XSCOM_PEC_PCI_SIZE   0x200
>  
> +#define PNV10_XSCOM_PIB_SPIC_BASE 0xc
> +#define PNV10_XSCOM_PIB_SPIC_SIZE 0x20
> +
>  void pnv_xscom_init(PnvChip *chip, uint64_t size, hwaddr addr);
>  int pnv_dt_xscom(PnvChip *chip, void *fdt, int root_offset,
>   uint64_t xscom_base, uint64_t xscom_size,
> diff --git a/include/hw/ssi/pnv_spi.h b/include/hw/ssi/pnv_spi.h
> new file mode 100644
> index 00..244ee1cfc0
> --- /dev/null
> +++ b/include/hw/ssi/pnv_spi.h
> @@ -0,0 +1,44 @@
> +/*
> + * QEMU PowerPC SPI Controller model
> + *
> + * Copyright (c) 2024, IBM Corporation.
> + *
> + * SPDX-License-Identifier: GPL-2.0-or-later
> + *
> + * This model Supports a connection to a single SPI responder.
> + * Introduced for P10 to provide access to SPI seeproms, TPM, flash device
> + * and an ADC controller.
> + */
> +#include "hw/ssi/ssi.h"
> +
> +#ifndef PPC_PNV_SPI_CONTROLLER_H
> +#define PPC_PNV_SPI_CONTROLLER_H
> +
> +#define TYPE_PNV_SPI_CONTROLLER "pnv-spi-controller"
> +#define PNV_SPICONTROLLER(obj) \
> +OBJECT_CHECK(PnvSpiController, (obj), TYPE_PNV_SPI_CONTROLLER)
> +
> +#define SPI_CONTROLLER_REG_SIZE 8
> +
> +#define TYPE_PNV_SPI_BUS "pnv-spi-bus"
> +typedef struct PnvSpiController {
> +SysBusDevice parent_obj;
> +
> +SSIBus *ssi_bus;
> +qemu_irq *cs_line;
> +MemoryRegionxscom_spic_regs;
> +/* SPI controller object number */
> +uint32_tspic_num;
> +

Would it be better to make these into an array of registers?  It would
probably simplify your read/write functions.

> +/* SPI Controller registers */
> +uint64_terror_reg;
> +uint64_tcounter_config_reg;
> +uint64_tconfig_reg1;
> +uint64_tclock_config_reset_control;
> +uint64_tmemory_mapping_reg;
> +uint64_ttransmit_data_reg;
> +uint64_treceive_data_reg;
> +uint8_t sequencer_operation_reg[SPI_CONTROLLER_REG_SIZE];
> +uint64_tstatus_reg;
> +} PnvSpiController;
> +#endif /* PPC_PNV_SPI_CONTROLLER_H */
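
The suggestion above to keep the registers in an array can be sketched as follows. This is a toy Python model, not the C the patch would use: the `RegisterFile` class, the register count and the out-of-range behaviour are assumptions made for illustration; only the offsets 0x00 and 0x01 echo the ERROR_REG and COUNTER_CONFIG_REG defines in the patch.

```python
# Toy register file; count and out-of-range behaviour are made up.
NUM_REGS = 8
REG_ERROR = 0x00           # matches ERROR_REG in the patch
REG_COUNTER_CONFIG = 0x01  # matches COUNTER_CONFIG_REG in the patch
MASK_64 = 0xFFFF_FFFF_FFFF_FFFF


class RegisterFile:
    """Registers stored in one list, indexed by register offset."""

    def __init__(self):
        self.regs = [0] * NUM_REGS

    def read(self, offset):
        # One bounds check replaces a per-register switch statement.
        if not 0 <= offset < NUM_REGS:
            return 0
        return self.regs[offset]

    def write(self, offset, value):
        if 0 <= offset < NUM_REGS:
            self.regs[offset] = value & MASK_64


rf = RegisterFile()
rf.write(REG_COUNTER_CONFIG, 0x1234)
```

Registers with side effects on access would still need special handling on top of the array lookup, so this only shows the common path.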
> diff --git a/include/hw/ssi/pnv_spi_regs.h
> b/include/hw/ssi/pnv_spi_regs.h
> new file mode 100644
> index 00..6f613aca5e
> --- /dev/null
> +++ b/include/hw/ssi/pnv_spi_regs.h
> @@ -0,0 +1,114 @@
> +/*
> + * QEMU PowerPC SPI Controller model
> + *
> + * Copyright (c) 2023, IBM Corporation.
> + *
> + * SPDX-License-Identifier: GPL-2.0-or-later
> + */
> +
> +#ifndef SPI_CONTROLLER_REGS_H
> +#define SPI_CONTROLLER_REGS_H
> +

In order to improve readability, I think all of these register/field
names should be shortened so that code fits into the 80 char limit more
easily.  I think they should also be prefixed with SPI_*.  Probably
don't need the "REG" in the middle of the field names either, just keep
it in the register names?  Possible suggestions follow to give a better
idea of what I'm talking about...

> +/* Error Register */
> +#define ERROR_REG   0x00
> +

I would change this to something like SPI_CTR_CFG_REG.

> +/* counter_config_reg */
> +#define COUNTER_CONFIG_REG  0x01

SPI_CTR_CFG_N1/N2
> +#define COUNTER_CONFIG_REG_SHIFT_COUNT_N1   PPC_BITMASK(0, 7)
> +#define COUNTER_CONFIG_REG_SHIFT_COUNT_N2   PPC_BITMASK(8, 15)

SPI_CTR_CFG_CMP1/CMP2
> +#define COUNTER_CONFIG_REG_COUNT_COMPARE1   PPC_BITMASK(24, 31)
> +#define

Re: [PATCH v7 00/12] Enabling DCD emulation support in Qemu

2024-05-17 Thread fan

On Fri, May 17, 2024 at 01:18:52PM +0100, Jonathan Cameron wrote:
> On Thu, 16 May 2024 10:05:33 -0700
> fan  wrote:
> 
> > On Fri, Apr 19, 2024 at 02:24:36PM -0400, Gregory Price wrote:
> > > On Thu, Apr 18, 2024 at 04:10:51PM -0700, nifan@gmail.com wrote:  
> > > > A git tree of this series can be found here (with one extra commit on top
> > > > for printing out accepted/pending extent list):
> > > > https://github.com/moking/qemu/tree/dcd-v7
> > > > 
> > > > v6->v7:
> > > > 
> > > > 1. Fixed the dvsec range register issue mentioned in the cover letter in v6.
> > > >    Only relevant bits are set to mark the device ready (Patch 6). (Jonathan)
> > > > 2. Moved the if statement in cxl_setup_memory from Patch 6 to Patch 4. (Jonathan)
> > > > 3. Used MIN instead of if statement to get record_count in Patch 7. (Jonathan)
> > > > 4. Added "Reviewed-by" tag to Patch 7.
> > > > 5. Modified cxl_dc_extent_release_dry_run so the updated extent list can be
> > > >    reused in cmd_dcd_release_dyn_cap to simplify the process in Patch 8. (Jørgen)
> > > > 6. Added comments to indicate further "TODO" items in cmd_dcd_add_dyn_cap_rsp.
> > > >    (Jonathan)
> > > > 7. Avoided irrelevant code reformat in Patch 8. (Jonathan)
> > > > 8. Modified QMP interfaces for adding/releasing DC extents to allow passing
> > > >    tags, selection policy, flags in the interface. (Jonathan, Gregory)
> > > > 9. Redesigned the pending list so extents in the same requests are grouped
> > > >    together. A new data structure is introduced to represent "extent group"
> > > >    in pending list.  (Jonathan)
> > > > 10. Added support in QMP interface for "More" flag.
> > > > 11. Check "Forced removal" flag for release request and not let it pass through.
> > > > 12. Removed the dynamic capacity log type from CxlEventLog definition in cxl.json
> > > >     to avoid the side effect it may introduce to inject error to DC event log.
> > > >     (Jonathan)
> > > > 13. Hard coded the event log type to dynamic capacity event log in QMP
> > > >     interfaces. (Jonathan)
> > > > 14. Adding space in between "-1]". (Jonathan)
> > > > 15. Some minor comment fixes.
> > > > 
> > > > The code is tested with similar setup and has passed similar tests as listed
> > > > in the cover letter of v5[1] and v6[2].
> > > > Also, the code is tested with the latest DCD kernel patchset[3].
> > > > 
> > > > [1] Qemu DCD patchset v5: 
> > > > https://lore.kernel.org/linux-cxl/20240304194331.1586191-1-nifan@gmail.com/T/#t
> > > > [2] Qemu DCD patchset v6: 
> > > > https://lore.kernel.org/linux-cxl/20240325190339.696686-1-nifan@gmail.com/T/#t
> > > > [3] DCD kernel patches: 
> > > > https://lore.kernel.org/linux-cxl/20240324-dcd-type2-upstream-v1-0-b7b00d623...@intel.com/T/#m11c571e21c4fe17c7d04ec5c2c7bc7cbf2cd07e3
> > > >  
> > > 
> > > added review to all patches, will hopefully be able to add a Tested-by
> > > tag early next week, along with a v1 RFC for MHD bit-tracking.
> > > 
> > > We've been testing v5/v6 for a bit, so I expect as soon as we get the
> > > MHD code ported over to v7 i'll ship a tested-by tag pretty quick.
> > > 
> > > The super-set release will complicate a few things but this doesn't
> > > look like a blocker on our end, just a change to how we track bits in a
> > > shared bit/bytemap.
> > >   
> > 
> > Hi Gregory,
> > I am planning to address all the concerns in this series and send out v8
> > next week. Jonathan mentioned you have a few related patches built on top
> > of this series, can you point me to the latest version so I can look
> > into it? Also, would you like me to carry them over to send together
> > with my series in next version? It could be easier for you to avoid the
> > potential rebase needed for your patches?
> 
> I wasn't clear - I meant other way around.
> This series is built on a couple of Gregory's patches.  Gregory can suffer
> the pain of rebasing his stuff ;) (or I'll do it depending on when things
> land).
> 
> hw/cxl/mailbox: change CCI cmd set structure to be a member, not a reference 
> https://gitlab.com/jic23/qemu/-/commit/f44ebc5a455ccdd6535879b0c5824e0d76b04da5
> hw/cxl/mailbox: interface to add CCI commands to an existing CCI 
> https://gitlab.com/jic23/qemu/-/commit/00a4dd8b388add03c588298f665ee918626296a5
> 
> I was suggesting your next posting should just include those two with
> your sign-off added. That way if everyone is happy with v8 Michael Tsirkin
> can pick it up directly, saving a step.
> 
> Make sure to add Michael to the to list as well for next version.
> 
> Thanks,
> 
> Jonathan

Oh, I totally misunderstood.

Sure. I will add the two patches with my sign-off to my next post.

Fan
> 
> > 
> > Let me know.
> > 
> > Thanks,
> > Fan
> > 
> > > > 
> > > > Fan Ni (12):
> > > >   hw/cxl/cxl-mailbox-utils: Add dc_event_log_size field to output
> > > >

[PULL 2/5] hw/pflash: fix block write start

2024-05-17 Thread Philippe Mathieu-Daudé

From: Gerd Hoffmann 

Move the pflash_blk_write_start() call.  We need the offset of the
first data write, not the offset for the setup (number-of-bytes)
write.  Without this fix u-boot can do block writes to the first
flash block only.

While at it, drop a leftover FIXME.
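
The ordering issue described above can be illustrated with a toy model in
C. This is only a sketch under stated assumptions, not the QEMU device:
ToyFlash, toy_write() and the three-cycle state machine are hypothetical
simplifications, and the raw offset is latched without any block
alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of a write-to-buffer sequence; not the QEMU device. */
typedef struct {
    int wcycle;          /* 0: idle, 1: expect count, 2: expect data */
    int64_t blk_offset;  /* -1 until the first data write latches it */
    uint32_t counter;    /* remaining data writes */
} ToyFlash;

static void toy_reset(ToyFlash *f)
{
    assert(f != NULL);
    f->wcycle = 0;
    f->blk_offset = -1;
    f->counter = 0;
}

static void toy_write(ToyFlash *f, uint64_t offset, uint32_t value)
{
    assert(f != NULL);
    switch (f->wcycle) {
    case 0: /* command cycle: 0xe8 selects "write to buffer" */
        if (value == 0xe8) {
            f->wcycle = 1;
        }
        break;
    case 1: /* setup cycle: value is the count, offset is NOT latched */
        f->counter = value;
        f->wcycle = 2;
        break;
    case 2: /* data cycle: latch the offset of the first data write */
        if (f->blk_offset == -1 && f->counter) {
            f->blk_offset = (int64_t)offset;
        }
        if (f->counter) {
            f->counter--;
        }
        break;
    }
}
```

In this model, issuing the command and count at offset 0 followed by data
at another offset leaves blk_offset pointing at the data offset, which is
the behaviour the fix is after.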

Cc: qemu-sta...@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2343
Fixes: 284a7ee2e290 ("hw/pflash: implement update buffer for block writes")
Signed-off-by: Gerd Hoffmann 
Reviewed-by: Philippe Mathieu-Daudé 
Message-ID: <20240516121237.534875-1-kra...@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/block/pflash_cfi01.c | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/hw/block/pflash_cfi01.c b/hw/block/pflash_cfi01.c
index 1bda8424b9..c8f1cf5a87 100644
--- a/hw/block/pflash_cfi01.c
+++ b/hw/block/pflash_cfi01.c
@@ -518,10 +518,6 @@ static void pflash_write(PFlashCFI01 *pfl, hwaddr offset,
 break;
 case 0xe8: /* Write to buffer */
 trace_pflash_write(pfl->name, "write to buffer");
-/* FIXME should save @offset, @width for case 1+ */
-qemu_log_mask(LOG_UNIMP,
-  "%s: Write to buffer emulation is flawed\n",
-  __func__);
 pfl->status |= 0x80; /* Ready! */
 break;
 case 0xf0: /* Probe for AMD flash */
@@ -574,7 +570,6 @@ static void pflash_write(PFlashCFI01 *pfl, hwaddr offset,
 }
 pfl->counter = value;
 pfl->wcycle++;
-pflash_blk_write_start(pfl, offset);
 break;
 case 0x60:
 if (cmd == 0xd0) {
@@ -605,6 +600,9 @@ static void pflash_write(PFlashCFI01 *pfl, hwaddr offset,
 switch (pfl->cmd) {
 case 0xe8: /* Block write */
 /* FIXME check @offset, @width */
+if (pfl->blk_offset == -1 && pfl->counter) {
+pflash_blk_write_start(pfl, offset);
+}
 if (!pfl->ro && (pfl->blk_offset != -1)) {
 pflash_data_write(pfl, offset, value, width, be);
 } else {
-- 
2.41.0

[PULL 4/5] tests: add testing of parameter=1 for SMP topology

2024-05-17 Thread Philippe Mathieu-Daudé

From: Daniel P. Berrangé 

Validate that it is possible to pass 'parameter=1' for any SMP topology
parameter, since unsupported parameters are implicitly considered to
always have a value of 1.

Signed-off-by: Daniel P. Berrangé 
Reviewed-by: Zhao Liu 
Reviewed-by: Ján Tomko 
Message-ID: <20240513123358.612355-3-berra...@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé 
---
 tests/unit/test-smp-parse.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/tests/unit/test-smp-parse.c b/tests/unit/test-smp-parse.c
index 56165e6644..9fdba24fce 100644
--- a/tests/unit/test-smp-parse.c
+++ b/tests/unit/test-smp-parse.c
@@ -330,6 +330,14 @@ static const struct SMPTestData data_generic_valid[] = {
 .config = SMP_CONFIG_GENERIC(T, 8, T, 2, T, 4, T, 2, T, 16),
 .expect_prefer_sockets = CPU_TOPOLOGY_GENERIC(8, 2, 4, 2, 16),
 .expect_prefer_cores   = CPU_TOPOLOGY_GENERIC(8, 2, 4, 2, 16),
+}, {
+/*
+ * Unsupported parameters are always allowed to be set to '1'
+ * config: -smp 8,books=1,drawers=1,sockets=2,modules=1,dies=1,cores=2,threads=2,maxcpus=8
+ * expect: cpus=8,sockets=2,cores=2,threads=2,maxcpus=8 */
+.config = SMP_CONFIG_WITH_FULL_TOPO(8, 1, 1, 2, 1, 1, 2, 2, 8),
+.expect_prefer_sockets = CPU_TOPOLOGY_GENERIC(8, 2, 2, 2, 8),
+.expect_prefer_cores   = CPU_TOPOLOGY_GENERIC(8, 2, 2, 2, 8),
 },
 };
 
-- 
2.41.0

[PULL 0/5] Misc HW patches & fixes for 2024-05-17

2024-05-17 Thread Philippe Mathieu-Daudé

WARNING & ERROR from checkpatch.pl in tests/unit/test-smp-parse.c
deliberately ignored.

The following changes since commit 85ef20f1673feaa083f4acab8cf054df77b0dbed:

  Merge tag 'pull-maintainer-may24-160524-2' of https://gitlab.com/stsquad/qemu into staging (2024-05-16 10:02:56 +0200)

are available in the Git repository at:

  https://github.com/philmd/qemu.git tags/hw-misc-20240517

for you to fetch changes up to 93a3048dcf4565c73f2aa1d751f7197e296f1f1f:

  tests: Gently exit from GDB when tests complete (2024-05-17 16:49:04 +0200)


Misc HW patches queue

- Fix build when GBM buffer management library is detected (Cédric)
- Fix PFlash block write (Gerd)
- Allow 'parameter=1' for SMP topology on any machine (Daniel)
- Allow guest-debug tests to run with recent GDB (Gustavo)



Cédric Le Goater (1):
  ui/console: Only declare variable fence_fd when CONFIG_GBM is defined

Daniel P. Berrangé (2):
  hw/core: allow parameter=1 for SMP topology on any machine
  tests: add testing of parameter=1 for SMP topology

Gerd Hoffmann (1):
  hw/pflash: fix block write start

Gustavo Romero (1):
  tests: Gently exit from GDB when tests complete

 hw/block/pflash_cfi01.c   |  8 ++-
 hw/core/machine-smp.c | 84 ++-
 tests/unit/test-smp-parse.c   | 16 --
 ui/gtk-egl.c  |  2 +-
 tests/guest-debug/test_gdbstub.py |  2 +-
 5 files changed, 44 insertions(+), 68 deletions(-)

-- 
2.41.0

[PULL 5/5] tests: Gently exit from GDB when tests complete

2024-05-17 Thread Philippe Mathieu-Daudé

From: Gustavo Romero 

GDB commit a207f6b3a38 ('Rewrite "python" command exception handling')
changed how exit() called from Python scripts loaded by GDB behaves,
turning it into an exception instead of a generic error code that is
returned. This change caused several QEMU tests to crash with the
following exception:

Python Exception : 0
Error occurred in Python: 0

This happens because in tests/guest-debug/test_gdbstub.py exit is
called after the tests have completed.

This commit fixes it by politely asking GDB to exit via gdb.execute,
passing the proper fail_count to be reported to 'make', instead of
abruptly calling exit() from the Python script.

Signed-off-by: Gustavo Romero 
Reviewed-by: Philippe Mathieu-Daudé 
Message-ID: <20240515173132.2462201-4-gustavo.rom...@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé 
---
 tests/guest-debug/test_gdbstub.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/guest-debug/test_gdbstub.py b/tests/guest-debug/test_gdbstub.py
index 7f71d34da1..46fbf98f0c 100644
--- a/tests/guest-debug/test_gdbstub.py
+++ b/tests/guest-debug/test_gdbstub.py
@@ -57,4 +57,4 @@ def main(test, expected_arch=None):
 pass
 
 print("All tests complete: {} failures".format(fail_count))
-exit(fail_count)
+gdb.execute(f"exit {fail_count}")
-- 
2.41.0

[PULL 3/5] hw/core: allow parameter=1 for SMP topology on any machine

2024-05-17 Thread Philippe Mathieu-Daudé

From: Daniel P. Berrangé 

This effectively reverts

  commit 54c4ea8f3ae614054079395842128a856a73dbf9
  Author: Zhao Liu 
  Date:   Sat Mar 9 00:01:37 2024 +0800

hw/core/machine-smp: Deprecate unsupported "parameter=1" SMP configurations

but is not done as a 'git revert' since the part of the changes to the
file hw/core/machine-smp.c which add 'has_XXX' checks remain desirable.
Furthermore, we have to tweak the subsequently added unit test to
account for differing warning message.

The rationale for the original deprecation was:

  "Currently, it was allowed for users to specify the unsupported
   topology parameter as "1". For example, x86 PC machine doesn't
   support drawer/book/cluster topology levels, but user could specify
   "-smp drawers=1,books=1,clusters=1".

   This is meaningless and confusing, so that the support for this kind
   of configurations is marked deprecated since 9.0."

There are varying POVs on the topic of 'unsupported' topology levels.

It is common to say that on a system without hyperthreading there
is always 1 thread. Likewise when new CPUs introduced a concept of
multiple 'dies', it was reasonable to say that all historical CPUs
before that implicitly had 1 'die'. Likewise for the more recently
introduced 'modules' and 'clusters' parameters. From this POV, it is
valid to set 'parameter=1' on the -smp command line for any machine;
only a value > 1 is strictly an error condition.

It doesn't cause any functional difficulty for QEMU, because internally
the QEMU code is itself assuming that all "unsupported" parameters
implicitly have a value of '1'.

At the libvirt level, we've allowed applications to set 'parameter=1'
when configuring a guest, and pass that through to QEMU.

Deprecating this creates extra difficulty for libvirt because there's no info
exposed from QEMU about which machine types "support" which parameters.
Thus, libvirt can't know whether it is valid to pass 'parameter=1' for
a given machine type, or whether it will trigger deprecation messages.

Since there's no apparent functional benefit to deleting this deprecated
behaviour from QEMU, and it creates problems for consumers of QEMU,
remove this deprecation.
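
The rule argued for above can be sketched in C as follows. This is a
simplified illustration, not the code in hw/core/machine-smp.c: TopoParam
and both helpers are hypothetical names, and the sketch assumes an
explicit value of zero has already been rejected elsewhere.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of one topology level. */
typedef struct {
    bool supported;  /* does the machine model this level? */
    bool has_value;  /* did the user pass parameter=N? */
    unsigned value;  /* the N that was passed, if any */
} TopoParam;

/* Invalid only when an unsupported level is set to a value above 1. */
static bool topo_param_valid(const TopoParam *p)
{
    assert(p != NULL);
    return p->supported || !p->has_value || p->value <= 1;
}

/* Value used internally: omitted levels implicitly count as 1. */
static unsigned topo_param_effective(const TopoParam *p)
{
    assert(p != NULL);
    return (p->has_value && p->value > 0) ? p->value : 1;
}
```

Under this rule 'parameter=1' and an omitted parameter are
indistinguishable to the rest of the code, which is why accepting it
causes no functional difficulty.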

Signed-off-by: Daniel P. Berrangé 
Reviewed-by: Zhao Liu 
Reviewed-by: Ján Tomko 
Message-ID: <20240513123358.612355-2-berra...@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé 
---
 hw/core/machine-smp.c   | 84 -
 tests/unit/test-smp-parse.c |  8 ++--
 2 files changed, 31 insertions(+), 61 deletions(-)

diff --git a/hw/core/machine-smp.c b/hw/core/machine-smp.c
index 2b93fa99c9..5d8d7edcbd 100644
--- a/hw/core/machine-smp.c
+++ b/hw/core/machine-smp.c
@@ -118,76 +118,46 @@ void machine_parse_smp_config(MachineState *ms,
 }

 /*
- * If not supported by the machine, a topology parameter must be
- * omitted.
+ * If not supported by the machine, a topology parameter must
+ * not be set to a value greater than 1.
  */
-if (!mc->smp_props.modules_supported && config->has_modules) {
-if (config->modules > 1) {
-error_setg(errp, "modules not supported by this "
-   "machine's CPU topology");
-return;
-} else {
-/* Here modules only equals 1 since we've checked zero case. */
-warn_report("Deprecated CPU topology (considered invalid): "
-"Unsupported modules parameter mustn't be "
-"specified as 1");
-}
+if (!mc->smp_props.modules_supported &&
+config->has_modules && config->modules > 1) {
+error_setg(errp,
+   "modules > 1 not supported by this machine's CPU topology");
+return;
 }
 modules = modules > 0 ? modules : 1;

-if (!mc->smp_props.clusters_supported && config->has_clusters) {
-if (config->clusters > 1) {
-error_setg(errp, "clusters not supported by this "
-   "machine's CPU topology");
-return;
-} else {
-/* Here clusters only equals 1 since we've checked zero case. */
-warn_report("Deprecated CPU topology (considered invalid): "
-"Unsupported clusters parameter mustn't be "
-"specified as 1");
-}
+if (!mc->smp_props.clusters_supported &&
+config->has_clusters && config->clusters > 1) {
+error_setg(errp,
+   "clusters > 1 not supported by this machine's CPU topology");
+return;
 }
 clusters = clusters > 0 ? clusters : 1;

-if (!mc->smp_props.dies_supported && config->has_dies) {
-if (config->dies > 1) {
-error_setg(errp, "dies not supported by this "
-   "machine's CPU topology");
-return;
-} else {
-/* Here dies only equals 1 since we've checked zero case. */
-warn_report("Deprecated CPU topology

[PULL 1/5] ui/console: Only declare variable fence_fd when CONFIG_GBM is defined

2024-05-17 Thread Philippe Mathieu-Daudé

From: Cédric Le Goater 

This is to avoid a build breakage:

../ui/gtk-egl.c: In function ‘gd_egl_draw’:
../ui/gtk-egl.c:73:9: error: unused variable ‘fence_fd’ [-Werror=unused-variable]
   73 | int fence_fd;
  | ^~~~
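
A minimal C sketch of the pattern behind this warning, and of the fix:
a variable that is only used inside a conditionally compiled region
should be declared inside the same condition. CONFIG_DEMO_FENCE and
demo_draw() are hypothetical stand-ins, not names from ui/gtk-egl.c.

```c
#include <assert.h>

/* CONFIG_DEMO_FENCE is a hypothetical stand-in for CONFIG_GBM. */
static int demo_draw(int have_buffer)
{
#ifdef CONFIG_DEMO_FENCE
    int fence_fd;   /* declared only in builds that actually use it */
#endif
    int drawn = 0;

    assert(have_buffer == 0 || have_buffer == 1);
#ifdef CONFIG_DEMO_FENCE
    fence_fd = have_buffer ? 3 : -1;
    if (fence_fd >= 0) {
        drawn = 1;
    }
#else
    drawn = have_buffer;
#endif
    return drawn;
}
```

Had fence_fd been declared outside the #ifdef, a build without
CONFIG_DEMO_FENCE and with -Werror=unused-variable would be expected to
fail in the same way as shown above.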

Fixes: fa6426805b12 ("ui/console: Use qemu_dmabuf_set_..() helpers instead")
Cc: Dongwon Kim 
Cc: Marc-André Lureau 
Signed-off-by: Cédric Le Goater 
Reviewed-by: Philippe Mathieu-Daudé 
Message-ID: <20240515100520.574383-1-...@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé 
---
 ui/gtk-egl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ui/gtk-egl.c b/ui/gtk-egl.c
index 0473f689c9..9831c10e1b 100644
--- a/ui/gtk-egl.c
+++ b/ui/gtk-egl.c
@@ -68,9 +68,9 @@ void gd_egl_draw(VirtualConsole *vc)
 GdkWindow *window;
 #ifdef CONFIG_GBM
 QemuDmaBuf *dmabuf = vc->gfx.guest_fb.dmabuf;
+int fence_fd;
 #endif
 int ww, wh, ws;
-int fence_fd;
 
 if (!vc->gfx.gls) {
 return;
-- 
2.41.0

Re: [PATCH] hw/intc/s390_flic: Fix crash that occurs when saving the machine state

2024-05-17 Thread Philippe Mathieu-Daudé


On 17/5/24 08:15, Thomas Huth wrote:

adapter_info_so_needed() treats its "opaque" parameter as a S390FLICState,
but the function belongs to a VMStateDescription that is attached to a
TYPE_VIRTIO_CCW_BUS device. This is currently causing a crash when the
user tries to save or migrate the VM state. Fix it by using s390_get_flic()
to get the correct device here instead.

Reported-by: Marc Hartmayer 
Fixes: 9d1b0f5bf5 ("s390_flic: add migration-enabled property")
Signed-off-by: Thomas Huth 
---
  hw/intc/s390_flic.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)


Reviewed-by: Philippe Mathieu-Daudé

Re: [PATCH 1/1] scsi-bus: Remove unused parameter state from scsi_dma_restart_cb

2024-05-17 Thread Philippe Mathieu-Daudé


Hi Ray,

On 17/5/24 09:14, Ray Lee wrote:

Signed-off-by: Ray Lee 
---
  hw/scsi/scsi-bus.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
index 9e40b0c920..7c3df9b31a 100644
--- a/hw/scsi/scsi-bus.c
+++ b/hw/scsi/scsi-bus.c
@@ -255,7 +255,7 @@ static void scsi_dma_restart_req(SCSIRequest *req, void *opaque)
  scsi_req_unref(req);
  }
  
-static void scsi_dma_restart_cb(void *opaque, bool running, RunState state)
+static void scsi_dma_restart_cb(void *opaque, bool running)
  {
  SCSIDevice *s = opaque;
  


scsi_dma_restart_cb() is registered as callback:

 dev->vmsentry = qdev_add_vm_change_state_handler(DEVICE(dev),
  scsi_dma_restart_cb,
  dev);

The function prototype is (see include/sysemu/runstate.h):

  VMChangeStateEntry *
  qdev_add_vm_change_state_handler(DeviceState *dev,
   VMChangeStateHandler *cb,
   void *opaque);

and VMChangeStateHandler is defined as:

  typedef void VMChangeStateHandler(void *opaque,
bool running,
RunState state);

So even if the callback argument is not used, its prototype must
respect the VMChangeStateHandler definition. Thus your patch is
not correct. Indeed when building QEMU with your patch I get:

[152/339] Compiling C object libcommon.fa.p/hw_scsi_scsi-bus.c.o
../../hw/scsi/scsi-bus.c:359:13: error: incompatible function pointer types passing 'void (void *, bool)' to parameter of type 'VMChangeStateHandler *' (aka 'void (*)(void *, bool, enum RunState)') [-Wincompatible-function-pointer-types]
scsi_dma_restart_cb, dev);
^~~
include/sysemu/runstate.h:24:76: note: passing argument to parameter 'cb' here
VMChangeStateHandler *cb,
   ^
1 error generated.

Please test your patch :)
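
For illustration, the prototype constraint can be reproduced in a small
standalone C sketch. The Demo* names are hypothetical stand-ins for the
QEMU types quoted above; the point is only that a handler keeps every
parameter of the callback typedef and ignores the ones it does not need.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for RunState and VMChangeStateHandler. */
typedef enum { DEMO_STATE_PAUSED, DEMO_STATE_RUNNING } DemoRunState;
typedef void DemoChangeStateHandler(void *opaque, bool running,
                                    DemoRunState state);

typedef struct {
    int restarts;
} DemoDevice;

/* Keeps all three parameters so its type matches the typedef. */
static void demo_restart_cb(void *opaque, bool running, DemoRunState state)
{
    DemoDevice *d = opaque;

    (void)state;    /* unused, but required by the handler prototype */
    assert(d != NULL);
    if (running) {
        d->restarts++;
    }
}

/* Hypothetical dispatcher: accepts only handlers of the exact type. */
static void demo_notify(DemoChangeStateHandler *cb, void *opaque,
                        bool running, DemoRunState state)
{
    assert(cb != NULL);
    cb(opaque, running, state);
}
```

Dropping the third parameter from demo_restart_cb() would make the call
to demo_notify() a type mismatch, which is what the compiler reports
above.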

Regards,

Phil.

Re: [PATCH] intel_iommu: Use the latest fault reasons defined by spec

2024-05-17 Thread CLEMENT MATHIEU--DRIF

Hi Zhenzhong

On 17/05/2024 12:23, Zhenzhong Duan wrote:
> From: Yu Zhang 
>
> Currently we use only VTD_FR_PASID_TABLE_INV as fault reason.
> Update with more detailed fault reasons listed in VT-d spec 7.2.3.
>
> Signed-off-by: Yu Zhang 
> Signed-off-by: Zhenzhong Duan 
> ---
>   hw/i386/intel_iommu_internal.h |  8 +++-
>   hw/i386/intel_iommu.c  | 25 -
>   2 files changed, 23 insertions(+), 10 deletions(-)
>
> diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
> index f8cf99bddf..666e2cf2ce 100644
> --- a/hw/i386/intel_iommu_internal.h
> +++ b/hw/i386/intel_iommu_internal.h
> @@ -311,7 +311,13 @@ typedef enum VTDFaultReason {
> * request while disabled */
>   VTD_FR_IR_SID_ERR = 0x26,   /* Invalid Source-ID */
>
> -VTD_FR_PASID_TABLE_INV = 0x58,  /*Invalid PASID table entry */
> +/* PASID directory entry access failure */
> +VTD_FR_PASID_DIR_ACCESS_ERR = 0x50,
> +/* The Present(P) field of pasid directory entry is 0 */
> +VTD_FR_PASID_DIR_ENTRY_P = 0x51,
> +VTD_FR_PASID_TABLE_ACCESS_ERR = 0x58, /* PASID table entry access failure */
> +VTD_FR_PASID_ENTRY_P = 0x59, /* The Present(P) field of pasidt-entry is 0 */
s/pasidt/pasid
> +VTD_FR_PASID_TABLE_ENTRY_INV = 0x5b,  /*Invalid PASID table entry */
>
>   /* Output address in the interrupt address range for scalable mode */
>   VTD_FR_SM_INTERRUPT_ADDR = 0x87,
> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> index cc8e59674e..0951ebb71d 100644
> --- a/hw/i386/intel_iommu.c
> +++ b/hw/i386/intel_iommu.c
> @@ -771,7 +771,7 @@ static int vtd_get_pdire_from_pdir_table(dma_addr_t pasid_dir_base,
>   addr = pasid_dir_base + index * entry_size;
>   if (dma_memory_read(&address_space_memory, addr,
>   pdire, entry_size, MEMTXATTRS_UNSPECIFIED)) {
> -return -VTD_FR_PASID_TABLE_INV;
> +return -VTD_FR_PASID_DIR_ACCESS_ERR;
>   }
>
>   pdire->val = le64_to_cpu(pdire->val);
> @@ -789,6 +789,7 @@ static int vtd_get_pe_in_pasid_leaf_table(IntelIOMMUState *s,
> dma_addr_t addr,
> VTDPASIDEntry *pe)
>   {
> +uint8_t pgtt;
>   uint32_t index;
>   dma_addr_t entry_size;
>   X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s);
> @@ -798,7 +799,7 @@ static int vtd_get_pe_in_pasid_leaf_table(IntelIOMMUState *s,
>   addr = addr + index * entry_size;
>   if (dma_memory_read(&address_space_memory, addr,
>   pe, entry_size, MEMTXATTRS_UNSPECIFIED)) {
> -return -VTD_FR_PASID_TABLE_INV;
> +return -VTD_FR_PASID_TABLE_ACCESS_ERR;
>   }
>   for (size_t i = 0; i < ARRAY_SIZE(pe->val); i++) {
>   pe->val[i] = le64_to_cpu(pe->val[i]);
> @@ -806,11 +807,13 @@ static int vtd_get_pe_in_pasid_leaf_table(IntelIOMMUState *s,
>
>   /* Do translation type check */
>   if (!vtd_pe_type_check(x86_iommu, pe)) {
> -return -VTD_FR_PASID_TABLE_INV;
> +return -VTD_FR_PASID_TABLE_ENTRY_INV;
>   }
>
> -if (!vtd_is_level_supported(s, VTD_PE_GET_LEVEL(pe))) {
> -return -VTD_FR_PASID_TABLE_INV;
> +pgtt = VTD_PE_GET_TYPE(pe);
> +if (pgtt == VTD_SM_PASID_ENTRY_SLT &&
> +!vtd_is_level_supported(s, VTD_PE_GET_LEVEL(pe))) {
> +return -VTD_FR_PASID_TABLE_ENTRY_INV;
>   }
>
>   return 0;
> @@ -851,7 +854,7 @@ static int vtd_get_pe_from_pasid_table(IntelIOMMUState *s,
>   }
>
>   if (!vtd_pdire_present(&pdire)) {
> -return -VTD_FR_PASID_TABLE_INV;
> +return -VTD_FR_PASID_DIR_ENTRY_P;
>   }
>
>   ret = vtd_get_pe_from_pdire(s, pasid, &pdire, pe);
> @@ -860,7 +863,7 @@ static int vtd_get_pe_from_pasid_table(IntelIOMMUState *s,
>   }
>
>   if (!vtd_pe_present(pe)) {
> -return -VTD_FR_PASID_TABLE_INV;
> +return -VTD_FR_PASID_ENTRY_P;
>   }
>
>   return 0;
> @@ -913,7 +916,7 @@ static int vtd_ce_get_pasid_fpd(IntelIOMMUState *s,
>   }
>
>   if (!vtd_pdire_present(&pdire)) {
> -return -VTD_FR_PASID_TABLE_INV;
> +return -VTD_FR_PASID_DIR_ENTRY_P;
>   }
>
>   /*
> @@ -1770,7 +1773,11 @@ static const bool vtd_qualified_faults[] = {
>   [VTD_FR_ROOT_ENTRY_RSVD] = false,
>   [VTD_FR_PAGING_ENTRY_RSVD] = true,
>   [VTD_FR_CONTEXT_ENTRY_TT] = true,
> -[VTD_FR_PASID_TABLE_INV] = false,
> +[VTD_FR_PASID_DIR_ACCESS_ERR] = false,
> +[VTD_FR_PASID_DIR_ENTRY_P] = true,
> +[VTD_FR_PASID_TABLE_ACCESS_ERR] = false,
> +[VTD_FR_PASID_ENTRY_P] = true,
> +[VTD_FR_PASID_TABLE_ENTRY_INV] = true,
>   [VTD_FR_SM_INTERRUPT_ADDR] = true,
>   [VTD_FR_MAX] = false,
>   };
> --
> 2.34.1
>
>
lgtm
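
As a reading aid, the new fault reasons and their qualified/unqualified
flags from the quoted patch can be collected into one standalone C
sketch. The numeric values and flags are copied from the patch; the
DEMO_ prefix, DEMO_FR_MAX and both helper functions are hypothetical and
exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Values copied from the quoted patch; DEMO_ marks this as a sketch. */
enum {
    DEMO_FR_PASID_DIR_ACCESS_ERR   = 0x50,
    DEMO_FR_PASID_DIR_ENTRY_P      = 0x51,
    DEMO_FR_PASID_TABLE_ACCESS_ERR = 0x58,
    DEMO_FR_PASID_ENTRY_P          = 0x59,
    DEMO_FR_PASID_TABLE_ENTRY_INV  = 0x5b,
    DEMO_FR_MAX                    = 0x5c  /* bound for this demo only */
};

/* Designated initializers: unlisted reasons default to false. */
static const bool demo_qualified_faults[DEMO_FR_MAX + 1] = {
    [DEMO_FR_PASID_DIR_ACCESS_ERR]   = false,
    [DEMO_FR_PASID_DIR_ENTRY_P]      = true,
    [DEMO_FR_PASID_TABLE_ACCESS_ERR] = false,
    [DEMO_FR_PASID_ENTRY_P]          = true,
    [DEMO_FR_PASID_TABLE_ENTRY_INV]  = true,
};

static bool demo_is_qualified(int fault)
{
    assert(fault >= 0 && fault <= DEMO_FR_MAX);
    return demo_qualified_faults[fault];
}

/* Lookups return 0 on success or the negated fault reason on failure. */
static int demo_lookup_entry(bool present)
{
    return present ? 0 : -DEMO_FR_PASID_ENTRY_P;
}
```

A caller would negate a non-zero return value to recover the fault
reason before consulting the table.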

Re: [PATCH-for-9.1 v2 2/3] migration: Remove RDMA protocol handling

2024-05-17 Thread Yu Zhang

Hello Michael and Peter,

Exactly, not so compelling, as I did it first only on servers widely
used for production in our data center. The network adapters are

Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720
2-port Gigabit Ethernet PCIe
InfiniBand controller: Mellanox Technologies MT27800 Family [ConnectX-5]

which doesn't meet our purpose. I can choose RDMA or TCP for VM
migration. RDMA traffic is through InfiniBand and TCP through Ethernet
on these two hosts. One is standby while the other is active.

Now I'll try on a server with more recent Ethernet and InfiniBand
network adapters. One of them has:
BCM57414 NetXtreme-E 10Gb/25Gb RDMA Ethernet Controller (rev 01)

The comparison between RDMA and TCP on the same NIC could make more sense.

Best regards,
Yu Zhang @ IONOS Cloud







On Thu, May 16, 2024 at 7:30 PM Michael Galaxy  wrote:
>
> These are very compelling results, no?
>
> (40gbps cards, right? Are the cards active/active? or active/standby?)
>
> - Michael
>
> On 5/14/24 10:19, Yu Zhang wrote:
> > Hello Peter and all,
> >
> > I did a comparison of the VM live-migration speeds between RDMA and
> > TCP/IP on our servers
> > and plotted the results to get an initial impression. Unfortunately,
> > the Ethernet NICs are not the
> > recent ones, therefore, it may not make much sense. I can do it on
> > servers with more recent Ethernet
> > NICs and keep you updated.
> >
> > It seems that the benefits of RDMA become obvious when the VM has
> > a large memory and is running a memory-intensive workload.
> >
> > Best regards,
> > Yu Zhang @ IONOS Cloud
> >
> > On Thu, May 9, 2024 at 4:14 PM Peter Xu  wrote:
> >> On Thu, May 09, 2024 at 04:58:34PM +0800, Zheng Chuan via wrote:
> >>> It's good news to see the socket abstraction for RDMA!
> >>> When I was developing the series above, the biggest pain was that RDMA
> >>> migration has no QIOChannel abstraction, so I needed a 'fake channel'
> >>> for it, which is awkward to implement.
> >>> So, as far as I know, we can do this as follows:
> >>> i. First, evaluate whether rsocket is good enough to satisfy our
> >>> fundamental QIOChannel abstraction.
> >>> ii. If it works, see whether it gives us the opportunity to hide the
> >>> details of the RDMA protocol inside rsocket, by removing most of the
> >>> code in rdma.c and also some hacks in the main migration process.
> >>> iii. Implement advanced features like multi-fd and multi-uri for RDMA
> >>> migration.
> >>>
> >>> Since I am not familiar with rsocket, I need some time to look at it and
> >>> do some quick verification of RDMA migration based on rsocket.
> >>> But, yes, I am willing to be involved in this refactoring work and to
> >>> see if we can make this migration feature better :)
> >> Based on what we have now, it looks like we'd better halt the deprecation
> >> process a bit, so I think we shouldn't need to rush it at least in 9.1
> >> then, and we'll need to see how it goes on the refactoring.
> >>
> >> It'll be perfect if rsocket works, otherwise supporting multifd with little
> >> overhead / exported APIs would also be a good thing in general with
> >> whatever approach.  And obviously all based on the facts that we can get
> >> resources from companies to support this feature first.
> >>
> >> Note that so far nobody has compared rdma vs. nic perf, so if any of
> >> us can provide some test results, please do so.  Many people are
> >> saying RDMA is better, but I haven't yet seen any numbers comparing it
> >> with modern TCP networks.  I don't want to have old impressions floating around
> >> even if things might have changed..  When we have consolidated results, we
> >> should share them out and also reflect that in QEMU's migration docs when a
> >> rdma document page is ready.
> >>
> >> Chuan, please check the whole thread discussion, it may help to understand
> >> what we are looking for on rdma migrations [1].  Meanwhile please feel free
> >> to sync with Jinpu's team and see how to move forward with such a project.
> >>
> >> [1] 
> >> https://lore.kernel.org/qemu-devel/87frwatp7n@suse.de/
> >>
> >> Thanks,
> >>
> >> --
> >> Peter Xu
> >>

[PATCH 1/1] scsi-bus: Remove unused parameter state from scsi_dma_restart_cb

2024-05-17 Thread Ray Lee

Signed-off-by: Ray Lee 
---
 hw/scsi/scsi-bus.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c
index 9e40b0c920..7c3df9b31a 100644
--- a/hw/scsi/scsi-bus.c
+++ b/hw/scsi/scsi-bus.c
@@ -255,7 +255,7 @@ static void scsi_dma_restart_req(SCSIRequest *req, void *opaque)
 scsi_req_unref(req);
 }
 
-static void scsi_dma_restart_cb(void *opaque, bool running, RunState state)
+static void scsi_dma_restart_cb(void *opaque, bool running)
 {
 SCSIDevice *s = opaque;
 
-- 
2.39.3

Re: [PATCH v2 6/8] target/ppc: Move div/mod fixed-point insns (64 bits operands) to decodetree.

2024-05-17 Thread Nicholas Piggin

On Tue Apr 23, 2024 at 4:32 PM AEST, Chinmay Rath wrote:
> Moving the below instructions to decodetree specification :
>
>   divd[u, e, eu][o][.]: XO-form
>   mod{sd, ud} : X-form
>
> With this patch, all the fixed-point arithmetic instructions have been
> moved to decodetree.
> The changes were verified by validating that the tcg ops generated by those
> instructions remain the same, which were captured using the '-d in_asm,op'
> flag.
> Also, renamed do_divwe method in fixedpoint-impl.c.inc to do_dive because it
> is now used to divide doubleword operands as well, and not just words.
>
> Signed-off-by: Chinmay Rath 
> Reviewed-by: Richard Henderson 

[...]

> +static bool do_divd(DisasContext *ctx, arg_XO *a, bool sign)
> +{
> +gen_op_arith_divd(ctx, cpu_gpr[a->rt], cpu_gpr[a->ra], cpu_gpr[a->rb],
> +  sign, a->oe, a->rc);
> +return true;
> +}
> +
> +static bool do_modd(DisasContext *ctx, arg_X *a, bool sign)
> +{
> +REQUIRE_INSNS_FLAGS2(ctx, ISA300);
> +gen_op_arith_modd(ctx, cpu_gpr[a->rt], cpu_gpr[a->ra], cpu_gpr[a->rb],
> +  sign);
> +return true;
> +}
> +
> +TRANS64(DIVD, do_divd, true);
> +TRANS64(DIVDU, do_divd, false);
> +TRANS64(DIVDE, do_dive, gen_helper_DIVDE);
> +TRANS64(DIVDEU, do_dive, gen_helper_DIVDEU);
> +
> +TRANS64(MODSD, do_modd, true);
> +TRANS64(MODUD, do_modd, false);

Sigh. I'm having to fix a bunch of these for 32-bit builds. Just
doing the #ifdef TARGET_PPC64 ... #else qemu_build_not_reached();
thing.

Which is quite ugly and actually prevents using some of these
macros and requires open coding (e.g., because DIVDE helper is
not declared for 32-bit in this case).

Maybe we should move 64-bit only instructions into their own
.decode file and not build them for 32-bit, so we don't have
to add all these dummy translate functions for them.
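
To show the shape of the problem, here is a toy C sketch of a
64-bit-only operation that still needs a dummy definition on 32-bit
builds. DEMO_TARGET_64 and demo_trans_divd() are hypothetical stand-ins;
where QEMU would use qemu_build_not_reached(), this sketch simply
returns false, and the division rules are simplified rather than taken
from the ISA.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* DEMO_TARGET_64 is a hypothetical stand-in for TARGET_PPC64. */
#ifdef DEMO_TARGET_64
static bool demo_trans_divd(int64_t ra, int64_t rb, int64_t *rt)
{
    assert(rt != NULL);
    /* Toy rule: refuse the cases where C division is undefined. */
    if (rb == 0 || (ra == INT64_MIN && rb == -1)) {
        return false;
    }
    *rt = ra / rb;
    return true;
}
#else
/* Dummy that exists only so callers still compile on 32-bit builds. */
static bool demo_trans_divd(int64_t ra, int64_t rb, int64_t *rt)
{
    assert(rt != NULL);
    (void)ra;
    (void)rb;
    return false;
}
#endif
```

Every 64-bit-only instruction needs such a pair, which is the
boilerplate a separate .decode file would avoid.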

For now I'll try to squash in the fixes.

Thanks,
Nick

[PULL 6/6] hw/intc/s390_flic: Fix crash that occurs when saving the machine state

2024-05-17 Thread Thomas Huth

adapter_info_so_needed() treats its "opaque" parameter as a S390FLICState,
but the function belongs to a VMStateDescription that is attached to a
TYPE_VIRTIO_CCW_BUS device. This is currently causing a crash when the
user tries to save or migrate the VM state. Fix it by using s390_get_flic()
to get the correct device here instead.
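
The bug class can be sketched in a few lines of standalone C. All Demo*
names are hypothetical; demo_get_flic() stands in for s390_get_flic(),
and DemoBus plays the role of the object the VMStateDescription is
actually attached to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    bool migration_enabled;
} DemoFlic;

/* The object that is really passed as 'opaque' to the callback. */
typedef struct {
    int bus_id;
} DemoBus;

static DemoFlic demo_flic_singleton = { true };

/* Hypothetical accessor standing in for s390_get_flic(). */
static DemoFlic *demo_get_flic(void)
{
    return &demo_flic_singleton;
}

/* 'opaque' points at a DemoBus here, so it must not be cast to DemoFlic. */
static bool demo_needed(void *opaque)
{
    DemoFlic *fs = demo_get_flic();

    (void)opaque;
    assert(fs != NULL);
    return fs->migration_enabled;
}
```

Because the callback no longer interprets 'opaque' at all, it gives the
same answer regardless of which device the description is attached to.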

Reported-by: Marc Hartmayer 
Fixes: 9d1b0f5bf5 ("s390_flic: add migration-enabled property")
Message-ID: <20240517061553.564529-1-th...@redhat.com>
Reviewed-by: Cédric Le Goater 
Tested-by: Marc Hartmayer 
Signed-off-by: Thomas Huth 
---
 hw/intc/s390_flic.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/intc/s390_flic.c b/hw/intc/s390_flic.c
index 7f93080087..6771645699 100644
--- a/hw/intc/s390_flic.c
+++ b/hw/intc/s390_flic.c
@@ -459,7 +459,7 @@ type_init(qemu_s390_flic_register_types)
 
 static bool adapter_info_so_needed(void *opaque)
 {
-S390FLICState *fs = S390_FLIC_COMMON(opaque);
+S390FLICState *fs = s390_get_flic();
 
 return fs->migration_enabled;
 }
-- 
2.45.0

[PULL 4/6] tests/lcitool/projects/qemu.yml: Sort entries alphabetically again

2024-05-17 Thread Thomas Huth

Let's try to keep the entries in alphabetical order here!
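
A quick way to check such a list is sketched below in C. The ordering in
the diff looks case-insensitive (python3-PyYAML lands between python3-pip
and python3-sphinx), so the sketch assumes that rule; demo_is_sorted() is
a hypothetical helper and not part of lcitool.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <strings.h>   /* strcasecmp (POSIX) */

/* True when names[] is in non-decreasing case-insensitive order. */
static bool demo_is_sorted(const char *const *names, size_t n)
{
    assert(names != NULL || n == 0);
    for (size_t i = 1; i < n; i++) {
        if (strcasecmp(names[i - 1], names[i]) > 0) {
            return false;
        }
    }
    return true;
}
```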

Message-ID: <20240516084059.511463-5-th...@redhat.com>
Reviewed-by: Daniel P. Berrangé 
Signed-off-by: Thomas Huth 
---
 tests/lcitool/projects/qemu.yml | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/tests/lcitool/projects/qemu.yml b/tests/lcitool/projects/qemu.yml
index b63b6bd850..7511ec7ccb 100644
--- a/tests/lcitool/projects/qemu.yml
+++ b/tests/lcitool/projects/qemu.yml
@@ -35,8 +35,8 @@ packages:
  - hostname
  - json-c
  - libaio
- - libattr
  - libasan
+ - libattr
  - libbpf
  - libc-static
  - libcacard
@@ -54,6 +54,7 @@ packages:
  - libjpeg
  - libnfs
  - libnuma
+ - libpipewire-dev
  - libpmem
  - libpng
  - librbd
@@ -73,27 +74,26 @@ packages:
  - llvm
  - lttng-ust
  - lzo
+ - make
+ - mesa-libgbm
+ - meson
  - mtools
+ - ncursesw
  - netcat
  - nettle
  - ninja
  - nsis
- - make
- - mesa-libgbm
- - meson
- - ncursesw
  - pam
  - pcre-static
  - pixman
- - libpipewire-dev
  - pkg-config
  - pulseaudio
  - python3
- - python3-PyYAML
  - python3-numpy
  - python3-opencv
  - python3-pillow
  - python3-pip
+ - python3-PyYAML
  - python3-sphinx
  - python3-sphinx-rtd-theme
  - python3-sqlite3
@@ -121,6 +121,6 @@ packages:
  - which
  - xen
  - xorriso
- - zstdtools
  - zlib
  - zlib-static
+ - zstdtools
-- 
2.45.0
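
The ordering restored by this patch looks case-insensitive: python3-PyYAML now sits between python3-pip and python3-sphinx. A minimal Python sketch of a check for that ordering follows. It assumes the package names are already available as a plain list, and the helper name first_unsorted is hypothetical, not part of lcitool.

```python
def first_unsorted(packages):
    """Return the first adjacent pair that breaks case-insensitive order, or None."""
    for prev, cur in zip(packages, packages[1:]):
        if prev.lower() > cur.lower():
            return (prev, cur)
    return None


# A few entries as they appeared before and after the patch.
before = ["libaio", "libattr", "libasan", "libbpf"]
after = ["python3-pip", "python3-PyYAML", "python3-sphinx"]

print(first_unsorted(before))
print(first_unsorted(after))
```

A check along these lines could run before "make lcitool-refresh" so that ordering drift is caught when a package is added.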

[PULL 2/6] tests/lcitool: Remove 'xfsprogs' from QEMU

2024-05-17 Thread Thomas Huth

From: Philippe Mathieu-Daudé 

QEMU's commit a5730b8bd3 ("block/file-posix: Simplify the
XFS_IOC_DIOINFO handling") removed the need for the 'xfsprogs'
package.

Signed-off-by: Philippe Mathieu-Daudé 
[thuth: Adjusted the patch from the lcitools repo to QEMU's repo]
Message-ID: <20240516084059.511463-3-th...@redhat.com>
Reviewed-by: Daniel P. Berrangé 
Signed-off-by: Thomas Huth 
---
 tests/lcitool/projects/qemu.yml | 1 -
 1 file changed, 1 deletion(-)

diff --git a/tests/lcitool/projects/qemu.yml b/tests/lcitool/projects/qemu.yml
index 149b15de57..9173d1e36e 100644
--- a/tests/lcitool/projects/qemu.yml
+++ b/tests/lcitool/projects/qemu.yml
@@ -121,7 +121,6 @@ packages:
  - vte
  - which
  - xen
- - xfsprogs
  - xorriso
  - zstdtools
  - zlib
-- 
2.45.0

[PULL 5/6] tests/docker/dockerfiles: Update container files with "lcitool-refresh"

2024-05-17 Thread Thomas Huth

Run "make lcitool-refresh" after the previous changes to the
lcitool files. This removes the g++ and xfslibs-dev packages
from the dockerfiles (except for the fedora-win64-cross dockerfile
where we keep the C++ compiler).

Message-ID: <20240516084059.511463-6-th...@redhat.com>
Reviewed-by: Daniel P. Berrangé 
Signed-off-by: Thomas Huth 
---
 tests/docker/dockerfiles/alpine.docker| 4 
 tests/docker/dockerfiles/centos9.docker   | 4 
 tests/docker/dockerfiles/debian-amd64-cross.docker| 4 
 tests/docker/dockerfiles/debian-arm64-cross.docker| 4 
 tests/docker/dockerfiles/debian-armel-cross.docker| 4 
 tests/docker/dockerfiles/debian-armhf-cross.docker| 4 
 tests/docker/dockerfiles/debian-i686-cross.docker | 4 
 tests/docker/dockerfiles/debian-mips64el-cross.docker | 4 
 tests/docker/dockerfiles/debian-mipsel-cross.docker   | 4 
 tests/docker/dockerfiles/debian-ppc64el-cross.docker  | 4 
 tests/docker/dockerfiles/debian-riscv64-cross.docker  | 3 ---
 tests/docker/dockerfiles/debian-s390x-cross.docker| 4 
 tests/docker/dockerfiles/debian.docker| 4 
 tests/docker/dockerfiles/fedora-win64-cross.docker| 2 +-
 tests/docker/dockerfiles/fedora.docker| 4 
 tests/docker/dockerfiles/opensuse-leap.docker | 4 
 tests/docker/dockerfiles/ubuntu2204.docker| 4 
 17 files changed, 1 insertion(+), 64 deletions(-)

diff --git a/tests/docker/dockerfiles/alpine.docker 
b/tests/docker/dockerfiles/alpine.docker
index cd9d7af1ce..554464f31e 100644
--- a/tests/docker/dockerfiles/alpine.docker
+++ b/tests/docker/dockerfiles/alpine.docker
@@ -32,7 +32,6 @@ RUN apk update && \
 findutils \
 flex \
 fuse3-dev \
-g++ \
 gcc \
 gcovr \
 gettext \
@@ -110,7 +109,6 @@ RUN apk update && \
 vte3-dev \
 which \
 xen-dev \
-xfsprogs-dev \
 xorriso \
 zlib-dev \
 zlib-static \
@@ -119,10 +117,8 @@ RUN apk update && \
 rm -f /usr/lib*/python3*/EXTERNALLY-MANAGED && \
 apk list --installed | sort > /packages.txt && \
 mkdir -p /usr/libexec/ccache-wrappers && \
-ln -s /usr/bin/ccache /usr/libexec/ccache-wrappers/c++ && \
 ln -s /usr/bin/ccache /usr/libexec/ccache-wrappers/cc && \
 ln -s /usr/bin/ccache /usr/libexec/ccache-wrappers/clang && \
-ln -s /usr/bin/ccache /usr/libexec/ccache-wrappers/g++ && \
 ln -s /usr/bin/ccache /usr/libexec/ccache-wrappers/gcc
 
 ENV CCACHE_WRAPPERSDIR "/usr/libexec/ccache-wrappers"
diff --git a/tests/docker/dockerfiles/centos9.docker 
b/tests/docker/dockerfiles/centos9.docker
index 6cf47ce786..0256865b9e 100644
--- a/tests/docker/dockerfiles/centos9.docker
+++ b/tests/docker/dockerfiles/centos9.docker
@@ -34,7 +34,6 @@ RUN dnf distro-sync -y && \
 flex \
 fuse3-devel \
 gcc \
-gcc-c++ \
 gettext \
 git \
 glib2-devel \
@@ -115,7 +114,6 @@ RUN dnf distro-sync -y && \
 util-linux \
 vte291-devel \
 which \
-xfsprogs-devel \
 xorriso \
 zlib-devel \
 zlib-static \
@@ -125,10 +123,8 @@ RUN dnf distro-sync -y && \
 rm -f /usr/lib*/python3*/EXTERNALLY-MANAGED && \
 rpm -qa | sort > /packages.txt && \
 mkdir -p /usr/libexec/ccache-wrappers && \
-ln -s /usr/bin/ccache /usr/libexec/ccache-wrappers/c++ && \
 ln -s /usr/bin/ccache /usr/libexec/ccache-wrappers/cc && \
 ln -s /usr/bin/ccache /usr/libexec/ccache-wrappers/clang && \
-ln -s /usr/bin/ccache /usr/libexec/ccache-wrappers/g++ && \
 ln -s /usr/bin/ccache /usr/libexec/ccache-wrappers/gcc
 
 ENV CCACHE_WRAPPERSDIR "/usr/libexec/ccache-wrappers"
diff --git a/tests/docker/dockerfiles/debian-amd64-cross.docker 
b/tests/docker/dockerfiles/debian-amd64-cross.docker
index d0b0e9778e..f8c61d1191 100644
--- a/tests/docker/dockerfiles/debian-amd64-cross.docker
+++ b/tests/docker/dockerfiles/debian-amd64-cross.docker
@@ -79,7 +79,6 @@ RUN export DEBIAN_FRONTEND=noninteractive && \
 eatmydata apt-get dist-upgrade -y && \
 eatmydata apt-get install --no-install-recommends -y dpkg-dev && \
 eatmydata apt-get install --no-install-recommends -y \
-  g++-x86-64-linux-gnu \
   gcc-x86-64-linux-gnu \
   libaio-dev:amd64 \
   libasan6:amd64 \
@@ -149,7 +148,6 @@ RUN export DEBIAN_FRONTEND=noninteractive && \
   libzstd-dev:amd64 \
   nettle-dev:amd64 \
   systemtap-sdt-dev:amd64 \
-  xfslibs-dev:amd64 \
   zlib1g-dev:amd64 && \
 eatmydata apt-get autoremove -y && \
 eatmydata apt-get autoclean -y && \
@@ -167,9 +165,7 @@ cpu = 'x86_64'\n\
 endian = 'little'\n" > /usr/local/share/meson/cross/x86_64-linux-gnu && \
 dpkg-query --showformat

[PULL 1/6] tests/lcitool/refresh: Treat the output of lcitool as text, not as bytes

2024-05-17 Thread Thomas Huth

In case lcitool fails (e.g. with a python backtrace), this makes
the output of lcitool much more readable.

Suggested-by: Daniel P. Berrangé 
Message-ID: <20240516084059.511463-2-th...@redhat.com>
Reviewed-by: Daniel P. Berrangé 
Signed-off-by: Thomas Huth 
---
 tests/lcitool/refresh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tests/lcitool/refresh b/tests/lcitool/refresh
index 24a735a3f2..174818d9c9 100755
--- a/tests/lcitool/refresh
+++ b/tests/lcitool/refresh
@@ -43,12 +43,12 @@ def atomic_write(filename, content):
 
 def generate(filename, cmd, trailer):
     print("Generate %s" % filename)
-    lcitool = subprocess.run(cmd, capture_output=True)
+    lcitool = subprocess.run(cmd, capture_output=True, encoding='utf8')
 
     if lcitool.returncode != 0:
         raise Exception("Failed to generate %s: %s" % (filename, lcitool.stderr))
 
-    content = lcitool.stdout.decode("utf8")
+    content = lcitool.stdout
     if trailer is not None:
         content += trailer
     atomic_write(filename, content)
-- 
2.45.0
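
To illustrate why the error message becomes more readable, here is a small Python sketch comparing the two modes of subprocess.run. The child command is a hypothetical stand-in for lcitool that simply exits with a two-line error; everything else uses only the standard library.

```python
import subprocess
import sys

# Hypothetical stand-in for a failing lcitool run: prints two lines to stderr, exits 1.
cmd = [sys.executable, "-c", "import sys; sys.exit('first line\\nsecond line')"]

as_bytes = subprocess.run(cmd, capture_output=True)
as_text = subprocess.run(cmd, capture_output=True, encoding="utf8")

# Formatting bytes with %s yields the bytes repr, with newlines shown as a literal \n.
bytes_message = "Failed to generate: %s" % as_bytes.stderr
# Formatting str with %s keeps the real line breaks of the child's output.
text_message = "Failed to generate: %s" % as_text.stderr

print(bytes_message)
print(text_message)
```

With encoding set, stdout is already a str as well, which is why the later .decode("utf8") call is dropped in the patch.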

[PULL 0/6] Fix s390x crash and clean up container images

2024-05-17 Thread Thomas Huth

The following changes since commit 85ef20f1673feaa083f4acab8cf054df77b0dbed:

  Merge tag 'pull-maintainer-may24-160524-2' of https://gitlab.com/stsquad/qemu 
into staging (2024-05-16 10:02:56 +0200)

are available in the Git repository at:

  https://gitlab.com/thuth/qemu.git tags/pull-request-2024-05-17

for you to fetch changes up to bebe9603fcb072dcdb7fb22005781b3582a4d701:

  hw/intc/s390_flic: Fix crash that occurs when saving the machine state 
(2024-05-17 11:18:32 +0200)


* Fix s390x crash when doing migration / savevm
* Decrease size of CI containers by removing unnecessary packages


Philippe Mathieu-Daudé (1):
  tests/lcitool: Remove 'xfsprogs' from QEMU

Thomas Huth (5):
  tests/lcitool/refresh: Treat the output of lcitool as text, not as bytes
  tests/lcitool: Remove g++ from the containers (except for the MinGW one)
  tests/lcitool/projects/qemu.yml: Sort entries alphabetically again
  tests/docker/dockerfiles: Update container files with "lcitool-refresh"
  hw/intc/s390_flic: Fix crash that occurs when saving the machine state

 hw/intc/s390_flic.c   |  2 +-
 tests/docker/dockerfiles/alpine.docker|  4 
 tests/docker/dockerfiles/centos9.docker   |  4 
 tests/docker/dockerfiles/debian-amd64-cross.docker|  4 
 tests/docker/dockerfiles/debian-arm64-cross.docker|  4 
 tests/docker/dockerfiles/debian-armel-cross.docker|  4 
 tests/docker/dockerfiles/debian-armhf-cross.docker|  4 
 tests/docker/dockerfiles/debian-i686-cross.docker |  4 
 tests/docker/dockerfiles/debian-mips64el-cross.docker |  4 
 tests/docker/dockerfiles/debian-mipsel-cross.docker   |  4 
 tests/docker/dockerfiles/debian-ppc64el-cross.docker  |  4 
 tests/docker/dockerfiles/debian-riscv64-cross.docker  |  3 ---
 tests/docker/dockerfiles/debian-s390x-cross.docker|  4 
 tests/docker/dockerfiles/debian.docker|  4 
 tests/docker/dockerfiles/fedora-win64-cross.docker|  2 +-
 tests/docker/dockerfiles/fedora.docker|  4 
 tests/docker/dockerfiles/opensuse-leap.docker |  4 
 tests/docker/dockerfiles/ubuntu2204.docker|  4 
 tests/lcitool/projects/qemu-minimal.yml   |  1 -
 tests/lcitool/projects/qemu-win-installer.yml |  4 
 tests/lcitool/projects/qemu.yml   | 18 --
 tests/lcitool/refresh |  5 +++--
 22 files changed, 17 insertions(+), 78 deletions(-)
 create mode 100644 tests/lcitool/projects/qemu-win-installer.yml

[PULL 3/6] tests/lcitool: Remove g++ from the containers (except for the MinGW one)

2024-05-17 Thread Thomas Huth

We don't need C++ for the normal QEMU builds anymore, so installing
g++ in each and every container seems to be a waste of time and disk
space. The only container that still needs it is the Fedora MinGW
container that builds the only remaining C++ code in ./qga/vss-win32/
and we can install it there with an extra project yml file instead.

Message-ID: <20240516084059.511463-4-th...@redhat.com>
Reviewed-by: Daniel P. Berrangé 
Signed-off-by: Thomas Huth 
---
 tests/lcitool/projects/qemu-minimal.yml   | 1 -
 tests/lcitool/projects/qemu-win-installer.yml | 4 
 tests/lcitool/projects/qemu.yml   | 1 -
 tests/lcitool/refresh | 1 +
 4 files changed, 5 insertions(+), 2 deletions(-)
 create mode 100644 tests/lcitool/projects/qemu-win-installer.yml

diff --git a/tests/lcitool/projects/qemu-minimal.yml 
b/tests/lcitool/projects/qemu-minimal.yml
index d44737dc1d..6bc232a1c3 100644
--- a/tests/lcitool/projects/qemu-minimal.yml
+++ b/tests/lcitool/projects/qemu-minimal.yml
@@ -7,7 +7,6 @@ packages:
  - ccache
  - findutils
  - flex
- - g++
  - gcc
  - gcc-native
  - glib2
diff --git a/tests/lcitool/projects/qemu-win-installer.yml 
b/tests/lcitool/projects/qemu-win-installer.yml
new file mode 100644
index 00..86aa22297c
--- /dev/null
+++ b/tests/lcitool/projects/qemu-win-installer.yml
@@ -0,0 +1,4 @@
+# Additional packages that are required to build the code in qga/vss-win32/
+---
+packages:
+ - g++
diff --git a/tests/lcitool/projects/qemu.yml b/tests/lcitool/projects/qemu.yml
index 9173d1e36e..b63b6bd850 100644
--- a/tests/lcitool/projects/qemu.yml
+++ b/tests/lcitool/projects/qemu.yml
@@ -22,7 +22,6 @@ packages:
  - findutils
  - flex
  - fuse3
- - g++
  - gcc
  - gcc-native
  - gcovr
diff --git a/tests/lcitool/refresh b/tests/lcitool/refresh
index 174818d9c9..789acefb75 100755
--- a/tests/lcitool/refresh
+++ b/tests/lcitool/refresh
@@ -192,6 +192,7 @@ try:
 "s390x-softmmu,s390x-linux-user"))
 
 generate_dockerfile("fedora-win64-cross", "fedora-38",
+project='qemu,qemu-win-installer',
 cross="mingw64",
 trailer=cross_build("x86_64-w64-mingw32-",
 "x86_64-softmmu"))
-- 
2.45.0

Re: [PATCH v7 00/12] Enabling DCD emulation support in Qemu

2024-05-17 Thread Jonathan Cameron via

On Thu, 16 May 2024 10:05:33 -0700
fan  wrote:

> On Fri, Apr 19, 2024 at 02:24:36PM -0400, Gregory Price wrote:
> > On Thu, Apr 18, 2024 at 04:10:51PM -0700, nifan@gmail.com wrote:  
> > > A git tree of this series can be found here (with one extra commit on top
> > > for printing out accepted/pending extent list): 
> > > https://github.com/moking/qemu/tree/dcd-v7
> > > 
> > > v6->v7:
> > > 
> > > 1. Fixed the dvsec range register issue mentioned in the cover letter in v6.
> > >    Only relevant bits are set to mark the device ready (Patch 6). (Jonathan)
> > > 2. Moved the if statement in cxl_setup_memory from Patch 6 to Patch 4. (Jonathan)
> > > 3. Used MIN instead of if statement to get record_count in Patch 7. (Jonathan)
> > > 4. Added "Reviewed-by" tag to Patch 7.
> > > 5. Modified cxl_dc_extent_release_dry_run so the updated extent list can be
> > >    reused in cmd_dcd_release_dyn_cap to simplify the process in Patch 8. (Jørgen)
> > > 6. Added comments to indicate further "TODO" items in cmd_dcd_add_dyn_cap_rsp.
> > >    (Jonathan)
> > > 7. Avoided irrelevant code reformat in Patch 8. (Jonathan)
> > > 8. Modified QMP interfaces for adding/releasing DC extents to allow passing
> > >    tags, selection policy, flags in the interface. (Jonathan, Gregory)
> > > 9. Redesigned the pending list so extents in the same requests are grouped
> > >    together. A new data structure is introduced to represent "extent group"
> > >    in pending list. (Jonathan)
> > > 10. Added support in QMP interface for "More" flag.
> > > 11. Check "Forced removal" flag for release request and not let it pass through.
> > > 12. Removed the dynamic capacity log type from CxlEventLog definition in cxl.json
> > >     to avoid the side effect it may introduce to inject error to DC event log.
> > >     (Jonathan)
> > > 13. Hard coded the event log type to dynamic capacity event log in QMP
> > >     interfaces. (Jonathan)
> > > 14. Adding space in between "-1]". (Jonathan)
> > > 15. Some minor comment fixes.
> > > 
> > > The code is tested with similar setup and has passed similar tests as listed
> > > in the cover letter of v5[1] and v6[2].
> > > Also, the code is tested with the latest DCD kernel patchset[3].
> > > 
> > > [1] Qemu DCD patchset v5: 
> > > https://lore.kernel.org/linux-cxl/20240304194331.1586191-1-nifan@gmail.com/T/#t
> > > [2] Qemu DCD patchset v6: 
> > > https://lore.kernel.org/linux-cxl/20240325190339.696686-1-nifan@gmail.com/T/#t
> > > [3] DCD kernel patches: 
> > > https://lore.kernel.org/linux-cxl/20240324-dcd-type2-upstream-v1-0-b7b00d623...@intel.com/T/#m11c571e21c4fe17c7d04ec5c2c7bc7cbf2cd07e3
> > >  
> > 
> > added review to all patches, will hopefully be able to add a Tested-by
> > tag early next week, along with a v1 RFC for MHD bit-tracking.
> > 
> > We've been testing v5/v6 for a bit, so I expect as soon as we get the
> > MHD code ported over to v7 I'll ship a tested-by tag pretty quick.
> > 
> > The super-set release will complicate a few things but this doesn't
> > look like a blocker on our end, just a change to how we track bits in a
> > shared bit/bytemap.
> >   
> 
> Hi Gregory,
> I am planning to address all the concerns in this series and send out v8
> next week. Jonathan mentioned you have a few related patches built on top
> of this series, can you point me to the latest version so I can look
> into it? Also, would you like me to carry them over to send together
> with my series in next version? It could be easier for you to avoid the
> potential rebase needed for your patches?

I wasn't clear - I meant the other way around.
This series is built on a couple of Gregory's patches.  Gregory can suffer
the pain of rebasing his stuff ;) (or I'll do it depending on when things
land).

hw/cxl/mailbox: change CCI cmd set structure to be a member, not a reference 
https://gitlab.com/jic23/qemu/-/commit/f44ebc5a455ccdd6535879b0c5824e0d76b04da5
hw/cxl/mailbox: interface to add CCI commands to an existing CCI 
https://gitlab.com/jic23/qemu/-/commit/00a4dd8b388add03c588298f665ee918626296a5

I was suggesting your next posting should just include those two with
your sign-off added. That way if everyone is happy with v8 Michael Tsirkin
can pick it up directly, saving a step.

Make sure to add Michael to the To: list as well for the next version.

Thanks,

Jonathan

> 
> Let me know.
> 
> Thanks,
> Fan
> 
> > > 
> > > Fan Ni (12):
> > >   hw/cxl/cxl-mailbox-utils: Add dc_event_log_size field to output
> > > payload of identify memory device command
> > >   hw/cxl/cxl-mailbox-utils: Add dynamic capacity region representative
> > > and mailbox command support
> > >   include/hw/cxl/cxl_device: Rename mem_size as static_mem_size for
> > > type3 memory devices
> > >   hw/mem/cxl_type3: Add support to create DC regions to type3 memory
> > > devices
> > >   hw/mem/cxl-type3: Refactor

Re: [PATCH ats_vtd v2 21/25] atc: generic ATC that can be used by PCIe devices that support SVM

2024-05-17 Thread CLEMENT MATHIEU--DRIF


On 17/05/2024 12:44, Duan, Zhenzhong wrote:
>> -Original Message-
>> From: CLEMENT MATHIEU--DRIF 
>> Subject: [PATCH ats_vtd v2 21/25] atc: generic ATC that can be used by PCIe
>> devices that support SVM
>>
>> As the SVM-capable devices will need to cache translations, we provide
>> a first implementation.
>>
>> This cache uses a two-level design based on hash tables.
>> The first level is indexed by a PASID and the second by a virtual address.
>>
>> Signed-off-by: Clément Mathieu--Drif 
>> ---
>> tests/unit/meson.build |   1 +
>> tests/unit/test-atc.c  | 502
>> +
>> util/atc.c | 211 +
>> util/atc.h | 117 ++
>> util/meson.build   |   1 +
>> 5 files changed, 832 insertions(+)
>> create mode 100644 tests/unit/test-atc.c
>> create mode 100644 util/atc.c
>> create mode 100644 util/atc.h
> Maybe the unit test can be split from functional change?
will do!
>> diff --git a/tests/unit/meson.build b/tests/unit/meson.build
>> index 228a21d03c..5c9a6fe9f4 100644
>> --- a/tests/unit/meson.build
>> +++ b/tests/unit/meson.build
>> @@ -52,6 +52,7 @@ tests = {
>>'test-interval-tree': [],
>>'test-xs-node': [qom],
>>'test-virtio-dmabuf': [meson.project_source_root() / 'hw/display/virtio-dmabuf.c'],
>> +  'test-atc': []
>> }
>>
>> if have_system or have_tools
>> diff --git a/tests/unit/test-atc.c b/tests/unit/test-atc.c
>> new file mode 100644
>> index 00..60fa60924a
>> --- /dev/null
>> +++ b/tests/unit/test-atc.c
>> @@ -0,0 +1,502 @@
>> +/*
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License as published by
>> + * the Free Software Foundation; either version 2 of the License, or
>> + * (at your option) any later version.
>> +
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> +
>> + * You should have received a copy of the GNU General Public License along
>> + * with this program; if not, see .
>> + */
>> +
>> +#include "util/atc.h"
>> +
>> +static inline bool tlb_entry_equal(IOMMUTLBEntry *e1, IOMMUTLBEntry *e2)
>> +{
>> +if (!e1 || !e2) {
>> +return !e1 && !e2;
>> +}
>> +return e1->iova == e2->iova &&
>> +e1->addr_mask == e2->addr_mask &&
>> +e1->pasid == e2->pasid &&
>> +e1->perm == e2->perm &&
>> +e1->target_as == e2->target_as &&
>> +e1->translated_addr == e2->translated_addr;
>> +}
>> +
>> +static void assert_lookup_equals(ATC *atc, IOMMUTLBEntry *target,
>> + uint32_t pasid, hwaddr iova)
>> +{
>> +IOMMUTLBEntry *result;
>> +result = atc_lookup(atc, pasid, iova);
>> +g_assert(tlb_entry_equal(result, target));
>> +}
>> +
>> +static void check_creation(uint64_t page_size, uint8_t address_width,
>> +   uint8_t levels, uint8_t level_offset,
>> +   bool should_work) {
>> +ATC *atc = atc_new(page_size, address_width);
>> +if (atc) {
>> +if (atc->levels != levels || atc->level_offset != level_offset) {
>> +g_assert(false); /* ATC created but invalid configuration : fail */
>> +}
>> +atc_destroy(atc);
>> +g_assert(should_work);
>> +} else {
>> +g_assert(!should_work);
>> +}
>> +}
>> +
>> +static void test_creation_parameters(void)
>> +{
>> +check_creation(8, 39, 3, 9, false);
>> +check_creation(4095, 39, 3, 9, false);
>> +check_creation(4097, 39, 3, 9, false);
>> +check_creation(8192, 48, 0, 0, false);
>> +
>> +check_creation(4096, 38, 0, 0, false);
>> +check_creation(4096, 39, 3, 9, true);
>> +check_creation(4096, 40, 0, 0, false);
>> +check_creation(4096, 47, 0, 0, false);
>> +check_creation(4096, 48, 4, 9, true);
>> +check_creation(4096, 49, 0, 0, false);
>> +check_creation(4096, 56, 0, 0, false);
>> +check_creation(4096, 57, 5, 9, true);
>> +check_creation(4096, 58, 0, 0, false);
>> +
>> +check_creation(16384, 35, 0, 0, false);
>> +check_creation(16384, 36, 2, 11, true);
>> +check_creation(16384, 37, 0, 0, false);
>> +check_creation(16384, 46, 0, 0, false);
>> +check_creation(16384, 47, 3, 11, true);
>> +check_creation(16384, 48, 0, 0, false);
>> +check_creation(16384, 57, 0, 0, false);
>> +check_creation(16384, 58, 4, 11, true);
>> +check_creation(16384, 59, 0, 0, false);
>> +}
>> +
>> +static void test_single_entry(void)
>> +{
>> +IOMMUTLBEntry entry = {
>> +.iova =
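
The expectations in the quoted test_creation_parameters() are consistent with a simple rule: each level is indexed by log2(page_size / 8) bits, and the address bits above the page offset must split into a whole number of such levels. The Python sketch below reproduces the table under that assumption. The rule is inferred from the test values only, not taken from util/atc.c (which is not quoted here), and atc_geometry is a hypothetical name.

```python
def atc_geometry(page_size, address_width):
    """Return (levels, level_offset), or None if the combination is rejected.

    Assumed rule, inferred from the test table rather than from util/atc.c:
    a level is indexed by log2(page_size / 8) bits (8-byte entries assumed),
    and the bits above the page offset must divide evenly into levels.
    """
    if page_size < 16 or page_size & (page_size - 1):
        return None  # not a power of two, or too small to index anything
    page_bits = page_size.bit_length() - 1
    level_offset = page_bits - 3  # log2(page_size / 8)
    levels, remainder = divmod(address_width - page_bits, level_offset)
    if remainder or levels < 1:
        return None
    return (levels, level_offset)


# A few rows from the quoted test: (page_size, address_width).
for page_size, width in [(4096, 39), (4096, 48), (4096, 57), (16384, 36), (8192, 48)]:
    print(page_size, width, atc_geometry(page_size, width))
```

For example, 4096-byte pages give 12 offset bits and 9 bits per level, so widths of 39, 48 and 57 correspond to 3, 4 and 5 levels, while 8192-byte pages with a 48-bit width leave 35 bits that do not divide into 10-bit levels.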

Re: [PATCH ats_vtd v2 20/25] intel_iommu: fill the PASID field when creating an instance of IOMMUTLBEntry

2024-05-17 Thread CLEMENT MATHIEU--DRIF


On 17/05/2024 12:40, Duan, Zhenzhong wrote:
>> -Original Message-
>> From: CLEMENT MATHIEU--DRIF 
>> Subject: [PATCH ats_vtd v2 20/25] intel_iommu: fill the PASID field when
>> creating an instance of IOMMUTLBEntry
>>
>> Signed-off-by: Clément Mathieu--Drif 
>> ---
>> hw/i386/intel_iommu.c | 7 +++
>> 1 file changed, 7 insertions(+)
>>
>> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
>> index 53f17d66c0..c4ebd4569e 100644
>> --- a/hw/i386/intel_iommu.c
>> +++ b/hw/i386/intel_iommu.c
>> @@ -2299,6 +2299,7 @@ out:
>>  entry->translated_addr = vtd_get_slpte_addr(pte, s->aw_bits) &
>> page_mask;
>>  entry->addr_mask = ~page_mask;
>>  entry->perm = access_flags;
>> +entry->pasid = pasid;
> For PCI_NO_PASID, do we want to assign PCI_NO_PASID or rid2pasid?
We have the following statement a few lines above:
if (rid2pasid) {
     pasid = VTD_CE_GET_RID2PASID();
}

so we store rid2pasid if the feature is enabled.

But maybe we should store PCI_NO_PASID because the rest of the world is 
not supposed to be aware of what we are doing with rid2pasid.

Does it look good to you?
>
> Thanks
> Zhenzhong
>
>>  return true;
>>
>> error:
>> @@ -2307,6 +2308,7 @@ error:
>>  entry->translated_addr = 0;
>>  entry->addr_mask = 0;
>>  entry->perm = IOMMU_NONE;
>> +entry->pasid = PCI_NO_PASID;
>>  return false;
>> }
>>
>> @@ -3497,6 +3499,7 @@ static void
>> vtd_piotlb_pasid_invalidate_notify(IntelIOMMUState *s,
>>  event.entry.target_as = &address_space_memory;
>>  event.entry.iova = notifier->start;
>>  event.entry.perm = IOMMU_NONE;
>> +event.entry.pasid = pasid;
>>  event.entry.addr_mask = notifier->end - notifier->start;
>>  event.entry.translated_addr = 0;
>>
>> @@ -3678,6 +3681,7 @@ static void
>> vtd_piotlb_page_invalidate(IntelIOMMUState *s, uint16_t domain_id,
>>  event.entry.target_as = &address_space_memory;
>>  event.entry.iova = addr;
>>  event.entry.perm = IOMMU_NONE;
>> +event.entry.pasid = pasid;
>>  event.entry.addr_mask = size - 1;
>>  event.entry.translated_addr = 0;
>>
>> @@ -4335,6 +4339,7 @@ static void
>> do_invalidate_device_tlb(VTDAddressSpace *vtd_dev_as,
>>  event.entry.iova = addr;
>>  event.entry.perm = IOMMU_NONE;
>>  event.entry.translated_addr = 0;
>> +event.entry.pasid = vtd_dev_as->pasid;
>>  memory_region_notify_iommu(&vtd_dev_as->iommu, 0, event);
>> }
>>
>> @@ -4911,6 +4916,7 @@ static IOMMUTLBEntry
>> vtd_iommu_translate(IOMMUMemoryRegion *iommu, hwaddr addr,
>>  IOMMUTLBEntry iotlb = {
>>  /* We'll fill in the rest later. */
>>  .target_as = &address_space_memory,
>> +.pasid = vtd_as->pasid,
>>  };
>>  bool success;
>>
>> @@ -4923,6 +4929,7 @@ static IOMMUTLBEntry
>> vtd_iommu_translate(IOMMUMemoryRegion *iommu, hwaddr addr,
>>  iotlb.translated_addr = addr & VTD_PAGE_MASK_4K;
>>  iotlb.addr_mask = ~VTD_PAGE_MASK_4K;
>>  iotlb.perm = IOMMU_RW;
>> +iotlb.pasid = PCI_NO_PASID;
>>  success = true;
>>  }
>>
>> --
>> 2.44.0

Re: [PATCH v2 1/3] docs: introduce dedicated page about code provenance / sign-off

2024-05-17 Thread Daniel P . Berrangé

On Thu, May 16, 2024 at 01:33:01PM -0400, Michael S. Tsirkin wrote:
> On Thu, May 16, 2024 at 05:22:28PM +0100, Daniel P. Berrangé wrote:
> > Currently we have a short paragraph saying that patches must include
> > a Signed-off-by line, and merely link to the kernel documentation.
> > The linked kernel docs have a lot of content beyond the part about
> > sign-off and thus are misleading/distracting to QEMU contributors.
> > 
> > This introduces a dedicated 'code-provenance' page in QEMU talking
> > about why we require sign-off, explaining the other tags we commonly
> > use, and what to do in some edge cases.
> > 
> > Signed-off-by: Daniel P. Berrangé 
> > ---
> >  docs/devel/code-provenance.rst| 212 ++
> >  docs/devel/index-process.rst  |   1 +
> >  docs/devel/submitting-a-patch.rst |  19 +--
> >  3 files changed, 215 insertions(+), 17 deletions(-)
> >  create mode 100644 docs/devel/code-provenance.rst
> > 
> > diff --git a/docs/devel/code-provenance.rst b/docs/devel/code-provenance.rst
> > new file mode 100644
> > index 00..7c42fae571
> > --- /dev/null
> > +++ b/docs/devel/code-provenance.rst
> > @@ -0,0 +1,212 @@
> > +.. _code-provenance:
> > +
> > +Code provenance
> > +===
> > +
> > +Certifying patch submissions
> > +
> > +
> > +The QEMU community **mandates** all contributors to certify provenance of
> > +patch submissions they make to the project. To put it another way,
> > +contributors must indicate that they are legally permitted to contribute to
> > +the project.
> > +
> > > +Certification is achieved with a low overhead by adding a single line to the
> > +bottom of every git commit::
> > +
> > +   Signed-off-by: YOUR NAME 
> > +
> > +The addition of this line asserts that the author of the patch is 
> > contributing
> > +in accordance with the clauses specified in the
> > +`Developer's Certificate of Origin `__:
> 
> Why are you linking to this one?

The kernel doesn't have a standalone copy of the text; it is just
inline in the middle of their huge SubmittingPatches document.
We don't want to mislead people into thinking we're following
the kernel's patch submission rules in general, but instead to
define our own clear policy.


> It's slightly different from kernel, with copyright and prohibition to change it.

That difference is not of any consequence. The prohibition
against changing makes sense, to protect the value of the
"Developer Certificate of Origin" term to have a fixed
meaning.

The 4 clauses that you must certify against are all identical
to the kernel's copy, which is what matters.

> there's also a bit more text in the kernel, e.g. the rule against
> anonymous contributions.

Yes, we should clarify our intent in this respect, per the other
part of this thread around what we interpret "real name" to
mean for QEMU.


> > diff --git a/docs/devel/submitting-a-patch.rst 
> > b/docs/devel/submitting-a-patch.rst
> > index 83e9092b8c..2cc4d53ff6 100644
> > --- a/docs/devel/submitting-a-patch.rst
> > +++ b/docs/devel/submitting-a-patch.rst
> > @@ -322,23 +322,8 @@ Patch emails must include a ``Signed-off-by:`` line
> >  
> >  Your patches **must** include a Signed-off-by: line. This is a hard
> >  requirement because it's how you say "I'm legally okay to contribute
> > -this and happy for it to go into QEMU". The process is modelled after
> > -the `Linux kernel
> > -`__
> > -policy.
> > -
> > -If you wrote the patch, make sure your "From:" and "Signed-off-by:"
> > -lines use the same spelling. It's okay if you subscribe or contribute to
> > -the list via more than one address, but using multiple addresses in one
> > -commit just confuses things.
> 
> 
> I gather you no longer see value in discussing this use-case?
> Maybe mention in commit log, why.

I should have preserved this phrase in the new doc.
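
For illustration, the check that paragraph asks contributors to do by eye can be sketched in a few lines of Python. This is only a sketch, not an existing QEMU tool; the function names and the example addresses are hypothetical.

```python
from email.utils import parseaddr


def signoffs(commit_message):
    """Collect (name, address) pairs from all Signed-off-by: lines."""
    found = []
    for line in commit_message.splitlines():
        if line.startswith("Signed-off-by:"):
            found.append(parseaddr(line.split(":", 1)[1].strip()))
    return found


def author_has_matching_signoff(from_header, commit_message):
    """True if the From: name and address also appear, spelled identically, as a sign-off."""
    return parseaddr(from_header.strip()) in signoffs(commit_message)


message = "docs: fix typo\n\nSigned-off-by: Jane Doe <jane@example.org>\n"
print(author_has_matching_signoff("Jane Doe <jane@example.org>", message))
print(author_has_matching_signoff("Jane Doe <jane@work.example>", message))
```

The comparison is deliberately strict, since the point of the original wording was that the two lines use the same spelling.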

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [PATCH v2 3/3] docs: define policy forbidding use of AI code generators

2024-05-17 Thread Daniel P . Berrangé

On Thu, May 16, 2024 at 01:11:26PM -0400, Michael S. Tsirkin wrote:
> On Thu, May 16, 2024 at 05:22:30PM +0100, Daniel P. Berrangé wrote:
> > There has been an explosion of interest in so called AI code generators
> > in the past year or two. Thus far though, this has not been matched
> > by a broadly accepted legal interpretation of the licensing implications
> > for code generator outputs. While the vendors may claim there is no
> > problem and a free choice of license is possible, they have an inherent
> > conflict of interest in promoting this interpretation. More broadly
> > there is, as yet, no broad consensus on the licensing implications of
> > code generators trained on inputs under a wide variety of licenses.
> > 
> > The DCO requires contributors to assert they have the right to
> > contribute under the designated project license. Given the lack of
> > consensus on the licensing of AI code generator output, it is not
> > considered credible to assert compliance with the DCO clause (b) or (c)
> > where a patch includes such generated code.
> > 
> > This patch thus defines a policy that the QEMU project will currently
> > not accept contributions where use of AI code generators is either
> > known, or suspected.
> > 
> > This merely reflects the current uncertainty of the field, and should
> > this situation change, the policy is of course subject to future
> > relaxation. Meanwhile requests for exceptions can also be considered on
> > a case by case basis.
> > 
> > Signed-off-by: Daniel P. Berrangé 
> > ---
> >  docs/devel/code-provenance.rst | 50 +-
> >  1 file changed, 49 insertions(+), 1 deletion(-)
> > 
> > diff --git a/docs/devel/code-provenance.rst b/docs/devel/code-provenance.rst
> > index eabb3e7c08..846dda9a35 100644
> > --- a/docs/devel/code-provenance.rst
> > +++ b/docs/devel/code-provenance.rst
> > @@ -264,4 +264,52 @@ boilerplate code template which is then filled in to 
> > produce the final patch.
> >  The output of such a tool would still be considered the "preferred format",
> >  since it is intended to be a foundation for further human authored changes.
> >  Such tools are acceptable to use, provided they follow a deterministic 
> > process
> > -and there is clearly defined copyright and licensing for their output.
> > +and there is clearly defined copyright and licensing for their output. Note
> > +in particular the caveats applying to AI code generators below.
> > +
> > +Use of AI code generators
> > +~
> > +
> > +TL;DR:
> > +
> > +  **Current QEMU project policy is to DECLINE any contributions which are
> > +  believed to include or derive from AI generated code. This includes 
> > ChatGPT,
> > +  CoPilot, Llama and similar tools**
> > +
> > +The increasing prevalence of AI code generators, most notably but not 
> > limited
> > +to, `Large Language Models 
> > `__
> > +(LLMs) results in a number of difficult legal questions and risks for 
> > software
> > +projects, including QEMU.
> > +
> > > +The QEMU community requires that contributors certify their patch submissions
> > +are made in accordance with the rules of the :ref:`dco` (DCO).
> > +
> > +To satisfy the DCO, the patch contributor has to fully understand the
> > +copyright and license status of code they are contributing to QEMU. With AI
> > > +code generators, the copyright and license status of the output is ill-defined
> > +with no generally accepted, settled legal foundation.
> > +
> > +Where the training material is known, it is common for it to include large
> > +volumes of material under restrictive licensing/copyright terms. Even where
> > +the training material is all known to be under open source licenses, it is
> > +likely to be under a variety of terms, not all of which will be compatible
> > +with QEMU's licensing requirements.
> > +
> > > +With this in mind, the QEMU project does not consider it is currently possible
> > > +for contributors to comply with DCO terms (b) or (c) for the output of commonly
> > > +available AI code generators.
> > +
> > > +The QEMU maintainers thus require that contributors refrain from using AI code
> > +generators on patches intended to be submitted to the project, and will
> > +decline any contribution if use of AI is either known or suspected.
> > +
> > > +Examples of tools impacted by this policy include GitHub's CoPilot,
> > +OpenAI's ChatGPT, and Meta's Code Llama, amongst many others which are less
> > +well known.
> > +
> > > +This policy may evolve as the legal situation is clarified. In the meanwhile,
> > > +requests for exceptions to this policy will be evaluated by the QEMU project
> > > +on a case by case basis. To be granted an exception, a contributor will need
> > > +to demonstrate clarity of the license and copyright status for the tool's
> > > +output in relation to its training model and code, to the satisfaction of the

Re: [PATCH v2 2/3] docs: define policy limiting the inclusion of generated files

2024-05-17 Thread Daniel P . Berrangé

On Thu, May 16, 2024 at 01:04:42PM -0400, Michael S. Tsirkin wrote:
> On Thu, May 16, 2024 at 05:22:29PM +0100, Daniel P. Berrangé wrote:
> > Files contributed to QEMU are generally expected to be provided in the
> > preferred format for manipulation. IOW, we generally don't expect to
> > have generated / compiled code included in the tree, rather, we expect
> > to run the code generator / compiler as part of the build process.
> > 
> > There are some obvious exceptions to this seen in our existing tree, the
> > biggest one being the inclusion of many binary firmware ROMs. A more
> > niche example is the inclusion of a generated eBPF program. Or the CI
> > dockerfiles which are mostly auto-generated. In these cases, however,
> > the preferred format source code is still required to be included,
> > alongside the generated output.
> > 
> > Tools which perform user defined algorithmic transformations on code are
> > not considered to be "code generators". ie, we permit use of coccinelle,
> > spell checkers, and sed/awk/etc to manipulate code. Such use of automated
> > manipulation should still be declared in the commit message.
> > 
> > One off generators which create a boilerplate file which the author then
> > fills in, are acceptable if their output has clear copyright and license
> > status. This could be where a contributor writes a throwaway python
> > script to automate creation of some mundane piece of code for example.
> > 
> > Signed-off-by: Daniel P. Berrangé 
> > ---
> >  docs/devel/code-provenance.rst | 55 ++
> >  1 file changed, 55 insertions(+)
> > 
> > diff --git a/docs/devel/code-provenance.rst b/docs/devel/code-provenance.rst
> > index 7c42fae571..eabb3e7c08 100644
> > --- a/docs/devel/code-provenance.rst
> > +++ b/docs/devel/code-provenance.rst
> > @@ -210,3 +210,58 @@ mailing list.
> > >  It is also recommended to attempt to contact the original author to let them
> > >  know you are interested in taking over their work, in case they still intended
> > >  to return to the work, or had any suggestions about the best way to continue.
> > +
> > +Inclusion of generated files
> > +
> > +
> > +Files in patches contributed to QEMU are generally expected to be provided
> > +only in the preferred format for making modifications. The implication of
> > +this is that the output of code generators or compilers is usually not
> > +appropriate to contribute to QEMU.
> > +
> > +For reasons of practicality there are some exceptions to this rule, where
> > +generated code is permitted, provided it is also accompanied by the
> > +corresponding preferred source format. This is done where it is impractical
> > +to expect those building QEMU to run the code generation or compilation
> > > +process. A non-exhaustive list of examples is:
> > > +
> > > + * Images: where a bitmap image is created from a vector file it is common
> > +   to include the rendered bitmaps at desired resolution(s), since subtle
> > +   changes in the rasterization process / tools may affect quality. The
> > +   original vector file is expected to accompany any generated bitmaps.
> > +
> > + * Firmware: QEMU includes pre-compiled binary ROMs for a variety of guest
> > > +   firmwares. When such binary ROMs are contributed, the corresponding source
> > > +   must also be provided, either directly, or through a git submodule link.
> > +
> > + * Dockerfiles: the majority of the dockerfiles are automatically generated
> > +   from a canonical list of build dependencies maintained in tree, together
> > +   with the libvirt-ci git submodule link. The generated dockerfiles are
> > +   included in tree because it is desirable to be able to directly build
> > +   container images from a clean git checkout.
> > +
> > + * EBPF: QEMU includes some generated EBPF machine code, since the required
> > > +   eBPF compilation tools are not broadly available on all targeted OS
> > +   distributions. The corresponding eBPF C code for the binary is also
> > +   provided. This is a time limited exception until the eBPF toolchain is
> > +   sufficiently broadly available in distros.
> > +
> > +In all cases above, the existence of generated files must be acknowledged
> > +and justified in the commit that introduces them.
> > +
> > +Tools which perform changes to existing code with deterministic algorithmic
> > +manipulation, driven by user specified inputs, are not generally considered
> > +to be "generators".
> > +
> > > +IOW, using coccinelle to convert code from one pattern to another pattern, or
> > > +fixing docs typos with a spell checker, or transforming code using sed / awk /
> > > +etc, are not considered to be acts of code generation. Where an automated
> > > +manipulation is performed on code, however, this should be declared in the
> > > +commit message.
> > +
> > > +At times contributors may use or create scripts/tools to generate an initial
> > > +boilerplate code template which is

RE: [PATCH ats_vtd v2 21/25] atc: generic ATC that can be used by PCIe devices that support SVM

2024-05-17 Thread Duan, Zhenzhong



>-----Original Message-----
>From: CLEMENT MATHIEU--DRIF 
>Subject: [PATCH ats_vtd v2 21/25] atc: generic ATC that can be used by PCIe
>devices that support SVM
>
>As the SVM-capable devices will need to cache translations, we provide
>a first implementation.
>
>This cache uses a two-level design based on hash tables.
>The first level is indexed by a PASID and the second by a virtual address.
>
>Signed-off-by: Clément Mathieu--Drif 
>---
> tests/unit/meson.build |   1 +
> tests/unit/test-atc.c  | 502 +
> util/atc.c | 211 +
> util/atc.h | 117 ++
> util/meson.build   |   1 +
> 5 files changed, 832 insertions(+)
> create mode 100644 tests/unit/test-atc.c
> create mode 100644 util/atc.c
> create mode 100644 util/atc.h

Maybe the unit test can be split from functional change?
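
For anyone reviewing along, the two-level design described in the commit message (PASID first, then virtual address) can be pictured roughly as below. This is only a sketch under stated assumptions, not the code from util/atc.c: every name (`toy_atc`, `toy_space`, `toy_update`, `toy_lookup`) is hypothetical, small fixed-size arrays stand in for the hash tables, and a 4 KiB page size is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_PASIDS 4          /* hypothetical first-level capacity */
#define TOY_MAX_PAGES  8          /* hypothetical second-level capacity */
#define TOY_PAGE_MASK  0xfffULL   /* assumes 4 KiB pages */

struct toy_page_entry {
    int used;
    uint64_t iova;                /* page-aligned virtual address */
    uint64_t translated_addr;
};

struct toy_pasid_cache {
    int used;
    uint32_t pasid;
    struct toy_page_entry pages[TOY_MAX_PAGES];
};

struct toy_atc {
    struct toy_pasid_cache spaces[TOY_MAX_PASIDS];
};

/* First level: find the per-PASID cache, creating it only if asked to. */
static struct toy_pasid_cache *toy_space(struct toy_atc *atc, uint32_t pasid,
                                         int create)
{
    struct toy_pasid_cache *free_slot = NULL;

    for (size_t i = 0; i < TOY_MAX_PASIDS; i++) {
        struct toy_pasid_cache *s = &atc->spaces[i];

        if (s->used && s->pasid == pasid) {
            return s;
        }
        if (!s->used && !free_slot) {
            free_slot = s;
        }
    }
    if (create && free_slot) {
        free_slot->used = 1;
        free_slot->pasid = pasid;
        return free_slot;
    }
    return NULL;
}

/* Second level: insert or overwrite one page translation.
 * Returns 0 on success, -1 if the PASID is unknown or the cache is full. */
static int toy_update(struct toy_atc *atc, uint32_t pasid, uint64_t iova,
                      uint64_t translated_addr)
{
    struct toy_pasid_cache *s = toy_space(atc, pasid, 0);
    struct toy_page_entry *free_slot = NULL;
    uint64_t page = iova & ~TOY_PAGE_MASK;

    if (!s) {
        return -1;
    }
    for (size_t i = 0; i < TOY_MAX_PAGES; i++) {
        struct toy_page_entry *e = &s->pages[i];

        if (e->used && e->iova == page) {
            e->translated_addr = translated_addr;
            return 0;
        }
        if (!e->used && !free_slot) {
            free_slot = e;
        }
    }
    if (!free_slot) {
        return -1;
    }
    free_slot->used = 1;
    free_slot->iova = page;
    free_slot->translated_addr = translated_addr;
    return 0;
}

/* Lookup walks both levels; any address inside a cached page hits. */
static struct toy_page_entry *toy_lookup(struct toy_atc *atc, uint32_t pasid,
                                         uint64_t iova)
{
    struct toy_pasid_cache *s = toy_space(atc, pasid, 0);
    uint64_t page = iova & ~TOY_PAGE_MASK;

    if (!s) {
        return NULL;
    }
    for (size_t i = 0; i < TOY_MAX_PAGES; i++) {
        if (s->pages[i].used && s->pages[i].iova == page) {
            return &s->pages[i];
        }
    }
    return NULL;
}
```

The flow mirrors what test_single_entry() in the patch exercises: a lookup misses until the per-PASID cache exists and an entry has been added, and an address in the middle of the page then resolves to the same entry.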

>
>diff --git a/tests/unit/meson.build b/tests/unit/meson.build
>index 228a21d03c..5c9a6fe9f4 100644
>--- a/tests/unit/meson.build
>+++ b/tests/unit/meson.build
>@@ -52,6 +52,7 @@ tests = {
>   'test-interval-tree': [],
>   'test-xs-node': [qom],
>   'test-virtio-dmabuf': [meson.project_source_root() / 'hw/display/virtio-dmabuf.c'],
>+  'test-atc': []
> }
>
> if have_system or have_tools
>diff --git a/tests/unit/test-atc.c b/tests/unit/test-atc.c
>new file mode 100644
>index 00..60fa60924a
>--- /dev/null
>+++ b/tests/unit/test-atc.c
>@@ -0,0 +1,502 @@
>+/*
>+ * This program is free software; you can redistribute it and/or modify
>+ * it under the terms of the GNU General Public License as published by
>+ * the Free Software Foundation; either version 2 of the License, or
>+ * (at your option) any later version.
>+
>+ * This program is distributed in the hope that it will be useful,
>+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
>+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>+ * GNU General Public License for more details.
>+
>+ * You should have received a copy of the GNU General Public License along
>+ * with this program; if not, see .
>+ */
>+
>+#include "util/atc.h"
>+
>+static inline bool tlb_entry_equal(IOMMUTLBEntry *e1, IOMMUTLBEntry *e2)
>+{
>+if (!e1 || !e2) {
>+return !e1 && !e2;
>+}
>+return e1->iova == e2->iova &&
>+e1->addr_mask == e2->addr_mask &&
>+e1->pasid == e2->pasid &&
>+e1->perm == e2->perm &&
>+e1->target_as == e2->target_as &&
>+e1->translated_addr == e2->translated_addr;
>+}
>+
>+static void assert_lookup_equals(ATC *atc, IOMMUTLBEntry *target,
>+ uint32_t pasid, hwaddr iova)
>+{
>+IOMMUTLBEntry *result;
>+result = atc_lookup(atc, pasid, iova);
>+g_assert(tlb_entry_equal(result, target));
>+}
>+
>+static void check_creation(uint64_t page_size, uint8_t address_width,
>+   uint8_t levels, uint8_t level_offset,
>+   bool should_work) {
>+ATC *atc = atc_new(page_size, address_width);
>+if (atc) {
>+if (atc->levels != levels || atc->level_offset != level_offset) {
>+g_assert(false); /* ATC created but invalid configuration : fail */
>+}
>+atc_destroy(atc);
>+g_assert(should_work);
>+} else {
>+g_assert(!should_work);
>+}
>+}
>+
>+static void test_creation_parameters(void)
>+{
>+check_creation(8, 39, 3, 9, false);
>+check_creation(4095, 39, 3, 9, false);
>+check_creation(4097, 39, 3, 9, false);
>+check_creation(8192, 48, 0, 0, false);
>+
>+check_creation(4096, 38, 0, 0, false);
>+check_creation(4096, 39, 3, 9, true);
>+check_creation(4096, 40, 0, 0, false);
>+check_creation(4096, 47, 0, 0, false);
>+check_creation(4096, 48, 4, 9, true);
>+check_creation(4096, 49, 0, 0, false);
>+check_creation(4096, 56, 0, 0, false);
>+check_creation(4096, 57, 5, 9, true);
>+check_creation(4096, 58, 0, 0, false);
>+
>+check_creation(16384, 35, 0, 0, false);
>+check_creation(16384, 36, 2, 11, true);
>+check_creation(16384, 37, 0, 0, false);
>+check_creation(16384, 46, 0, 0, false);
>+check_creation(16384, 47, 3, 11, true);
>+check_creation(16384, 48, 0, 0, false);
>+check_creation(16384, 57, 0, 0, false);
>+check_creation(16384, 58, 4, 11, true);
>+check_creation(16384, 59, 0, 0, false);
>+}
>+
>+static void test_single_entry(void)
>+{
>+IOMMUTLBEntry entry = {
>+.iova = 0x123456789000ULL,
>+.addr_mask = 0xfffULL,
>+.pasid = 5,
>+.perm = IOMMU_RW,
>+.translated_addr = 0xdeadbeefULL,
>+};
>+
>+ATC *atc = atc_new(4096, 48);
>+g_assert(atc);
>+
>+assert_lookup_equals(atc, NULL, entry.pasid,
>+ entry.iova + (entry.addr_mask / 2));
>+
>+atc_create_address_space_cache(atc, entry.pasid);
>+g_assert(atc_update(atc, &entry) == 0);
>+
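
A note on the table in test_creation_parameters(): the accepted (page_size, address_width) pairs all fit one rule, which the sketch below spells out. It is inferred from the test values alone, not taken from util/atc.c. The names `atc_geometry` and `exact_log2` are hypothetical, and the 8-byte table entry size (hence `page_bits - 3`) is an assumption that happens to reproduce every row of the test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type; the real ATC struct lives in util/atc.h. */
struct atc_geometry {
    uint8_t levels;
    uint8_t level_offset;
};

/* Return log2(v) if v is a power of two, otherwise -1. */
static int exact_log2(uint64_t v)
{
    int bits = 0;

    if (v == 0 || (v & (v - 1)) != 0) {
        return -1;
    }
    while (v > 1) {
        v >>= 1;
        bits++;
    }
    return bits;
}

/*
 * Derive the number of levels and the bits consumed per level.
 * Assumption: each table page holds page_size / 8 entries, so one level
 * resolves (log2(page_size) - 3) address bits, and the bits above the page
 * offset must split evenly across the levels.
 */
static bool atc_geometry(uint64_t page_size, uint8_t address_width,
                         struct atc_geometry *out)
{
    int page_bits = exact_log2(page_size);
    int level_offset;
    int remaining;

    if (page_bits <= 3) {
        /* Not a power of two, or too small to hold any 8-byte entries. */
        return false;
    }
    level_offset = page_bits - 3;
    remaining = address_width - page_bits;
    if (remaining <= 0 || remaining % level_offset != 0) {
        return false;
    }
    out->levels = (uint8_t)(remaining / level_offset);
    out->level_offset = (uint8_t)level_offset;
    return true;
}
```

Under this rule 4096-byte pages give 9 bits per level (widths 39, 48, 57) and 16384-byte pages give 11 bits per level (widths 36, 47, 58), while 8192 with a 48-bit width is rejected because 35 bits do not divide into 10-bit levels.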
>+

RE: [PATCH ats_vtd v2 20/25] intel_iommu: fill the PASID field when creating an instance of IOMMUTLBEntry

2024-05-17 Thread Duan, Zhenzhong



>-----Original Message-----
>From: CLEMENT MATHIEU--DRIF 
>Subject: [PATCH ats_vtd v2 20/25] intel_iommu: fill the PASID field when
>creating an instance of IOMMUTLBEntry
>
>Signed-off-by: Clément Mathieu--Drif 
>---
> hw/i386/intel_iommu.c | 7 +++
> 1 file changed, 7 insertions(+)
>
>diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
>index 53f17d66c0..c4ebd4569e 100644
>--- a/hw/i386/intel_iommu.c
>+++ b/hw/i386/intel_iommu.c
>@@ -2299,6 +2299,7 @@ out:
> entry->translated_addr = vtd_get_slpte_addr(pte, s->aw_bits) & page_mask;
> entry->addr_mask = ~page_mask;
> entry->perm = access_flags;
>+entry->pasid = pasid;

For PCI_NO_PASID, do we want to assign PCI_NO_PASID or rid2pasid?

Thanks
Zhenzhong

> return true;
>
> error:
>@@ -2307,6 +2308,7 @@ error:
> entry->translated_addr = 0;
> entry->addr_mask = 0;
> entry->perm = IOMMU_NONE;
>+entry->pasid = PCI_NO_PASID;
> return false;
> }
>
>@@ -3497,6 +3499,7 @@ static void vtd_piotlb_pasid_invalidate_notify(IntelIOMMUState *s,
> event.entry.target_as = &address_space_memory;
> event.entry.iova = notifier->start;
> event.entry.perm = IOMMU_NONE;
>+event.entry.pasid = pasid;
> event.entry.addr_mask = notifier->end - notifier->start;
> event.entry.translated_addr = 0;
>
>@@ -3678,6 +3681,7 @@ static void vtd_piotlb_page_invalidate(IntelIOMMUState *s, uint16_t domain_id,
> event.entry.target_as = &address_space_memory;
> event.entry.iova = addr;
> event.entry.perm = IOMMU_NONE;
>+event.entry.pasid = pasid;
> event.entry.addr_mask = size - 1;
> event.entry.translated_addr = 0;
>
>@@ -4335,6 +4339,7 @@ static void do_invalidate_device_tlb(VTDAddressSpace *vtd_dev_as,
> event.entry.iova = addr;
> event.entry.perm = IOMMU_NONE;
> event.entry.translated_addr = 0;
>+event.entry.pasid = vtd_dev_as->pasid;
> memory_region_notify_iommu(&vtd_dev_as->iommu, 0, event);
> }
>
>@@ -4911,6 +4916,7 @@ static IOMMUTLBEntry vtd_iommu_translate(IOMMUMemoryRegion *iommu, hwaddr addr,
> IOMMUTLBEntry iotlb = {
> /* We'll fill in the rest later. */
> .target_as = &address_space_memory,
>+.pasid = vtd_as->pasid,
> };
> bool success;
>
>@@ -4923,6 +4929,7 @@ static IOMMUTLBEntry vtd_iommu_translate(IOMMUMemoryRegion *iommu, hwaddr addr,
> iotlb.translated_addr = addr & VTD_PAGE_MASK_4K;
> iotlb.addr_mask = ~VTD_PAGE_MASK_4K;
> iotlb.perm = IOMMU_RW;
>+iotlb.pasid = PCI_NO_PASID;
> success = true;
> }
>
>--
>2.44.0

[PATCH] intel_iommu: Use the latest fault reasons defined by spec

2024-05-17 Thread Zhenzhong Duan

From: Yu Zhang 

Currently we use only VTD_FR_PASID_TABLE_INV as fault reason.
Update with more detailed fault reasons listed in VT-d spec 7.2.3.

Signed-off-by: Yu Zhang 
Signed-off-by: Zhenzhong Duan 
---
 hw/i386/intel_iommu_internal.h |  8 +++-
 hw/i386/intel_iommu.c  | 25 -
 2 files changed, 23 insertions(+), 10 deletions(-)

diff --git a/hw/i386/intel_iommu_internal.h b/hw/i386/intel_iommu_internal.h
index f8cf99bddf..666e2cf2ce 100644
--- a/hw/i386/intel_iommu_internal.h
+++ b/hw/i386/intel_iommu_internal.h
@@ -311,7 +311,13 @@ typedef enum VTDFaultReason {
   * request while disabled */
 VTD_FR_IR_SID_ERR = 0x26,   /* Invalid Source-ID */
 
-VTD_FR_PASID_TABLE_INV = 0x58,  /*Invalid PASID table entry */
+/* PASID directory entry access failure */
+VTD_FR_PASID_DIR_ACCESS_ERR = 0x50,
+/* The Present(P) field of pasid directory entry is 0 */
+VTD_FR_PASID_DIR_ENTRY_P = 0x51,
+VTD_FR_PASID_TABLE_ACCESS_ERR = 0x58, /* PASID table entry access failure */
+VTD_FR_PASID_ENTRY_P = 0x59, /* The Present(P) field of pasidt-entry is 0 */
+VTD_FR_PASID_TABLE_ENTRY_INV = 0x5b,  /*Invalid PASID table entry */
 
 /* Output address in the interrupt address range for scalable mode */
 VTD_FR_SM_INTERRUPT_ADDR = 0x87,
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index cc8e59674e..0951ebb71d 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -771,7 +771,7 @@ static int vtd_get_pdire_from_pdir_table(dma_addr_t pasid_dir_base,
 addr = pasid_dir_base + index * entry_size;
 if (dma_memory_read(&address_space_memory, addr,
 pdire, entry_size, MEMTXATTRS_UNSPECIFIED)) {
-return -VTD_FR_PASID_TABLE_INV;
+return -VTD_FR_PASID_DIR_ACCESS_ERR;
 }
 
 pdire->val = le64_to_cpu(pdire->val);
@@ -789,6 +789,7 @@ static int vtd_get_pe_in_pasid_leaf_table(IntelIOMMUState *s,
   dma_addr_t addr,
   VTDPASIDEntry *pe)
 {
+uint8_t pgtt;
 uint32_t index;
 dma_addr_t entry_size;
 X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s);
@@ -798,7 +799,7 @@ static int vtd_get_pe_in_pasid_leaf_table(IntelIOMMUState *s,
 addr = addr + index * entry_size;
 if (dma_memory_read(&address_space_memory, addr,
 pe, entry_size, MEMTXATTRS_UNSPECIFIED)) {
-return -VTD_FR_PASID_TABLE_INV;
+return -VTD_FR_PASID_TABLE_ACCESS_ERR;
 }
 for (size_t i = 0; i < ARRAY_SIZE(pe->val); i++) {
 pe->val[i] = le64_to_cpu(pe->val[i]);
@@ -806,11 +807,13 @@ static int vtd_get_pe_in_pasid_leaf_table(IntelIOMMUState *s,
 
 /* Do translation type check */
 if (!vtd_pe_type_check(x86_iommu, pe)) {
-return -VTD_FR_PASID_TABLE_INV;
+return -VTD_FR_PASID_TABLE_ENTRY_INV;
 }
 
-if (!vtd_is_level_supported(s, VTD_PE_GET_LEVEL(pe))) {
-return -VTD_FR_PASID_TABLE_INV;
+pgtt = VTD_PE_GET_TYPE(pe);
+if (pgtt == VTD_SM_PASID_ENTRY_SLT &&
+!vtd_is_level_supported(s, VTD_PE_GET_LEVEL(pe))) {
+return -VTD_FR_PASID_TABLE_ENTRY_INV;
 }
 
 return 0;
@@ -851,7 +854,7 @@ static int vtd_get_pe_from_pasid_table(IntelIOMMUState *s,
 }
 
 if (!vtd_pdire_present(&pdire)) {
-return -VTD_FR_PASID_TABLE_INV;
+return -VTD_FR_PASID_DIR_ENTRY_P;
 }
 
 ret = vtd_get_pe_from_pdire(s, pasid, &pdire, pe);
@@ -860,7 +863,7 @@ static int vtd_get_pe_from_pasid_table(IntelIOMMUState *s,
 }
 
 if (!vtd_pe_present(pe)) {
-return -VTD_FR_PASID_TABLE_INV;
+return -VTD_FR_PASID_ENTRY_P;
 }
 
 return 0;
@@ -913,7 +916,7 @@ static int vtd_ce_get_pasid_fpd(IntelIOMMUState *s,
 }
 
 if (!vtd_pdire_present(&pdire)) {
-return -VTD_FR_PASID_TABLE_INV;
+return -VTD_FR_PASID_DIR_ENTRY_P;
 }
 
 /*
@@ -1770,7 +1773,11 @@ static const bool vtd_qualified_faults[] = {
 [VTD_FR_ROOT_ENTRY_RSVD] = false,
 [VTD_FR_PAGING_ENTRY_RSVD] = true,
 [VTD_FR_CONTEXT_ENTRY_TT] = true,
-[VTD_FR_PASID_TABLE_INV] = false,
+[VTD_FR_PASID_DIR_ACCESS_ERR] = false,
+[VTD_FR_PASID_DIR_ENTRY_P] = true,
+[VTD_FR_PASID_TABLE_ACCESS_ERR] = false,
+[VTD_FR_PASID_ENTRY_P] = true,
+[VTD_FR_PASID_TABLE_ENTRY_INV] = true,
 [VTD_FR_SM_INTERRUPT_ADDR] = true,
 [VTD_FR_MAX] = false,
 };
-- 
2.34.1
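
For readers skimming the diff, the last hunk boils down to a per-reason table lookup. The sketch below restates it in standalone form. The fault codes and the true/false flags are copied from the patch; the `FR_*` names, the `FR_MAX` bound and the `fault_is_qualified()` helper are hypothetical and exist only for this illustration. It assumes, as the helpers in the patch do, that errors travel as negative fault reasons.

```c
#include <assert.h>
#include <stdbool.h>

/* Fault codes as listed in the patch; names shortened for this sketch. */
enum pasid_fault_reason {
    FR_PASID_DIR_ACCESS_ERR   = 0x50, /* PASID directory entry access failure */
    FR_PASID_DIR_ENTRY_P      = 0x51, /* P field of PASID directory entry is 0 */
    FR_PASID_TABLE_ACCESS_ERR = 0x58, /* PASID table entry access failure */
    FR_PASID_ENTRY_P          = 0x59, /* P field of PASID table entry is 0 */
    FR_PASID_TABLE_ENTRY_INV  = 0x5b, /* Invalid PASID table entry */
    FR_MAX                    = 0x5c, /* hypothetical bound for this sketch */
};

/* Same qualified/unqualified flags as the vtd_qualified_faults[] hunk. */
static const bool qualified_faults[FR_MAX + 1] = {
    [FR_PASID_DIR_ACCESS_ERR]   = false,
    [FR_PASID_DIR_ENTRY_P]      = true,
    [FR_PASID_TABLE_ACCESS_ERR] = false,
    [FR_PASID_ENTRY_P]          = true,
    [FR_PASID_TABLE_ENTRY_INV]  = true,
};

/*
 * Take a negative return code (-FR_*) and report whether the
 * corresponding fault is qualified. Anything out of range is treated
 * as unqualified.
 */
static bool fault_is_qualified(int ret)
{
    if (ret > 0 || ret < -FR_MAX) {
        return false;
    }
    return qualified_faults[-ret];
}
```

The point of splitting the single VTD_FR_PASID_TABLE_INV code shows up here: access failures stay unqualified, while the "not present" and "invalid entry" cases become qualified.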

Re: [PATCH] ui/sdl2: Allow host to power down screen

2024-05-17 Thread Daniel P . Berrangé

Cc stable - candidate for backport perhaps.

On Sun, May 12, 2024 at 11:59:45AM +0200, Bernhard Beschow wrote:
> By default, SDL disables the screen saver which prevents the host from powering
> down the screen even if the screen is locked. This results in draining the
> battery needlessly when the host isn't connected to a wall charger. Fix that by
> enabling the screen saver.
> 
> Signed-off-by: Bernhard Beschow 
> ---
>  ui/sdl2.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/ui/sdl2.c b/ui/sdl2.c
> index 4971963f00..0a0eb5a42d 100644
> --- a/ui/sdl2.c
> +++ b/ui/sdl2.c
> @@ -874,6 +874,7 @@ static void sdl2_display_init(DisplayState *ds, DisplayOptions *o)
>  SDL_SetHint(SDL_HINT_ALLOW_ALT_TAB_WHILE_GRABBED, "0");
>  #endif
>  SDL_SetHint(SDL_HINT_WINDOWS_NO_CLOSE_ON_ALT_F4, "1");
> +SDL_EnableScreenSaver();
>  memset(&info, 0, sizeof(info));
>  SDL_VERSION(&info.version);
>  
> -- 
> 2.45.0
> 
> 

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: CXL numa error on arm64 qemu virt machine

2024-05-17 Thread Jonathan Cameron via

On Fri, 17 May 2024 18:07:07 +0800
Yuquan Wang  wrote:

> On Fri, May 10, 2024 at 06:16:46PM +0100, Jonathan Cameron wrote:
> > 
> > https://git.kernel.org/pub/scm/linux/kernel/git/jic23/cxl-staging.git/log/?h=arm-numa-fixes
> >   
> Thank you :)
> > I've run out of time to sort out cover letters and things + just before the merge
> > window is never a good time get anyone to pay attention to potentially controversial
> > patches.  So for now I've thrown up a branch on kernel.org with Robert's
> > series of fixes of related code (that's queued in the ACPI tree for the merge window)
> > and Dan Williams (from several years ago) + my additions that 'work' (lightly tested)
> > on qemu/arm64 with the generic port patches etc. 
> > 
> > I'll send out an RFC in a couple of weeks.  In meantime let me know if you
> > run into any problems or have suggestions to improve them.
> > 
> > Jonathan
> >  
> With the latest commit(d077bf9) in the 'arm-numa-fixes', the qemu virt
> could create a cxl region with a new numa node (node 2) just like x86.
> At this stage(the first time to create cxl region), everything works
> fine.
> 
> However, if I use below commands to delete the created cxl region:
> 
> `daxctl offline-memory dax0.0`
> `cxl disable-region region0`
> `cxl destroy-region region0`
> 
> and then recreate it by `cxl create-region -d decoder0.0 -t ram`, the
> kernel could not create the numa node2 again, and the kernel will print:
> 
> [  589.458971] Fallback order for Node 0: 0 1
> [  589.459136] Fallback order for Node 1: 1 0
> [  589.459175] Fallback order for Node 2: 0 1
> [  589.459213] Built 2 zonelists, mobility grouping on.  Total pages: 1009890
> [  589.459284] Policy zone: Normal

I'll see if I can figure out what is happening there.
> 
> Meanwhile, the qemu reports that: 
> 
> "qemu-system-aarch64: virtio: bogus descriptor or out of resources"

That sounds like another TCG issue, or possibly the DMA bounce buffer
problem resurfacing.  It's not directly related to this NUMA aspect unless
something very odd is going on.  I'm even more confused because I think
you are not using kmem with the above commands, so we shouldn't be using
the CXL memory for virtio.

Just to check, you aren't running with KVM I hope?  That opens a much
bigger problem set. :(

Jonathan



> 
> Many thanks
> Yuquan
>

RE: [PATCH v2 1/4] accel/kvm: Extract common KVM vCPU {creation, parking} code

2024-05-17 Thread Salil Mehta via

Hi Nick,

>  From: Nicholas Piggin 
>  Sent: Friday, May 17, 2024 4:44 AM
>  
>  On Thu May 16, 2024 at 11:35 PM AEST, Salil Mehta wrote:
>  >
>  > >  From: Harsh Prateek Bora 
>  > >  Sent: Thursday, May 16, 2024 2:07 PM
>  > >
>  > >  Hi Salil,
>  > >
>  > >  On 5/16/24 17:42, Salil Mehta wrote:
>  > >  > Hi Harsh,
>  > >  >
>  > >  >>   From: Harsh Prateek Bora 
>  > >  >>   Sent: Thursday, May 16, 2024 11:15 AM
>  > >  >>
>  > >  >>   Hi Salil,
>  > >  >>
>  > >  >>   Thanks for your email.
>  > >  >>   Your patch 1/8 is included here based on review comments on my previous
>  > >  >>   patch from one of the maintainers in the community and therefore I had
>  > >  >>   kept you in CC to be aware of the desire of having this independent patch to
>  > >  >>   get merged earlier even if your other patches in the series may go through
>  > >  >>   further reviews.
>  > >  >
>  > >  > I really don’t know which discussion you are pointing at. Please
>  > >  > understand you are fixing a bug and we are pushing a feature which has got a large series.
>  > >  > It will break the patch-set which is about to be merged.
>  > >  >
>  > >  > There will be significant overhead of testing on us for the work
>  > >  > we have been carrying forward for a long time. This will be disruptive. Please don't!
>  > >  >
>  > >
>  > >  I was referring to the review discussion on my prev patch here:
>  > >
>  > > https://lore.kernel.org/qemu-devel/d191d2jfar7l.2eh4s445m4...@gmail.com/
>  >
>  >
>  > Sure, I'm not sure what this means.
>  >
>  >
>  > >  Although your patch was included with this series only to
>  > >  facilitate review of the additional patches depending on just one of your patches.
>  >
>  >
>  > Generally you rebase your patch-set over the other and clearly state
>  > on the cover letter that this patch-set is dependent upon such and
>  > such patch-set. Just imagine if everyone starts to unilaterally pick
>  > up patches from each other's patch-set it will create chaos not only for
>  > the feature owners but also for the maintainers.
>  >
>  >
>  > >
>  > >  I am not sure what is appearing disruptive here. It is a common
>  > >  practice in the community that maintainer(s) can pick individual
>  > >  patches from the series if it has been vetted by a significant number of reviewers.
>  >
>  >
>  > Don’t you think this patch-set is asking for acceptance for a patch
>  > already part of another patch-set which is about to be accepted and is a bigger feature?
>  > Will it cause maintenance overhead at the last moment? Yes, of course!
>  >
>  >
>  > >  However, in this case, since you have mentioned to post next
>  > >  version soon, you need not worry about it as that would be the
>  > >  preferred version for both of the series.
>  >
>  >
>  > Yes, but please understand we are working for the benefit of overall community.
>  > Please cooperate here.
>  
>  There might be a misunderstanding, Harsh just said there had not been
>  much progress on your series for a while and he wasn't sure what the status
>  was. I mentioned that we *could* take your patch 1 (with your
>  blessing) if there was a hold up with the rest of the series. He was going to
>  check in with you to see how it was going.


Thanks for the clarification. No issues. I'm planning to float V9 of this series by
Monday and perhaps that’s all you want.

As such, the new cycle started on 23rd April and we have been busy rebasing and
testing. This series works in conjunction with other series. We have to ensure both
are compatible.


>  This patch 1 was not intended to be merged as is without syncing up with
>  you first, but it's understandable you were concerned because that was
>  probably not communicated with you clearly.


No issues. I think we are all on the same page now. I understand your
requirement. We are trying our best to expedite acceptance of this series.
Perhaps your reviews on V9 might help.


>  
>  I appreciate you bringing up your concerns, we'll try to do better.

No problem. Thanks

Salil.

>  
>  Thanks,
>  Nick

Re: CXL numa error on arm64 qemu virt machine

2024-05-17 Thread Yuquan Wang

On Fri, May 10, 2024 at 06:16:46PM +0100, Jonathan Cameron wrote:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/jic23/cxl-staging.git/log/?h=arm-numa-fixes
> 
Thank you :)
> I've run out of time to sort out cover letters and things + just before the merge
> window is never a good time get anyone to pay attention to potentially controversial
> patches.  So for now I've thrown up a branch on kernel.org with Robert's
> series of fixes of related code (that's queued in the ACPI tree for the merge window)
> and Dan Williams (from several years ago) + my additions that 'work' (lightly tested)
> on qemu/arm64 with the generic port patches etc. 
> 
> I'll send out an RFC in a couple of weeks.  In meantime let me know if you
> run into any problems or have suggestions to improve them.
> 
> Jonathan
>
With the latest commit (d077bf9) in the 'arm-numa-fixes' branch, the qemu virt
machine could create a cxl region with a new numa node (node 2) just like x86.
At this stage (the first time the cxl region is created), everything works
fine.

However, if I use the commands below to delete the created cxl region:

`daxctl offline-memory dax0.0`
`cxl disable-region region0`
`cxl destroy-region region0`

and then recreate it by `cxl create-region -d decoder0.0 -t ram`, the
kernel could not create the numa node2 again, and the kernel will print:

[  589.458971] Fallback order for Node 0: 0 1
[  589.459136] Fallback order for Node 1: 1 0
[  589.459175] Fallback order for Node 2: 0 1
[  589.459213] Built 2 zonelists, mobility grouping on.  Total pages: 1009890
[  589.459284] Policy zone: Normal

Meanwhile, the qemu reports that: 

"qemu-system-aarch64: virtio: bogus descriptor or out of resources"

Many thanks
Yuquan

Re: [PATCH v2 1/3] docs: introduce dedicated page about code provenance / sign-off

2024-05-17 Thread Daniel P . Berrangé

On Fri, May 17, 2024 at 07:05:05AM +0200, Thomas Huth wrote:
> On 16/05/2024 19.43, Peter Maydell wrote:
> > On Thu, 16 May 2024 at 18:34, Michael S. Tsirkin  wrote:
> > > 
> > > On Thu, May 16, 2024 at 06:29:39PM +0100, Peter Maydell wrote:
> > > > > On Thu, 16 May 2024 at 17:22, Daniel P. Berrangé wrote:
> > > > > 
> > > > > Currently we have a short paragraph saying that patches must include
> > > > > a Signed-off-by line, and merely link to the kernel documentation.
> > > > > The linked kernel docs have a lot of content beyond the part about
> > > > > > sign-off and thus are misleading/distracting to QEMU contributors.
> > > > 
> > > > Thanks for this -- I've felt for ages that it was a bit awkward
> > > > that we didn't have a good place to link people to for the fuller
> > > > explanation of this.
> > > > 
> > > > > This introduces a dedicated 'code-provenance' page in QEMU talking
> > > > > about why we require sign-off, explaining the other tags we commonly
> > > > > use, and what to do in some edge cases.
> > > > 
> > > > The version of the kernel SubmittingPatches we used to link to
> > > > includes the text "sorry, no pseudonyms or anonymous contributions".
> > > > This new documentation doesn't say anything either way about
> > > > our approach to pseudonyms. I think we should probably say
> > > > something, but I don't know if we have an in-practice consensus
> > > > there, so maybe we should approach that as a separate change on
> > > > top of this patch.
> > > 
> > > 
> > > Well given we referred to kernel previously then I guess that's
> > > > the consensus, no?
> > 
> > AIUI the kernel devs have changed their point of view on the
> > pseudonym question, so it's a question of whether we were
> > deliberately referring to that specific revision of the kernel's
> > practice because we agreed with it or just by chance...
> > 
> > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=d4563201f33a022fc0353033d9dfeb1606a88330
> > 
> > is where the kernel changed to saying merely "no anonymous
> > contributions", dropping the 'pseudonyms' part.
> 
> FWIW, we had a clear statement in our document in the past:
> 
> https://gitlab.com/qemu-project/qemu/-/commit/ca127fe96ddb827f3ea153610c1e8f6e374708e2#9620a1442f724c9d8bfd5408e4611ba1839fcb8a_315_321
> 
> Quoting: "Please use your real name to sign a patch (not an alias or acronym)."
> 
> But it got lost in that rework, I assume by accident?

Yeah, probably an oversight.

> So IMHO we had a consensus once to not allow anonymous contributions. I'm in
> favor of adding such a sentence back here now.

That text has been in the submitting-a-patch file since day 1, but that
content was originally a copy of the old wiki page, and the wiki edits
never had any formal peer review, so we should be wary of claiming too
much about a consensus.

Going back in history we can see the specific wording arrived with
this change:

  https://wiki.qemu.org/index.php?title=Contribute%2FSubmitAPatch=revision=2173=2094

This may have been an informally held opinion amongst at least some
of those in the community at the time, but I don't recall there being a
specific debate about the allowance of pseudonyms, etc.



I have traditionally been in favour of requiring real names, which I
had pretty much interpreted to imply a person's legal name. That was
mostly because I was following what I (apparently incorrectly) thought
was the kernel's intent in this respect.

Looking at the kernel commit above, I have sympathy with the view that
interpreting "real name" too strictly as a "legal name" is exclusionary.

Thus I'd be in favour of following the kernel's clarified intent, which
broadly aligns with the CNCF explanatory text, that "real name" can be
loosely interpreted to be "a commonly known identity in the community".

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [PATCH 0/2] Zynq 7000 SoC improvements

2024-05-17 Thread Peter Maydell

On Fri, 17 May 2024 at 09:31, Sebastian Huber
 wrote:
>
> Hello,
>
> is the mailing list the right place for contributions like this?

Yes it is, and this is on my todo list to review. Sorry for
not getting back to you earlier, but I was on holiday last
week and at a conference this week. I hope to be able to start
working through my code review backlog when I'm at my desk
again next week :-)

-- PMM

[PATCH v3 07/11] hw/nvme: add helper functions for converting reservation types

2024-05-17 Thread Changqi Lu

This commit introduces two helper functions
that facilitate the conversion between the
reservation types used in the NVME protocol
and those used in the block layer.

Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Changqi Lu 
Signed-off-by: zhenwei pi 
---
 hw/nvme/nvme.h | 40 
 1 file changed, 40 insertions(+)

diff --git a/hw/nvme/nvme.h b/hw/nvme/nvme.h
index bed8191bd5..6abe479410 100644
--- a/hw/nvme/nvme.h
+++ b/hw/nvme/nvme.h
@@ -474,6 +474,46 @@ static inline const char *nvme_io_opc_str(uint8_t opc)
 }
 }
 
+static inline NVMEResvType block_pr_type_to_nvme(BlockPrType type)
+{
+switch (type) {
+case BLK_PR_WRITE_EXCLUSIVE:
+return NVME_RESV_WRITE_EXCLUSIVE;
+case BLK_PR_EXCLUSIVE_ACCESS:
+return NVME_RESV_EXCLUSIVE_ACCESS;
+case BLK_PR_WRITE_EXCLUSIVE_REGS_ONLY:
+return NVME_RESV_WRITE_EXCLUSIVE_REGS_ONLY;
+case BLK_PR_EXCLUSIVE_ACCESS_REGS_ONLY:
+return NVME_RESV_EXCLUSIVE_ACCESS_REGS_ONLY;
+case BLK_PR_WRITE_EXCLUSIVE_ALL_REGS:
+return NVME_RESV_WRITE_EXCLUSIVE_ALL_REGS;
+case BLK_PR_EXCLUSIVE_ACCESS_ALL_REGS:
+return NVME_RESV_EXCLUSIVE_ACCESS_ALL_REGS;
+}
+
+return 0;
+}
+
+static inline BlockPrType nvme_pr_type_to_block(NVMEResvType type)
+{
+switch (type) {
+case NVME_RESV_WRITE_EXCLUSIVE:
+return BLK_PR_WRITE_EXCLUSIVE;
+case NVME_RESV_EXCLUSIVE_ACCESS:
+return BLK_PR_EXCLUSIVE_ACCESS;
+case NVME_RESV_WRITE_EXCLUSIVE_REGS_ONLY:
+return BLK_PR_WRITE_EXCLUSIVE_REGS_ONLY;
+case NVME_RESV_EXCLUSIVE_ACCESS_REGS_ONLY:
+return BLK_PR_EXCLUSIVE_ACCESS_REGS_ONLY;
+case NVME_RESV_WRITE_EXCLUSIVE_ALL_REGS:
+return BLK_PR_WRITE_EXCLUSIVE_ALL_REGS;
+case NVME_RESV_EXCLUSIVE_ACCESS_ALL_REGS:
+return BLK_PR_EXCLUSIVE_ACCESS_ALL_REGS;
+}
+
+return 0;
+}
+
 typedef struct NvmeSQueue {
 struct NvmeCtrl *ctrl;
 uint16_tsqid;
-- 
2.20.1
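
Not part of the patch: a minimal sketch of a checked variant of the
conversion. Both helpers above fall through to "return 0" on an unmatched
value, and 0 is not one of the NVMEResvType values defined in patch 06, so
a caller has to know that convention to detect a failed conversion. The
function name nvme_resv_type_to_block_checked() is hypothetical, and the
BlockPrType values are stand-ins: the real definitions come from
include/block/block-common.h in patch 01 and are not shown in this part of
the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Values copied from patch 06 (include/block/nvme.h). */
typedef enum {
    NVME_RESV_WRITE_EXCLUSIVE            = 0x01,
    NVME_RESV_EXCLUSIVE_ACCESS           = 0x02,
    NVME_RESV_WRITE_EXCLUSIVE_REGS_ONLY  = 0x03,
    NVME_RESV_EXCLUSIVE_ACCESS_REGS_ONLY = 0x04,
    NVME_RESV_WRITE_EXCLUSIVE_ALL_REGS   = 0x05,
    NVME_RESV_EXCLUSIVE_ACCESS_ALL_REGS  = 0x06,
} NVMEResvType;

/* ASSUMPTION: stand-in values, the real BlockPrType is not shown here. */
typedef enum {
    BLK_PR_WRITE_EXCLUSIVE            = 1,
    BLK_PR_EXCLUSIVE_ACCESS           = 2,
    BLK_PR_WRITE_EXCLUSIVE_REGS_ONLY  = 3,
    BLK_PR_EXCLUSIVE_ACCESS_REGS_ONLY = 4,
    BLK_PR_WRITE_EXCLUSIVE_ALL_REGS   = 5,
    BLK_PR_EXCLUSIVE_ACCESS_ALL_REGS  = 6,
} BlockPrType;

/*
 * Hypothetical checked variant: returns false and leaves *out untouched
 * when the raw value is not a known NVMe reservation type.
 */
static bool nvme_resv_type_to_block_checked(unsigned raw, BlockPrType *out)
{
    assert(out != NULL);

    switch (raw) {
    case NVME_RESV_WRITE_EXCLUSIVE:
        *out = BLK_PR_WRITE_EXCLUSIVE;
        break;
    case NVME_RESV_EXCLUSIVE_ACCESS:
        *out = BLK_PR_EXCLUSIVE_ACCESS;
        break;
    case NVME_RESV_WRITE_EXCLUSIVE_REGS_ONLY:
        *out = BLK_PR_WRITE_EXCLUSIVE_REGS_ONLY;
        break;
    case NVME_RESV_EXCLUSIVE_ACCESS_REGS_ONLY:
        *out = BLK_PR_EXCLUSIVE_ACCESS_REGS_ONLY;
        break;
    case NVME_RESV_WRITE_EXCLUSIVE_ALL_REGS:
        *out = BLK_PR_WRITE_EXCLUSIVE_ALL_REGS;
        break;
    case NVME_RESV_EXCLUSIVE_ACCESS_ALL_REGS:
        *out = BLK_PR_EXCLUSIVE_ACCESS_ALL_REGS;
        break;
    default:
        return false;
    }

    return true;
}
```

A command handler could then reject a guest-supplied type it does not
recognise instead of passing 0 down to the block layer.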

[PATCH v3 06/11] block/nvme: add reservation command protocol constants

2024-05-17 Thread Changqi Lu

Add constants for the NVMe persistent reservation command protocol.
The constants include the reservation command opcode and
reservation type values defined in section 7 of the NVMe
2.0 specification.

Signed-off-by: Changqi Lu 
Signed-off-by: zhenwei pi 
---
 include/block/nvme.h | 61 
 1 file changed, 61 insertions(+)

diff --git a/include/block/nvme.h b/include/block/nvme.h
index bb231d0b9a..84e2b2e401 100644
--- a/include/block/nvme.h
+++ b/include/block/nvme.h
@@ -633,6 +633,10 @@ enum NvmeIoCommands {
 NVME_CMD_WRITE_ZEROES   = 0x08,
 NVME_CMD_DSM= 0x09,
 NVME_CMD_VERIFY = 0x0c,
+NVME_CMD_RESV_REGISTER  = 0x0d,
+NVME_CMD_RESV_REPORT= 0x0e,
+NVME_CMD_RESV_ACQUIRE   = 0x11,
+NVME_CMD_RESV_RELEASE   = 0x15,
 NVME_CMD_IO_MGMT_RECV   = 0x12,
 NVME_CMD_COPY   = 0x19,
 NVME_CMD_IO_MGMT_SEND   = 0x1d,
@@ -641,6 +645,63 @@ enum NvmeIoCommands {
 NVME_CMD_ZONE_APPEND= 0x7d,
 };
 
+typedef enum {
+NVME_RESV_REGISTER_ACTION_REGISTER  = 0x00,
+NVME_RESV_REGISTER_ACTION_UNREGISTER= 0x01,
+NVME_RESV_REGISTER_ACTION_REPLACE   = 0x02,
+} NVME_RESV_REGISTER_ACTION;
+
+typedef enum {
+NVME_RESV_RELEASE_ACTION_RELEASE= 0x00,
+NVME_RESV_RELEASE_ACTION_CLEAR  = 0x01,
+} NVME_RESV_RELEASE_ACTION;
+
+typedef enum {
+NVME_RESV_ACQUIRE_ACTION_ACQUIRE= 0x00,
+NVME_RESV_ACQUIRE_ACTION_PREEMPT= 0x01,
+NVME_RESV_ACQUIRE_ACTION_PREEMPT_AND_ABORT  = 0x02,
+} NVME_RESV_ACQUIRE_ACTION;
+
+typedef enum {
+NVME_RESV_WRITE_EXCLUSIVE   = 0x01,
+NVME_RESV_EXCLUSIVE_ACCESS  = 0x02,
+NVME_RESV_WRITE_EXCLUSIVE_REGS_ONLY = 0x03,
+NVME_RESV_EXCLUSIVE_ACCESS_REGS_ONLY= 0x04,
+NVME_RESV_WRITE_EXCLUSIVE_ALL_REGS  = 0x05,
+NVME_RESV_EXCLUSIVE_ACCESS_ALL_REGS = 0x06,
+} NVMEResvType;
+
+typedef enum {
+NVME_RESV_PTPL_NO_CHANGE = 0x00,
+NVME_RESV_PTPL_DISABLE   = 0x02,
+NVME_RESV_PTPL_ENABLE= 0x03,
+} NVMEResvPTPL;
+
+typedef enum NVMEPrCap {
+/* Persist Through Power Loss */
+NVME_PR_CAP_PTPL = 1 << 0,
+/* Write Exclusive reservation type */
+NVME_PR_CAP_WR_EX = 1 << 1,
+/* Exclusive Access reservation type */
+NVME_PR_CAP_EX_AC = 1 << 2,
+/* Write Exclusive Registrants Only reservation type */
+NVME_PR_CAP_WR_EX_RO = 1 << 3,
+/* Exclusive Access Registrants Only reservation type */
+NVME_PR_CAP_EX_AC_RO = 1 << 4,
+/* Write Exclusive All Registrants reservation type */
+NVME_PR_CAP_WR_EX_AR = 1 << 5,
+/* Exclusive Access All Registrants reservation type */
+NVME_PR_CAP_EX_AC_AR = 1 << 6,
+
+NVME_PR_CAP_ALL = (NVME_PR_CAP_PTPL |
+  NVME_PR_CAP_WR_EX |
+  NVME_PR_CAP_EX_AC |
+  NVME_PR_CAP_WR_EX_RO |
+  NVME_PR_CAP_EX_AC_RO |
+  NVME_PR_CAP_WR_EX_AR |
+  NVME_PR_CAP_EX_AC_AR),
+} NVMEPrCap;
+
 typedef struct QEMU_PACKED NvmeDeleteQ {
 uint8_t opcode;
 uint8_t flags;
-- 
2.20.1
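
Not part of the patch: a small sketch of how the two sets of values above
line up. Each NVME_PR_CAP_* bit for a reservation type sits at the bit
position equal to the matching NVMEResvType value (type 0x01 is bit 1, up
to type 0x06 at bit 6), and bit 0 is PTPL. The constants are copied from
the diff; the helper name resv_type_supported() is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the diff above. */
typedef enum {
    NVME_RESV_WRITE_EXCLUSIVE            = 0x01,
    NVME_RESV_EXCLUSIVE_ACCESS           = 0x02,
    NVME_RESV_WRITE_EXCLUSIVE_REGS_ONLY  = 0x03,
    NVME_RESV_EXCLUSIVE_ACCESS_REGS_ONLY = 0x04,
    NVME_RESV_WRITE_EXCLUSIVE_ALL_REGS   = 0x05,
    NVME_RESV_EXCLUSIVE_ACCESS_ALL_REGS  = 0x06,
} NVMEResvType;

typedef enum {
    NVME_PR_CAP_PTPL     = 1 << 0,
    NVME_PR_CAP_WR_EX    = 1 << 1,
    NVME_PR_CAP_EX_AC    = 1 << 2,
    NVME_PR_CAP_WR_EX_RO = 1 << 3,
    NVME_PR_CAP_EX_AC_RO = 1 << 4,
    NVME_PR_CAP_WR_EX_AR = 1 << 5,
    NVME_PR_CAP_EX_AC_AR = 1 << 6,
} NVMEPrCap;

/* Compile-time check of the "bit position == type value" relationship. */
static_assert(NVME_PR_CAP_WR_EX == 1 << NVME_RESV_WRITE_EXCLUSIVE,
              "first type maps to bit 1");
static_assert(NVME_PR_CAP_EX_AC_AR == 1 << NVME_RESV_EXCLUSIVE_ACCESS_ALL_REGS,
              "last type maps to bit 6");

/*
 * Hypothetical helper: does a namespace's RESCAP byte advertise the given
 * reservation type? Out-of-range types are reported as unsupported.
 */
static bool resv_type_supported(uint8_t rescap, unsigned type)
{
    if (type < NVME_RESV_WRITE_EXCLUSIVE ||
        type > NVME_RESV_EXCLUSIVE_ACCESS_ALL_REGS) {
        return false;
    }
    return (rescap & (1u << type)) != 0;
}
```

The range check matters because bit 0 of RESCAP is PTPL, so a raw type of
0 must not be treated as a supported reservation type.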

[PATCH v3 08/11] hw/nvme: enable ONCS reservations

2024-05-17 Thread Changqi Lu

This commit enables ONCS to support the reservation
function at the controller level. It also lays the
groundwork for detecting and enabling the reservation
function on a per-namespace basis in RESCAP.

Signed-off-by: Changqi Lu 
Signed-off-by: zhenwei pi 
---
 hw/nvme/ctrl.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index 127c3d2383..182307a48b 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -8248,7 +8248,8 @@ static void nvme_init_ctrl(NvmeCtrl *n, PCIDevice *pci_dev)
 id->nn = cpu_to_le32(NVME_MAX_NAMESPACES);
 id->oncs = cpu_to_le16(NVME_ONCS_WRITE_ZEROES | NVME_ONCS_TIMESTAMP |
NVME_ONCS_FEATURES | NVME_ONCS_DSM |
-   NVME_ONCS_COMPARE | NVME_ONCS_COPY);
+   NVME_ONCS_COMPARE | NVME_ONCS_COPY |
+   NVME_ONCS_RESRVATIONS);
 
 /*
  * NOTE: If this device ever supports a command set that does NOT use 0x0
-- 
2.20.1

[PATCH v3 04/11] scsi/util: add helper functions for persistent reservation types conversion

2024-05-17 Thread Changqi Lu

This commit introduces two helper functions
that facilitate the conversion between the
persistent reservation types used in the SCSI
protocol and those used in the block layer.

Signed-off-by: Changqi Lu 
Signed-off-by: zhenwei pi 
---
 include/scsi/utils.h |  8 +
 scsi/utils.c | 81 
 2 files changed, 89 insertions(+)

diff --git a/include/scsi/utils.h b/include/scsi/utils.h
index d5c8efa16e..89a0b082fb 100644
--- a/include/scsi/utils.h
+++ b/include/scsi/utils.h
@@ -1,6 +1,8 @@
 #ifndef SCSI_UTILS_H
 #define SCSI_UTILS_H
 
+#include "block/block-common.h"
+#include "scsi/constants.h"
 #ifdef CONFIG_LINUX
 #include <scsi/sg.h>
 #endif
@@ -135,6 +137,12 @@ uint32_t scsi_data_cdb_xfer(uint8_t *buf);
 uint32_t scsi_cdb_xfer(uint8_t *buf);
 int scsi_cdb_length(uint8_t *buf);
 
+BlockPrType scsi_pr_type_to_block(SCSIPrType type);
+SCSIPrType block_pr_type_to_scsi(BlockPrType type);
+
+uint8_t scsi_pr_cap_to_block(uint16_t scsi_pr_cap);
+uint16_t block_pr_cap_to_scsi(uint8_t block_pr_cap);
+
 /* Linux SG_IO interface.  */
 #ifdef CONFIG_LINUX
 #define SG_ERR_DRIVER_TIMEOUT  0x06
diff --git a/scsi/utils.c b/scsi/utils.c
index 357b036671..0dfdeb499d 100644
--- a/scsi/utils.c
+++ b/scsi/utils.c
@@ -658,3 +658,84 @@ int scsi_sense_from_host_status(uint8_t host_status,
 }
 return GOOD;
 }
+
+BlockPrType scsi_pr_type_to_block(SCSIPrType type)
+{
+switch (type) {
+case SCSI_PR_WRITE_EXCLUSIVE:
+return BLK_PR_WRITE_EXCLUSIVE;
+case SCSI_PR_EXCLUSIVE_ACCESS:
+return BLK_PR_EXCLUSIVE_ACCESS;
+case SCSI_PR_WRITE_EXCLUSIVE_REGS_ONLY:
+return BLK_PR_WRITE_EXCLUSIVE_REGS_ONLY;
+case SCSI_PR_EXCLUSIVE_ACCESS_REGS_ONLY:
+return BLK_PR_EXCLUSIVE_ACCESS_REGS_ONLY;
+case SCSI_PR_WRITE_EXCLUSIVE_ALL_REGS:
+return BLK_PR_WRITE_EXCLUSIVE_ALL_REGS;
+case SCSI_PR_EXCLUSIVE_ACCESS_ALL_REGS:
+return BLK_PR_EXCLUSIVE_ACCESS_ALL_REGS;
+}
+
+return 0;
+}
+
+SCSIPrType block_pr_type_to_scsi(BlockPrType type)
+{
+switch (type) {
+case BLK_PR_WRITE_EXCLUSIVE:
+return SCSI_PR_WRITE_EXCLUSIVE;
+case BLK_PR_EXCLUSIVE_ACCESS:
+return SCSI_PR_EXCLUSIVE_ACCESS;
+case BLK_PR_WRITE_EXCLUSIVE_REGS_ONLY:
+return SCSI_PR_WRITE_EXCLUSIVE_REGS_ONLY;
+case BLK_PR_EXCLUSIVE_ACCESS_REGS_ONLY:
+return SCSI_PR_EXCLUSIVE_ACCESS_REGS_ONLY;
+case BLK_PR_WRITE_EXCLUSIVE_ALL_REGS:
+return SCSI_PR_WRITE_EXCLUSIVE_ALL_REGS;
+case BLK_PR_EXCLUSIVE_ACCESS_ALL_REGS:
+return SCSI_PR_EXCLUSIVE_ACCESS_ALL_REGS;
+}
+
+return 0;
+}
+
+
+uint8_t scsi_pr_cap_to_block(uint16_t scsi_pr_cap)
+{
+uint8_t res = 0;
+
+res |= (scsi_pr_cap & SCSI_PR_CAP_WR_EX) ?
+   BLK_PR_CAP_WR_EX : 0;
+res |= (scsi_pr_cap & SCSI_PR_CAP_EX_AC) ?
+   BLK_PR_CAP_EX_AC : 0;
+res |= (scsi_pr_cap & SCSI_PR_CAP_WR_EX_RO) ?
+   BLK_PR_CAP_WR_EX_RO : 0;
+res |= (scsi_pr_cap & SCSI_PR_CAP_EX_AC_RO) ?
+   BLK_PR_CAP_EX_AC_RO : 0;
+res |= (scsi_pr_cap & SCSI_PR_CAP_WR_EX_AR) ?
+   BLK_PR_CAP_WR_EX_AR : 0;
+res |= (scsi_pr_cap & SCSI_PR_CAP_EX_AC_AR) ?
+   BLK_PR_CAP_EX_AC_AR : 0;
+
+return res;
+}
+
+uint16_t block_pr_cap_to_scsi(uint8_t block_pr_cap)
+{
+uint16_t res = 0;
+
+res |= (block_pr_cap & BLK_PR_CAP_WR_EX) ?
+  SCSI_PR_CAP_WR_EX : 0;
+res |= (block_pr_cap & BLK_PR_CAP_EX_AC) ?
+  SCSI_PR_CAP_EX_AC : 0;
+res |= (block_pr_cap & BLK_PR_CAP_WR_EX_RO) ?
+  SCSI_PR_CAP_WR_EX_RO : 0;
+res |= (block_pr_cap & BLK_PR_CAP_EX_AC_RO) ?
+  SCSI_PR_CAP_EX_AC_RO : 0;
+res |= (block_pr_cap & BLK_PR_CAP_WR_EX_AR) ?
+  SCSI_PR_CAP_WR_EX_AR : 0;
+res |= (block_pr_cap & BLK_PR_CAP_EX_AC_AR) ?
+  SCSI_PR_CAP_EX_AC_AR : 0;
+
+return res;
+}
-- 
2.20.1
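
Not part of the patch: a table-driven sketch of the same capability
translation, which keeps both directions in one place. The SCSI_PR_CAP_*
values are copied from patch 03. The BLK_PR_CAP_* values are an
assumption: they are not shown in this part of the thread, and are taken
here to mirror NVME_PR_CAP_* because the comment in patch 09 says the
block and NVMe definitions are currently the same. The names pr_cap_map,
pr_cap_scsi_to_blk() and pr_cap_blk_to_scsi() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values copied from patch 03 (include/scsi/constants.h). */
enum {
    SCSI_PR_CAP_EX_AC_AR = 1 << 0,
    SCSI_PR_CAP_WR_EX    = 1 << 9,
    SCSI_PR_CAP_EX_AC    = 1 << 11,
    SCSI_PR_CAP_WR_EX_RO = 1 << 13,
    SCSI_PR_CAP_EX_AC_RO = 1 << 14,
    SCSI_PR_CAP_WR_EX_AR = 1 << 15,
};

/* ASSUMPTION: mirrors NVME_PR_CAP_* from patch 06. */
enum {
    BLK_PR_CAP_PTPL     = 1 << 0,
    BLK_PR_CAP_WR_EX    = 1 << 1,
    BLK_PR_CAP_EX_AC    = 1 << 2,
    BLK_PR_CAP_WR_EX_RO = 1 << 3,
    BLK_PR_CAP_EX_AC_RO = 1 << 4,
    BLK_PR_CAP_WR_EX_AR = 1 << 5,
    BLK_PR_CAP_EX_AC_AR = 1 << 6,
};

/* One row per reservation type; PTPL has no SCSI type-mask bit. */
static const struct {
    uint16_t scsi;
    uint8_t blk;
} pr_cap_map[] = {
    { SCSI_PR_CAP_WR_EX,    BLK_PR_CAP_WR_EX    },
    { SCSI_PR_CAP_EX_AC,    BLK_PR_CAP_EX_AC    },
    { SCSI_PR_CAP_WR_EX_RO, BLK_PR_CAP_WR_EX_RO },
    { SCSI_PR_CAP_EX_AC_RO, BLK_PR_CAP_EX_AC_RO },
    { SCSI_PR_CAP_WR_EX_AR, BLK_PR_CAP_WR_EX_AR },
    { SCSI_PR_CAP_EX_AC_AR, BLK_PR_CAP_EX_AC_AR },
};

#define PR_CAP_MAP_LEN (sizeof(pr_cap_map) / sizeof(pr_cap_map[0]))

static_assert(PR_CAP_MAP_LEN == 6, "one entry per reservation type");

static uint8_t pr_cap_scsi_to_blk(uint16_t scsi_cap)
{
    uint8_t res = 0;

    for (size_t i = 0; i < PR_CAP_MAP_LEN; i++) {
        if (scsi_cap & pr_cap_map[i].scsi) {
            res |= pr_cap_map[i].blk;
        }
    }
    return res;
}

static uint16_t pr_cap_blk_to_scsi(uint8_t blk_cap)
{
    uint16_t res = 0;

    for (size_t i = 0; i < PR_CAP_MAP_LEN; i++) {
        if (blk_cap & pr_cap_map[i].blk) {
            res |= pr_cap_map[i].scsi;
        }
    }
    return res;
}
```

As in the patch, the PTPL bit is not carried by the type mask, so it is
dropped in the block-to-SCSI direction and has to be handled separately.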

[PATCH v3 11/11] block/iscsi: add persistent reservation in/out driver

2024-05-17 Thread Changqi Lu

Add persistent reservation in/out operations for iscsi driver.
The following methods are implemented: bdrv_co_pr_read_keys,
bdrv_co_pr_read_reservation, bdrv_co_pr_register, bdrv_co_pr_reserve,
bdrv_co_pr_release, bdrv_co_pr_clear and bdrv_co_pr_preempt.

Signed-off-by: Changqi Lu 
Signed-off-by: zhenwei pi 
---
 block/iscsi.c | 443 ++
 1 file changed, 443 insertions(+)

diff --git a/block/iscsi.c b/block/iscsi.c
index 2ff14b7472..d94ebe35bd 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -96,6 +96,7 @@ typedef struct IscsiLun {
 unsigned long *allocmap_valid;
 long allocmap_size;
 int cluster_size;
+uint8_t pr_cap;
 bool use_16_for_rw;
 bool write_protected;
 bool lbpme;
@@ -280,6 +281,8 @@ iscsi_co_generic_cb(struct iscsi_context *iscsi, int status,
 iTask->err_code = -error;
 iTask->err_str = g_strdup(iscsi_get_error(iscsi));
 }
+} else if (status == SCSI_STATUS_RESERVATION_CONFLICT) {
+iTask->err_code = -EBADE;
 }
 }
 }
@@ -1792,6 +1795,52 @@ static void iscsi_save_designator(IscsiLun *lun,
 }
 }
 
+static void iscsi_get_pr_cap_sync(IscsiLun *iscsilun, Error **errp)
+{
+struct scsi_task *task = NULL;
+struct scsi_persistent_reserve_in_report_capabilities *rc = NULL;
+int retries = ISCSI_CMD_RETRIES;
+int xferlen = sizeof(struct scsi_persistent_reserve_in_report_capabilities);
+
+do {
+if (task != NULL) {
+scsi_free_scsi_task(task);
+task = NULL;
+}
+
+task = iscsi_persistent_reserve_in_sync(iscsilun->iscsi,
+   iscsilun->lun, SCSI_PR_IN_REPORT_CAPABILITIES, xferlen);
+if (task != NULL && task->status == SCSI_STATUS_GOOD) {
+rc = scsi_datain_unmarshall(task);
+if (rc == NULL) {
+error_setg(errp,
+"iSCSI: Failed to unmarshall report capabilities data.");
+} else {
+iscsilun->pr_cap =
+scsi_pr_cap_to_block(rc->persistent_reservation_type_mask);
+iscsilun->pr_cap |= (rc->ptpl_a) ? BLK_PR_CAP_PTPL : 0;
+}
+break;
+}
+
+if (task != NULL && task->status == SCSI_STATUS_CHECK_CONDITION
+&& task->sense.key == SCSI_SENSE_UNIT_ATTENTION) {
+break;
+}
+
+} while (task != NULL && task->status == SCSI_STATUS_CHECK_CONDITION
+ && task->sense.key == SCSI_SENSE_UNIT_ATTENTION
+ && retries-- > 0);
+
+if (task == NULL || task->status != SCSI_STATUS_GOOD) {
+error_setg(errp, "iSCSI: failed to send report capabilities command");
+}
+
+if (task) {
+scsi_free_scsi_task(task);
+}
+}
+
 static int iscsi_open(BlockDriverState *bs, QDict *options, int flags,
   Error **errp)
 {
@@ -2024,6 +2073,11 @@ static int iscsi_open(BlockDriverState *bs, QDict *options, int flags,
 bs->supported_zero_flags = BDRV_REQ_MAY_UNMAP;
 }
 
+iscsi_get_pr_cap_sync(iscsilun, &local_err);
+if (local_err != NULL) {
+error_propagate(errp, local_err);
+ret = -EINVAL;
+}
 out:
 qemu_opts_del(opts);
 g_free(initiator_name);
@@ -2110,6 +2164,8 @@ static void iscsi_refresh_limits(BlockDriverState *bs, Error **errp)
 bs->bl.opt_transfer = pow2floor(iscsilun->bl.opt_xfer_len *
 iscsilun->block_size);
 }
+
+bs->bl.pr_cap = iscsilun->pr_cap;
 }
 
 /* Note that this will not re-establish a connection with an iSCSI target - it
@@ -2408,6 +2464,385 @@ out_unlock:
 return r;
 }
 
+static int coroutine_fn
+iscsi_co_pr_read_keys(BlockDriverState *bs, uint32_t *generation,
+  uint32_t num_keys, uint64_t *keys)
+{
+IscsiLun *iscsilun = bs->opaque;
+QEMUIOVector qiov;
+struct IscsiTask iTask;
+int xferlen = sizeof(struct scsi_persistent_reserve_in_read_keys) +
+  sizeof(uint64_t) * num_keys;
+uint8_t *buf = g_malloc0(xferlen);
+int32_t num_collect_keys = 0;
+int r = 0;
+
+qemu_iovec_init_buf(&qiov, buf, xferlen);
+iscsi_co_init_iscsitask(iscsilun, &iTask);
+qemu_mutex_lock(&iscsilun->mutex);
+retry:
+iTask.task = iscsi_persistent_reserve_in_task(iscsilun->iscsi,
+ iscsilun->lun, SCSI_PR_IN_READ_KEYS, xferlen,
+ iscsi_co_generic_cb, &iTask);
+
+if (iTask.task == NULL) {
+qemu_mutex_unlock(&iscsilun->mutex);
+return -ENOMEM;
+}
+
+scsi_task_set_iov_in(iTask.task, (struct scsi_iovec *)qiov.iov, qiov.niov);
+iscsi_co_wait_for_task(&iTask, iscsilun);
+
+if (iTask.task != NULL) {
+scsi_free_scsi_task(iTask.task);
+iTask.task = NULL;
+}
+
+if (iTask.do_retry) {
+iTask.complete = 0;
+goto retry;
+}
+
+if (iTask.status != SCSI_STATUS_GOOD) {
+

[PATCH v3 09/11] hw/nvme: enable namespace rescap function

2024-05-17 Thread Changqi Lu

This commit enables the rescap function in the
namespace by detecting the supported reservation
function in the backend driver.

Signed-off-by: Changqi Lu 
Signed-off-by: zhenwei pi 
---
 hw/nvme/ns.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/hw/nvme/ns.c b/hw/nvme/ns.c
index ea8db175db..bb09117f4b 100644
--- a/hw/nvme/ns.c
+++ b/hw/nvme/ns.c
@@ -20,6 +20,7 @@
 #include "qemu/bitops.h"
 #include "sysemu/sysemu.h"
 #include "sysemu/block-backend.h"
+#include "block/block_int.h"
 
 #include "nvme.h"
 #include "trace.h"
@@ -55,6 +56,13 @@ void nvme_ns_init_format(NvmeNamespace *ns)
 }
 
 id_ns->npda = id_ns->npdg = npdg - 1;
+
+/*
+ * The persistent reservation capabilities of block
+ * and nvme are currently defined the same.
+ * If there are subsequent changes, this part needs to be changed.
+ */
+id_ns->rescap = blk_bs(ns->blkconf.blk)->file->bs->bl.pr_cap;
 }
 
 static int nvme_ns_init(NvmeNamespace *ns, Error **errp)
-- 
2.20.1

[PATCH v3 10/11] hw/nvme: add reservation protocol command

2024-05-17 Thread Changqi Lu

Add reservation acquire, reservation register,
reservation release and reservation report commands
in the nvme device layer.

By introducing these commands, this enables the nvme
device to perform reservation-related tasks, including
querying keys, querying reservation status, registering
reservation keys, initiating and releasing reservations,
as well as clearing and preempting reservations held by
other keys.

These commands are crucial for management and control of
shared storage resources in a persistent manner.

Signed-off-by: Changqi Lu 
Signed-off-by: zhenwei pi 
---
 hw/nvme/ctrl.c   | 321 ++-
 hw/nvme/nvme.h   |   4 +
 include/block/nvme.h |  38 +
 3 files changed, 362 insertions(+), 1 deletion(-)

diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index 182307a48b..ac2fbd22ec 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -294,6 +294,10 @@ static const uint32_t nvme_cse_iocs_nvm[256] = {
 [NVME_CMD_COMPARE]  = NVME_CMD_EFF_CSUPP,
 [NVME_CMD_IO_MGMT_RECV] = NVME_CMD_EFF_CSUPP,
 [NVME_CMD_IO_MGMT_SEND] = NVME_CMD_EFF_CSUPP | NVME_CMD_EFF_LBCC,
+[NVME_CMD_RESV_REGISTER]= NVME_CMD_EFF_CSUPP,
+[NVME_CMD_RESV_REPORT]  = NVME_CMD_EFF_CSUPP,
+[NVME_CMD_RESV_ACQUIRE] = NVME_CMD_EFF_CSUPP,
+[NVME_CMD_RESV_RELEASE] = NVME_CMD_EFF_CSUPP,
 };
 
 static const uint32_t nvme_cse_iocs_zoned[256] = {
@@ -308,6 +312,10 @@ static const uint32_t nvme_cse_iocs_zoned[256] = {
 [NVME_CMD_ZONE_APPEND]  = NVME_CMD_EFF_CSUPP | NVME_CMD_EFF_LBCC,
 [NVME_CMD_ZONE_MGMT_SEND]   = NVME_CMD_EFF_CSUPP | NVME_CMD_EFF_LBCC,
 [NVME_CMD_ZONE_MGMT_RECV]   = NVME_CMD_EFF_CSUPP,
+[NVME_CMD_RESV_REGISTER]= NVME_CMD_EFF_CSUPP,
+[NVME_CMD_RESV_REPORT]  = NVME_CMD_EFF_CSUPP,
+[NVME_CMD_RESV_ACQUIRE] = NVME_CMD_EFF_CSUPP,
+[NVME_CMD_RESV_RELEASE] = NVME_CMD_EFF_CSUPP,
 };
 
 static void nvme_process_sq(void *opaque);
@@ -1745,6 +1753,7 @@ static void nvme_aio_err(NvmeRequest *req, int ret)
 
 switch (req->cmd.opcode) {
 case NVME_CMD_READ:
+case NVME_CMD_RESV_REPORT:
 status = NVME_UNRECOVERED_READ;
 break;
 case NVME_CMD_FLUSH:
@@ -1752,6 +1761,9 @@ static void nvme_aio_err(NvmeRequest *req, int ret)
 case NVME_CMD_WRITE_ZEROES:
 case NVME_CMD_ZONE_APPEND:
 case NVME_CMD_COPY:
+case NVME_CMD_RESV_REGISTER:
+case NVME_CMD_RESV_ACQUIRE:
+case NVME_CMD_RESV_RELEASE:
 status = NVME_WRITE_FAULT;
 break;
 default:
@@ -2127,7 +2139,10 @@ static inline bool nvme_is_write(NvmeRequest *req)
 
 return rw->opcode == NVME_CMD_WRITE ||
rw->opcode == NVME_CMD_ZONE_APPEND ||
-   rw->opcode == NVME_CMD_WRITE_ZEROES;
+   rw->opcode == NVME_CMD_WRITE_ZEROES ||
+   rw->opcode == NVME_CMD_RESV_REGISTER ||
+   rw->opcode == NVME_CMD_RESV_ACQUIRE ||
+   rw->opcode == NVME_CMD_RESV_RELEASE;
 }
 
 static void nvme_misc_cb(void *opaque, int ret)
@@ -2692,6 +2707,302 @@ static uint16_t nvme_verify(NvmeCtrl *n, NvmeRequest *req)
 return NVME_NO_COMPLETE;
 }
 
+typedef struct NvmeKeyInfo {
+uint64_t cr_key;
+uint64_t nr_key;
+} NvmeKeyInfo;
+
+static uint16_t nvme_resv_register(NvmeCtrl *n, NvmeRequest *req)
+{
+int ret;
+NvmeKeyInfo key_info;
+NvmeNamespace *ns = req->ns;
+uint32_t cdw10 = le32_to_cpu(req->cmd.cdw10);
+bool ignore_key = cdw10 >> 3 & 0x1;
+uint8_t action = cdw10 & 0x7;
+uint8_t ptpl = cdw10 >> 30 & 0x3;
+bool aptpl;
+
+switch (ptpl) {
+case NVME_RESV_PTPL_NO_CHANGE:
+aptpl = (ns->id_ns.rescap & NVME_PR_CAP_PTPL) ? true : false;
+break;
+case NVME_RESV_PTPL_DISABLE:
+aptpl = false;
+break;
+case NVME_RESV_PTPL_ENABLE:
+aptpl = true;
+break;
+default:
+return NVME_INVALID_FIELD;
+}
+
+ret = nvme_h2c(n, (uint8_t *)&key_info, sizeof(NvmeKeyInfo), req);
+if (ret) {
+return ret;
+}
+
+switch (action) {
+case NVME_RESV_REGISTER_ACTION_REGISTER:
+req->aiocb = blk_aio_pr_register(ns->blkconf.blk, 0,
+ key_info.nr_key, 0, aptpl,
+ ignore_key, nvme_misc_cb,
+ req);
+break;
+case NVME_RESV_REGISTER_ACTION_UNREGISTER:
+req->aiocb = blk_aio_pr_register(ns->blkconf.blk, key_info.cr_key, 0,
+ 0, aptpl, ignore_key,
+ nvme_misc_cb, req);
+break;
+case NVME_RESV_REGISTER_ACTION_REPLACE:
+req->aiocb = blk_aio_pr_register(ns->blkconf.blk, key_info.cr_key,
+ key_info.nr_key, 0, aptpl, ignore_key,
+ nvme_misc_cb, req);
+break;
+default:
+return
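
Not part of the patch (the archived diff is cut off above): a standalone
sketch of the Command Dword 10 decoding that nvme_resv_register() starts
with. The bit positions are taken from the diff: action in bits 2:0,
ignore-key in bit 3, PTPL state in bits 31:30. I believe these correspond
to the RREGA, IEKEY and CPTPL fields of the NVMe specification, but treat
that naming as an assumption. The struct and function names are
hypothetical, and the encoder exists only to show the round trip.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the three CDW10 fields used by the patch. */
typedef struct {
    uint8_t action;     /* bits 2:0, see NVME_RESV_REGISTER_ACTION */
    bool    ignore_key; /* bit 3 */
    uint8_t ptpl;       /* bits 31:30, see NVMEResvPTPL */
} ResvRegisterFields;

/* cdw10 is expected in host byte order, i.e. after le32_to_cpu(). */
static ResvRegisterFields resv_register_decode(uint32_t cdw10)
{
    ResvRegisterFields f = {
        .action     = cdw10 & 0x7,
        .ignore_key = (cdw10 >> 3) & 0x1,
        .ptpl       = (cdw10 >> 30) & 0x3,
    };
    return f;
}

/* Inverse of the decoder, for building test inputs. */
static uint32_t resv_register_encode(ResvRegisterFields f)
{
    assert(f.action <= 0x7 && f.ptpl <= 0x3);
    return (uint32_t)f.action |
           ((uint32_t)f.ignore_key << 3) |
           ((uint32_t)f.ptpl << 30);
}
```

The patch writes "cdw10 >> 3 & 0x1" without parentheses. Since >> binds
tighter than & in C, that parses the same way as the parenthesised form
used here.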

[PATCH v3 05/11] hw/scsi: add persistent reservation in/out api for scsi device

2024-05-17 Thread Changqi Lu

Add persistent reservation in/out operations in the
SCSI device layer. By introducing the persistent
reservation in/out api, this enables the SCSI device
to perform reservation-related tasks, including querying
keys, querying reservation status, registering reservation
keys, initiating and releasing reservations, as well as
clearing and preempting reservations held by other keys.

These operations are crucial for management and control of
shared storage resources in a persistent manner.

Signed-off-by: Changqi Lu 
Signed-off-by: zhenwei pi 
---
 hw/scsi/scsi-disk.c | 352 
 1 file changed, 352 insertions(+)

diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
index 4bd7af9d0c..0e964dbd87 100644
--- a/hw/scsi/scsi-disk.c
+++ b/hw/scsi/scsi-disk.c
@@ -32,6 +32,7 @@
 #include "migration/vmstate.h"
 #include "hw/scsi/emulation.h"
 #include "scsi/constants.h"
+#include "scsi/utils.h"
 #include "sysemu/block-backend.h"
 #include "sysemu/blockdev.h"
 #include "hw/block/block.h"
@@ -42,6 +43,7 @@
 #include "qemu/cutils.h"
 #include "trace.h"
 #include "qom/object.h"
+#include "block/block_int.h"
 
 #ifdef __linux
 #include <scsi/sg.h>
@@ -1474,6 +1476,346 @@ static void scsi_disk_emulate_read_data(SCSIRequest *req)
 scsi_req_complete(&r->req, GOOD);
 }
 
+typedef struct SCSIPrReadKeys {
+uint32_t generation;
+uint32_t num_keys;
+uint64_t *keys;
+void *req;
+} SCSIPrReadKeys;
+
+typedef struct SCSIPrReadReservation {
+uint32_t generation;
+uint64_t key;
+BlockPrType type;
+void *req;
+} SCSIPrReadReservation;
+
+static void scsi_pr_read_keys_complete(void *opaque, int ret)
+{
+int num_keys;
+uint8_t *buf;
+SCSIPrReadKeys *blk_keys = (SCSIPrReadKeys *)opaque;
+SCSIDiskReq *r = (SCSIDiskReq *)blk_keys->req;
+SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, r->req.dev);
+
+assert(blk_get_aio_context(s->qdev.conf.blk) ==
+qemu_get_current_aio_context());
+
+assert(r->req.aiocb != NULL);
+r->req.aiocb = NULL;
+
+if (scsi_disk_req_check_error(r, ret, true)) {
+goto done;
+}
+
+buf = scsi_req_get_buf(&r->req);
+num_keys = MIN(blk_keys->num_keys, ret);
+blk_keys->generation = cpu_to_be32(blk_keys->generation);
+memcpy(&buf[0], &blk_keys->generation, 4);
+for (int i = 0; i < num_keys; i++) {
+blk_keys->keys[i] = cpu_to_be64(blk_keys->keys[i]);
+memcpy(&buf[8 + i * 8], &blk_keys->keys[i], 8);
+}
+num_keys = cpu_to_be32(num_keys * 8);
+memcpy(&buf[4], &num_keys, 4);
+
+scsi_req_data(&r->req, r->buflen);
+done:
+scsi_req_unref(&r->req);
+g_free(blk_keys->keys);
+g_free(blk_keys);
+}
+
+static int scsi_disk_emulate_pr_read_keys(SCSIRequest *req)
+{
+SCSIPrReadKeys *blk_keys;
+SCSIDiskReq *r = DO_UPCAST(SCSIDiskReq, req, req);
+SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, req->dev);
+int buflen = MIN(r->req.cmd.xfer, r->buflen);
+int num_keys = (buflen - sizeof(uint32_t) * 2) / sizeof(uint64_t);
+
+blk_keys = g_new0(SCSIPrReadKeys, 1);
+blk_keys->generation = 0;
+/* num_keys is the maximum number of keys that can be transmitted */
+blk_keys->num_keys = num_keys;
+blk_keys->keys = g_malloc(sizeof(uint64_t) * num_keys);
+blk_keys->req = r;
+
+/* The request is used as the AIO opaque value, so add a ref.  */
+scsi_req_ref(&r->req);
+r->req.aiocb = blk_aio_pr_read_keys(s->qdev.conf.blk, &blk_keys->generation,
+blk_keys->num_keys, blk_keys->keys,
+scsi_pr_read_keys_complete, blk_keys);
+return 0;
+}
+
+static void scsi_pr_read_reservation_complete(void *opaque, int ret)
+{
+uint8_t *buf;
+uint32_t additional_len = 0;
+SCSIPrReadReservation *blk_rsv = (SCSIPrReadReservation *)opaque;
+SCSIDiskReq *r = (SCSIDiskReq *)blk_rsv->req;
+SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, r->req.dev);
+
+assert(blk_get_aio_context(s->qdev.conf.blk) ==
+qemu_get_current_aio_context());
+
+assert(r->req.aiocb != NULL);
+r->req.aiocb = NULL;
+
+if (scsi_disk_req_check_error(r, ret, true)) {
+goto done;
+}
+
+buf = scsi_req_get_buf(&r->req);
+blk_rsv->generation = cpu_to_be32(blk_rsv->generation);
+memcpy(&buf[0], &blk_rsv->generation, 4);
+if (ret) {
+additional_len = cpu_to_be32(16);
+blk_rsv->key = cpu_to_be64(blk_rsv->key);
+memcpy(&buf[8], &blk_rsv->key, 8);
+buf[21] = block_pr_type_to_scsi(blk_rsv->type) & 0xf;
+} else {
+additional_len = cpu_to_be32(0);
+}
+
+memcpy(&buf[4], &additional_len, 4);
+scsi_req_data(&r->req, r->buflen);
+
+done:
+scsi_req_unref(&r->req);
+g_free(blk_rsv);
+}
+
+static int scsi_disk_emulate_pr_read_reservation(SCSIRequest *req)
+{
+SCSIPrReadReservation *blk_rsv;
+SCSIDiskReq *r = DO_UPCAST(SCSIDiskReq, req, req);
+SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, req->dev);
+
+blk_rsv =
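
Not part of the patch (the archived diff is cut off above): a standalone
sketch of the READ KEYS parameter data layout that
scsi_pr_read_keys_complete() fills in. Bytes 0-3 hold the generation,
bytes 4-7 the additional length (number of key bytes), and the 8-byte keys
follow from offset 8, all big-endian. The function names are hypothetical,
and the byte stores are written with shifts rather than the
cpu_to_be32()/memcpy() pair used in the patch. Like the patch, this sketch
sets the additional length to the number of key bytes actually written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static void put_be32(uint8_t *p, uint32_t v)
{
    p[0] = v >> 24;
    p[1] = v >> 16;
    p[2] = v >> 8;
    p[3] = v;
}

static void put_be64(uint8_t *p, uint64_t v)
{
    put_be32(p, (uint32_t)(v >> 32));
    put_be32(p + 4, (uint32_t)v);
}

/*
 * Hypothetical helper: serialise generation and keys into buf.
 * Returns the number of bytes written, or 0 if the 8-byte header
 * does not fit. Keys that do not fit in buflen are left out.
 */
static size_t pr_read_keys_fill(uint8_t *buf, size_t buflen,
                                uint32_t generation,
                                const uint64_t *keys, uint32_t num_keys)
{
    size_t room;
    uint32_t n;

    assert(buf != NULL);
    if (buflen < 8) {
        return 0;
    }

    room = (buflen - 8) / 8;
    n = num_keys < room ? num_keys : (uint32_t)room;

    put_be32(buf, generation);      /* bytes 0-3 */
    put_be32(buf + 4, n * 8);       /* bytes 4-7: additional length */
    for (uint32_t i = 0; i < n; i++) {
        put_be64(buf + 8 + (size_t)i * 8, keys[i]);
    }

    return 8 + (size_t)n * 8;
}
```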

[PATCH v3 03/11] scsi/constant: add persistent reservation in/out protocol constants

2024-05-17 Thread Changqi Lu

Add constants for the persistent reservation in/out protocol
in the scsi/constant module. The constants include the persistent
reservation command, type, and scope values defined in sections
6.13 and 6.14 of the SCSI Primary Commands-4 (SPC-4) specification.

Signed-off-by: Changqi Lu 
Signed-off-by: zhenwei pi 
---
 include/scsi/constants.h | 52 
 1 file changed, 52 insertions(+)

diff --git a/include/scsi/constants.h b/include/scsi/constants.h
index 9b98451912..922a314535 100644
--- a/include/scsi/constants.h
+++ b/include/scsi/constants.h
@@ -319,4 +319,56 @@
 #define IDENT_DESCR_TGT_DESCR_SIZE 32
 #define XCOPY_BLK2BLK_SEG_DESC_SIZE 28
 
+typedef enum {
+SCSI_PR_WRITE_EXCLUSIVE = 0x01,
+SCSI_PR_EXCLUSIVE_ACCESS= 0x03,
+SCSI_PR_WRITE_EXCLUSIVE_REGS_ONLY   = 0x05,
+SCSI_PR_EXCLUSIVE_ACCESS_REGS_ONLY  = 0x06,
+SCSI_PR_WRITE_EXCLUSIVE_ALL_REGS= 0x07,
+SCSI_PR_EXCLUSIVE_ACCESS_ALL_REGS   = 0x08,
+} SCSIPrType;
+
+typedef enum {
+SCSI_PR_LU_SCOPE  = 0x00,
+} SCSIPrScope;
+
+typedef enum {
+SCSI_PR_OUT_REGISTER = 0x0,
+SCSI_PR_OUT_RESERVE  = 0x1,
+SCSI_PR_OUT_RELEASE  = 0x2,
+SCSI_PR_OUT_CLEAR= 0x3,
+SCSI_PR_OUT_PREEMPT  = 0x4,
+SCSI_PR_OUT_PREEMPT_AND_ABORT= 0x5,
+SCSI_PR_OUT_REG_AND_IGNORE_KEY   = 0x6,
+SCSI_PR_OUT_REG_AND_MOVE = 0x7,
+} SCSIPrOutAction;
+
+typedef enum {
+SCSI_PR_IN_READ_KEYS = 0x0,
+SCSI_PR_IN_READ_RESERVATION  = 0x1,
+SCSI_PR_IN_REPORT_CAPABILITIES   = 0x2,
+} SCSIPrInAction;
+
+typedef enum {
+/* Exclusive Access All Registrants reservation type */
+SCSI_PR_CAP_EX_AC_AR = 1 << 0,
+/* Write Exclusive reservation type */
+SCSI_PR_CAP_WR_EX = 1 << 9,
+/* Exclusive Access reservation type */
+SCSI_PR_CAP_EX_AC = 1 << 11,
+/* Write Exclusive Registrants Only reservation type */
+SCSI_PR_CAP_WR_EX_RO = 1 << 13,
+/* Exclusive Access Registrants Only reservation type */
+SCSI_PR_CAP_EX_AC_RO = 1 << 14,
+/* Write Exclusive All Registrants reservation type */
+SCSI_PR_CAP_WR_EX_AR = 1 << 15,
+
+SCSI_PR_CAP_ALL = (SCSI_PR_CAP_EX_AC_AR |
+  SCSI_PR_CAP_WR_EX |
+  SCSI_PR_CAP_EX_AC |
+  SCSI_PR_CAP_WR_EX_RO |
+  SCSI_PR_CAP_EX_AC_RO |
+  SCSI_PR_CAP_WR_EX_AR),
+} SCSIPrCap;
+
 #endif
-- 
2.20.1

[PATCH v3 01/11] block: add persistent reservation in/out api

2024-05-17 Thread Changqi Lu

Add persistent reservation in/out operations
at the block level. The following operations
are included:

- read_keys:        retrieves the list of registered keys.
- read_reservation: retrieves the current reservation status.
- register:         registers a new reservation key.
- reserve:          initiates a reservation for a specific key.
- release:          releases a reservation for a specific key.
- clear:            clears all existing reservations.
- preempt:          preempts a reservation held by another key.

Signed-off-by: Changqi Lu 
Signed-off-by: zhenwei pi 
---
 block/block-backend.c | 397 ++
 block/io.c| 163 
 include/block/block-common.h  |  40 +++
 include/block/block-io.h  |  20 ++
 include/block/block_int-common.h  |  84 +++
 include/sysemu/block-backend-io.h |  24 ++
 6 files changed, 728 insertions(+)

diff --git a/block/block-backend.c b/block/block-backend.c
index db6f9b92a3..6707d94df7 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1770,6 +1770,403 @@ BlockAIOCB *blk_aio_ioctl(BlockBackend *blk, unsigned long int req, void *buf,
 return blk_aio_prwv(blk, req, 0, buf, blk_aio_ioctl_entry, 0, cb, opaque);
 }
 
+typedef struct BlkPrInCo {
+BlockBackend *blk;
+uint32_t *generation;
+uint32_t num_keys;
+BlockPrType *type;
+uint64_t *keys;
+int ret;
+} BlkPrInCo;
+
+typedef struct BlkPrInCB {
+BlockAIOCB common;
+BlkPrInCo prco;
+bool has_returned;
+} BlkPrInCB;
+
+static const AIOCBInfo blk_pr_in_aiocb_info = {
+.aiocb_size = sizeof(BlkPrInCB),
+};
+
+static void blk_pr_in_complete(BlkPrInCB *acb)
+{
+if (acb->has_returned) {
+acb->common.cb(acb->common.opaque, acb->prco.ret);
+blk_dec_in_flight(acb->prco.blk);
+qemu_aio_unref(acb);
+}
+}
+
+static void blk_pr_in_complete_bh(void *opaque)
+{
+BlkPrInCB *acb = opaque;
+assert(acb->has_returned);
+blk_pr_in_complete(acb);
+}
+
+static BlockAIOCB *blk_aio_pr_in(BlockBackend *blk, uint32_t *generation,
+ uint32_t num_keys, BlockPrType *type,
+ uint64_t *keys, CoroutineEntry co_entry,
+ BlockCompletionFunc *cb, void *opaque)
+{
+BlkPrInCB *acb;
+Coroutine *co;
+
+blk_inc_in_flight(blk);
+acb = blk_aio_get(&blk_pr_in_aiocb_info, blk, cb, opaque);
+acb->prco = (BlkPrInCo) {
+.blk= blk,
+.generation = generation,
+.num_keys   = num_keys,
+.type   = type,
+.ret= NOT_DONE,
+.keys   = keys,
+};
+acb->has_returned = false;
+
+co = qemu_coroutine_create(co_entry, acb);
+aio_co_enter(qemu_get_current_aio_context(), co);
+
+acb->has_returned = true;
+if (acb->prco.ret != NOT_DONE) {
+replay_bh_schedule_oneshot_event(qemu_get_current_aio_context(),
+ blk_pr_in_complete_bh, acb);
+}
+
+return &acb->common;
+}
+
+/* To be called between exactly one pair of blk_inc/dec_in_flight() */
+static int coroutine_fn
+blk_aio_pr_do_read_keys(BlockBackend *blk, uint32_t *generation,
+uint32_t num_keys, uint64_t *keys)
+{
+IO_CODE();
+
+blk_wait_while_drained(blk);
+GRAPH_RDLOCK_GUARD();
+
+if (!blk_co_is_available(blk)) {
+return -ENOMEDIUM;
+}
+
+return bdrv_co_pr_read_keys(blk_bs(blk), generation, num_keys, keys);
+}
+
+static void coroutine_fn blk_aio_pr_read_keys_entry(void *opaque)
+{
+BlkPrInCB *acb = opaque;
+BlkPrInCo *prco = &acb->prco;
+
+prco->ret = blk_aio_pr_do_read_keys(prco->blk, prco->generation,
+prco->num_keys, prco->keys);
+blk_pr_in_complete(acb);
+}
+
+BlockAIOCB *blk_aio_pr_read_keys(BlockBackend *blk, uint32_t *generation,
+ uint32_t num_keys, uint64_t *keys,
+ BlockCompletionFunc *cb, void *opaque)
+{
+IO_CODE();
+return blk_aio_pr_in(blk, generation, num_keys, NULL, keys,
+ blk_aio_pr_read_keys_entry, cb, opaque);
+}
+
+/* To be called between exactly one pair of blk_inc/dec_in_flight() */
+static int coroutine_fn
+blk_aio_pr_do_read_reservation(BlockBackend *blk, uint32_t *generation,
+   uint64_t *key, BlockPrType *type)
+{
+IO_CODE();
+
+blk_wait_while_drained(blk);
+GRAPH_RDLOCK_GUARD();
+
+if (!blk_co_is_available(blk)) {
+return -ENOMEDIUM;
+}
+
+return bdrv_co_pr_read_reservation(blk_bs(blk), generation, key, type);
+}
+
+static void coroutine_fn blk_aio_pr_read_reservation_entry(void *opaque)
+{
+BlkPrInCB *acb = opaque;
+BlkPrInCo *prco = &acb->prco;
+
+prco->ret = blk_aio_pr_do_read_reservation(prco->blk, prco->generation,
+   prco->keys, prco->type);
+

[PATCH v3 00/11] Support persistent reservation operations

2024-05-17 Thread Changqi Lu

Hi,

Please ignore the v2 series. Please review the v3 series instead.
Thanks!

v2->v3:
In v2, Persist Through Power Loss (PTPL) is enabled by default.
In v3, PTPL is supported and is passed as a parameter.

v1->v2:
- Add sg_persist --report-capabilities for SCSI protocol and enable
  oncs and rescap for NVMe protocol.
- Add persistent reservation capabilities constants and helper functions for
  SCSI and NVMe protocol.
- Add comments for necessary APIs.

v1:
- Add seven persistent reservation APIs to the block layer: reading
  keys, reading reservations, registering, reserving, releasing,
  clearing and preempting.
- Add the necessary pr-related operation APIs for both the
  SCSI protocol and NVMe protocol at the device layer.
- Add scsi driver at the driver layer to verify the functions.

Changqi Lu (11):
  block: add persistent reservation in/out api
  block/raw: add persistent reservation in/out driver
  scsi/constant: add persistent reservation in/out protocol constants
  scsi/util: add helper functions for persistent reservation types
conversion
  hw/scsi: add persistent reservation in/out api for scsi device
  block/nvme: add reservation command protocol constants
  hw/nvme: add helper functions for converting reservation types
  hw/nvme: enable ONCS reservations
  hw/nvme: enable namespace rescap function
  hw/nvme: add reservation protocol command
  block/iscsi: add persistent reservation in/out driver

 block/block-backend.c             | 397 ++
 block/io.c                        | 163 +++
 block/iscsi.c                     | 443 ++
 block/raw-format.c                |  56 
 hw/nvme/ctrl.c                    | 324 +-
 hw/nvme/ns.c                      |   8 +
 hw/nvme/nvme.h                    |  44 +++
 hw/scsi/scsi-disk.c               | 352 
 include/block/block-common.h      |  40 +++
 include/block/block-io.h          |  20 ++
 include/block/block_int-common.h  |  84 ++
 include/block/nvme.h              |  99 +++
 include/scsi/constants.h          |  52 
 include/scsi/utils.h              |   8 +
 include/sysemu/block-backend-io.h |  24 ++
 scsi/utils.c                      |  81 ++
 16 files changed, 2193 insertions(+), 2 deletions(-)

-- 
2.20.1

[PATCH v3 02/11] block/raw: add persistent reservation in/out driver

2024-05-17 Thread Changqi Lu

Add persistent reservation in/out operations for raw driver.
The following methods are implemented: bdrv_co_pr_read_keys,
bdrv_co_pr_read_reservation, bdrv_co_pr_register, bdrv_co_pr_reserve,
bdrv_co_pr_release, bdrv_co_pr_clear and bdrv_co_pr_preempt.
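
The forwarding pattern described above can be sketched outside QEMU as
follows. This is only an illustration under stated assumptions: Node,
DriverOps, node_pr_clear() and the two example drivers are hypothetical
stand-ins, not QEMU's BlockDriverState or BlockDriver API, and returning
-ENOTSUP for a missing callback is likewise an assumption.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

typedef struct Node Node;

/* Hypothetical per-driver callback table (stand-in for BlockDriver). */
typedef struct DriverOps {
    int (*pr_clear)(Node *n, uint64_t key);
} DriverOps;

struct Node {
    const DriverOps *ops;
    Node *child;             /* stand-in for bs->file->bs */
    uint64_t registered_key; /* toy reservation state, leaf nodes only */
};

/* Generic entry point: dispatch to the driver, or report "unsupported". */
static int node_pr_clear(Node *n, uint64_t key)
{
    if (!n || !n->ops || !n->ops->pr_clear) {
        return -ENOTSUP;
    }
    return n->ops->pr_clear(n, key);
}

/* Format-level driver: no logic of its own, it forwards to its child. */
static int passthrough_pr_clear(Node *n, uint64_t key)
{
    return node_pr_clear(n->child, key);
}

/* Protocol-level driver: the only place that touches reservation state. */
static int leaf_pr_clear(Node *n, uint64_t key)
{
    if (n->registered_key != key) {
        return -EACCES;
    }
    n->registered_key = 0;
    return 0;
}

static const DriverOps passthrough_ops = { .pr_clear = passthrough_pr_clear };
static const DriverOps leaf_ops = { .pr_clear = leaf_pr_clear };
```

A pass-through node stacked on a leaf node behaves like the leaf for this
operation, which is the property the raw driver relies on.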

Signed-off-by: Changqi Lu 
Signed-off-by: zhenwei pi 
---
 block/raw-format.c | 56 ++
 1 file changed, 56 insertions(+)

diff --git a/block/raw-format.c b/block/raw-format.c
index ac7e8495f6..3746bc1bd3 100644
--- a/block/raw-format.c
+++ b/block/raw-format.c
@@ -454,6 +454,55 @@ raw_co_ioctl(BlockDriverState *bs, unsigned long int req, void *buf)
     return bdrv_co_ioctl(bs->file->bs, req, buf);
 }
 
+static int coroutine_fn GRAPH_RDLOCK
+raw_co_pr_read_keys(BlockDriverState *bs, uint32_t *generation,
+                    uint32_t num_keys, uint64_t *keys)
+{
+
+    return bdrv_co_pr_read_keys(bs->file->bs, generation, num_keys, keys);
+}
+
+static int coroutine_fn GRAPH_RDLOCK
+raw_co_pr_read_reservation(BlockDriverState *bs, uint32_t *generation,
+                           uint64_t *key, BlockPrType *type)
+{
+    return bdrv_co_pr_read_reservation(bs->file->bs, generation, key, type);
+}
+
+static int coroutine_fn GRAPH_RDLOCK
+raw_co_pr_register(BlockDriverState *bs, uint64_t old_key,
+                   uint64_t new_key, BlockPrType type,
+                   bool ptpl, bool ignore_key)
+{
+    return bdrv_co_pr_register(bs->file->bs, old_key, new_key,
+                               type, ptpl, ignore_key);
+}
+
+static int coroutine_fn GRAPH_RDLOCK
+raw_co_pr_reserve(BlockDriverState *bs, uint64_t key, BlockPrType type)
+{
+    return bdrv_co_pr_reserve(bs->file->bs, key, type);
+}
+
+static int coroutine_fn GRAPH_RDLOCK
+raw_co_pr_release(BlockDriverState *bs, uint64_t key, BlockPrType type)
+{
+    return bdrv_co_pr_release(bs->file->bs, key, type);
+}
+
+static int coroutine_fn GRAPH_RDLOCK
+raw_co_pr_clear(BlockDriverState *bs, uint64_t key)
+{
+    return bdrv_co_pr_clear(bs->file->bs, key);
+}
+
+static int coroutine_fn GRAPH_RDLOCK
+raw_co_pr_preempt(BlockDriverState *bs, uint64_t old_key,
+                  uint64_t new_key, BlockPrType type, bool abort)
+{
+    return bdrv_co_pr_preempt(bs->file->bs, old_key, new_key, type, abort);
+}
+
 static int GRAPH_RDLOCK raw_has_zero_init(BlockDriverState *bs)
 {
     return bdrv_has_zero_init(bs->file->bs);
@@ -672,6 +721,13 @@ BlockDriver bdrv_raw = {
     .strong_runtime_opts  = raw_strong_runtime_opts,
     .mutable_opts = mutable_opts,
     .bdrv_cancel_in_flight = raw_cancel_in_flight,
+    .bdrv_co_pr_read_keys        = raw_co_pr_read_keys,
+    .bdrv_co_pr_read_reservation = raw_co_pr_read_reservation,
+    .bdrv_co_pr_register         = raw_co_pr_register,
+    .bdrv_co_pr_reserve          = raw_co_pr_reserve,
+    .bdrv_co_pr_release          = raw_co_pr_release,
+    .bdrv_co_pr_clear            = raw_co_pr_clear,
+    .bdrv_co_pr_preempt          = raw_co_pr_preempt,
 };
 
 static void bdrv_raw_init(void)
-- 
2.20.1

Re: [PATCH v2 10/15] hw/riscv/riscv-iommu: add ATS support

2024-05-17 Thread Daniel Henrique Barboza


Hi Frank,


On 5/7/24 23:57, Frank Chang wrote:

Hi Daniel,

On Fri, Mar 8, 2024 at 12:06 AM, Daniel Henrique Barboza wrote:


From: Tomasz Jeznach 

Add PCIe Address Translation Services (ATS) capabilities to the IOMMU.
This will add support for ATS translation requests in Fault/Event
queues, Page-request queue and IOATC invalidations.

Signed-off-by: Tomasz Jeznach 
Signed-off-by: Daniel Henrique Barboza 
---
  hw/riscv/riscv-iommu-bits.h |  43 ++-
  hw/riscv/riscv-iommu.c  | 107 +---
  hw/riscv/riscv-iommu.h  |   1 +
  hw/riscv/trace-events   |   3 +
  4 files changed, 145 insertions(+), 9 deletions(-)

diff --git a/hw/riscv/riscv-iommu-bits.h b/hw/riscv/riscv-iommu-bits.h
index 9d645d69ea..0994f5ce48 100644
--- a/hw/riscv/riscv-iommu-bits.h
+++ b/hw/riscv/riscv-iommu-bits.h
@@ -81,6 +81,7 @@ struct riscv_iommu_pq_record {
  #define RISCV_IOMMU_CAP_SV57X4  BIT_ULL(19)
  #define RISCV_IOMMU_CAP_MSI_FLAT BIT_ULL(22)
  #define RISCV_IOMMU_CAP_MSI_MRIF BIT_ULL(23)
+#define RISCV_IOMMU_CAP_ATS BIT_ULL(25)
  #define RISCV_IOMMU_CAP_IGS GENMASK_ULL(29, 28)
  #define RISCV_IOMMU_CAP_PAS GENMASK_ULL(37, 32)
  #define RISCV_IOMMU_CAP_PD8 BIT_ULL(38)
@@ -201,6 +202,7 @@ struct riscv_iommu_dc {

  /* Translation control fields */
  #define RISCV_IOMMU_DC_TC_V BIT_ULL(0)
+#define RISCV_IOMMU_DC_TC_EN_ATS BIT_ULL(1)
  #define RISCV_IOMMU_DC_TC_DTF   BIT_ULL(4)
  #define RISCV_IOMMU_DC_TC_PDTV  BIT_ULL(5)
  #define RISCV_IOMMU_DC_TC_PRPR  BIT_ULL(6)
@@ -259,6 +261,20 @@ struct riscv_iommu_command {
  #define RISCV_IOMMU_CMD_IODIR_DV BIT_ULL(33)
  #define RISCV_IOMMU_CMD_IODIR_DID   GENMASK_ULL(63, 40)

+/* 3.1.4 I/O MMU PCIe ATS */
+#define RISCV_IOMMU_CMD_ATS_OPCODE  4
+#define RISCV_IOMMU_CMD_ATS_FUNC_INVAL  0
+#define RISCV_IOMMU_CMD_ATS_FUNC_PRGR   1
+#define RISCV_IOMMU_CMD_ATS_PID GENMASK_ULL(31, 12)
+#define RISCV_IOMMU_CMD_ATS_PV  BIT_ULL(32)
+#define RISCV_IOMMU_CMD_ATS_DSV BIT_ULL(33)
+#define RISCV_IOMMU_CMD_ATS_RID GENMASK_ULL(55, 40)
+#define RISCV_IOMMU_CMD_ATS_DSEG GENMASK_ULL(63, 56)
+/* dword1 is the ATS payload, two different payload types for INVAL and PRGR */
+
+/* ATS.PRGR payload */
+#define RISCV_IOMMU_CMD_ATS_PRGR_RESP_CODE  GENMASK_ULL(47, 44)
+
  enum riscv_iommu_dc_fsc_atp_modes {
  RISCV_IOMMU_DC_FSC_MODE_BARE = 0,
  RISCV_IOMMU_DC_FSC_IOSATP_MODE_SV32 = 8,
@@ -322,7 +338,32 @@ enum riscv_iommu_fq_ttypes {
  RISCV_IOMMU_FQ_TTYPE_TADDR_INST_FETCH = 5,
  RISCV_IOMMU_FQ_TTYPE_TADDR_RD = 6,
  RISCV_IOMMU_FQ_TTYPE_TADDR_WR = 7,
-RISCV_IOMMU_FW_TTYPE_PCIE_MSG_REQ = 8,
+RISCV_IOMMU_FQ_TTYPE_PCIE_ATS_REQ = 8,
+RISCV_IOMMU_FW_TTYPE_PCIE_MSG_REQ = 9,
+};
+
+/* Header fields */
+#define RISCV_IOMMU_PREQ_HDR_PID GENMASK_ULL(31, 12)
+#define RISCV_IOMMU_PREQ_HDR_PV BIT_ULL(32)
+#define RISCV_IOMMU_PREQ_HDR_PRIV   BIT_ULL(33)
+#define RISCV_IOMMU_PREQ_HDR_EXEC   BIT_ULL(34)
+#define RISCV_IOMMU_PREQ_HDR_DID GENMASK_ULL(63, 40)
+
+/* Payload fields */
+#define RISCV_IOMMU_PREQ_PAYLOAD_R  BIT_ULL(0)
+#define RISCV_IOMMU_PREQ_PAYLOAD_W  BIT_ULL(1)
+#define RISCV_IOMMU_PREQ_PAYLOAD_L  BIT_ULL(2)
+#define RISCV_IOMMU_PREQ_PAYLOAD_M  GENMASK_ULL(2, 0)
+#define RISCV_IOMMU_PREQ_PRG_INDEX  GENMASK_ULL(11, 3)
+#define RISCV_IOMMU_PREQ_UADDR  GENMASK_ULL(63, 12)
+
+
+/*
+ * struct riscv_iommu_msi_pte - MSI Page Table Entry
+ */
+struct riscv_iommu_msi_pte {
+  uint64_t pte;
+  uint64_t mrif_info;
  };

  /* Fields on pte */
diff --git a/hw/riscv/riscv-iommu.c b/hw/riscv/riscv-iommu.c
index 03a610fa75..7af5929b10 100644
--- a/hw/riscv/riscv-iommu.c
+++ b/hw/riscv/riscv-iommu.c
@@ -576,7 +576,7 @@ static int riscv_iommu_ctx_fetch(RISCVIOMMUState *s, RISCVIOMMUContext *ctx)
  RISCV_IOMMU_DC_IOHGATP_MODE_BARE);
  ctx->satp = set_field(0, RISCV_IOMMU_ATP_MODE_FIELD,
  RISCV_IOMMU_DC_FSC_MODE_BARE);
-ctx->tc = RISCV_IOMMU_DC_TC_V;
+ctx->tc = RISCV_IOMMU_DC_TC_EN_ATS | RISCV_IOMMU_DC_TC_V;


We should OR RISCV_IOMMU_DC_TC_EN_ATS only when IOMMU has ATS capability.
(i.e. s->enable_ats == true).
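
A minimal sketch of that suggestion, assuming the capability is exposed as
a boolean such as s->enable_ats. The helper name default_ctx_tc() is
hypothetical and the macros are redefined locally so the fragment stands
alone; it is not the code from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the bit definitions quoted in the patch. */
#define BIT_ULL(n)               (1ULL << (n))
#define RISCV_IOMMU_DC_TC_V      BIT_ULL(0)
#define RISCV_IOMMU_DC_TC_EN_ATS BIT_ULL(1)

/*
 * Hypothetical helper: build the default translation-control value for a
 * context, setting EN_ATS only when the IOMMU advertises ATS.
 */
static uint64_t default_ctx_tc(bool enable_ats)
{
    uint64_t tc = RISCV_IOMMU_DC_TC_V;

    if (enable_ats) {
        tc |= RISCV_IOMMU_DC_TC_EN_ATS;
    }
    return tc;
}
```
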


  ctx->ta = 0;
  ctx->msiptp = 0;
  return 0;
@@ -1021,6 +1021,18 @@ static int riscv_iommu_translate(RISCVIOMMUState *s, RISCVIOMMUContext *ctx,
  enable_pri = (iotlb->perm == IOMMU_NONE) && (ctx->tc & BIT_ULL(32));
  enable_pasid = (ctx->tc & RISCV_IOMMU_DC_TC_PDTV);

+/* Check for ATS request. */
+if (iotlb->perm == IOMMU_NONE) {
+/* Check if ATS is disabled. */
+if (!(ctx->tc & RISCV_IOMMU_DC_TC_EN_ATS)) {
+enable_pri = false;
+fault = RISCV_IOMMU_FQ_CAUSE_TTYPE_BLOCKED;
+goto done;
+}
+trace_riscv_iommu_ats(s->parent_obj.id,

RE: [PATCH 00/16] VFIO: misc cleanups part2

2024-05-17 Thread Duan, Zhenzhong

Hi Cédric,

>-Original Message-
>From: Cédric Le Goater 
>Sent: Friday, May 17, 2024 12:48 AM
>To: Duan, Zhenzhong ; qemu-
>de...@nongnu.org
>Cc: alex.william...@redhat.com; eric.au...@redhat.com; Peng, Chao P
>
>Subject: Re: [PATCH 00/16] VFIO: misc cleanups part2
>
>Hello Zhenzhong,
>
>On 5/15/24 10:20, Zhenzhong Duan wrote:
>> Hi
>>
>> This is the last round of cleanup series to change functions in hw/vfio/
>> to return bool when the error is passed through errp parameter.
>>
>> The first round is at https://lists.gnu.org/archive/html/qemu-devel/2024-
>05/msg01147.html
>>
>> I see Cédric is also working on some migration stuff cleanup,
>> so I didn't touch migration.c, but all other files in hw/vfio/ are cleaned up now.
>>
>> Patch1 is a fix patch, all others are cleanup patches.
>>
>> Test done on x86 platform:
>> vfio device hotplug/unplug with different backend
>> reboot
>>
>> This series is rebased to https://github.com/legoater/qemu/tree/vfio-next
>
>I queued part 1 in vfio-next with other changes. part 2 is in vfio-9.1
>for now and should reach vfio-next after reviews next week.
>
>Then, we have to work on your v5 [1] which should have all my attention
>again after the next vfio PR. You, Joao and Eric have followups series
>that need a resync on top of v5, possibly others [2] and [3], not sent
>AFAICT. Anyhow, we will need inputs from these people and IOMMU
>stakeholders/maintainers.

Thanks for sharing the plan.

+Joao, Eric, Michael, Jason, Nicolin, Clement for their awareness.

On my side, I have rebased the nesting series on top of v5 [1];
the newest patches at
https://github.com/yiliu1765/qemu/commits/zhenzhong/iommufd_nesting_rfcv2/
are under internal review, FYI.

Thanks
Zhenzhong

>
>Thanks,
>
>C.
>
>[1] [PATCH v5 00/19] Add a host IOMMU device abstraction to check with
>vIOMMU
> https://lore.kernel.org/qemu-devel/20240507092043.1172717-1-
>zhenzhong.d...@intel.com/
>
>[2] [PATCH ats_vtd v2 00/25] ATS support for VT-d
> https://lore.kernel.org/all/20240515071057.33990-1-clement.mathieu--
>d...@eviden.com/
>
>[3] Add Tegra241 (Grace) CMDQV Support
> https://lore.kernel.org/all/cover.1712978212.git.nicol...@nvidia.com/
> https://github.com/nicolinc/qemu/commits/wip/iommufd_vcmdq/
>
>
>
>>
>> Thanks
>> Zhenzhong
>>
>> Zhenzhong Duan (16):
>>vfio/display: Fix error path in call site of ramfb_setup()
>>vfio/display: Make vfio_display_*() return bool
>>vfio/helpers: Use g_autofree in hw/vfio/helpers.c
>>vfio/helpers: Make vfio_set_irq_signaling() return bool
>>vfio/helpers: Make vfio_device_get_name() return bool
>>vfio/platform: Make vfio_populate_device() and vfio_base_device_init()
>>  return bool
>>vfio/ccw: Make vfio_ccw_get_region() return a bool
>>vfio/pci: Make vfio_intx_enable_kvm() return a bool
>>vfio/pci: Make vfio_pci_relocate_msix() and vfio_msix_early_setup()
>>  return a bool
>>vfio/pci: Make vfio_populate_device() return a bool
>>vfio/pci: Make vfio_intx_enable() return bool
>>vfio/pci: Make vfio_populate_vga() return bool
>>vfio/pci: Make capability related functions return bool
>>vfio/pci: Use g_autofree for vfio_region_info pointer
>>vfio/pci-quirks: Make vfio_pci_igd_opregion_init() return bool
>>vfio/pci-quirks: Make vfio_add_*_cap() return bool
>>
>>   hw/vfio/pci.h |  12 +-
>>   include/hw/vfio/vfio-common.h |   6 +-
>>   hw/vfio/ap.c  |  10 +-
>>   hw/vfio/ccw.c |  25 ++--
>>   hw/vfio/display.c |  22 ++--
>>   hw/vfio/helpers.c |  33 ++---
>>   hw/vfio/igd.c |   5 +-
>>   hw/vfio/pci-quirks.c  |  50 
>>   hw/vfio/pci.c | 227 --
>>   hw/vfio/platform.c|  61 -
>>   10 files changed, 213 insertions(+), 238 deletions(-)
>>

RE: [PATCH v3 12/16] aspeed/soc: Add AST2700 support

2024-05-17 Thread Jamin Lin

Hi Cedric,

> On 4/19/24 09:58, Jamin Lin wrote:
> > Hi Cedric,
> >> On 4/16/24 11:18, Jamin Lin wrote:
> >>> Initial definitions for a simple machine using an AST2700 SOC
> >>> (Cortex-a35
> >> CPU).
> >>>
> >>> AST2700 SOC and its interrupt controller are too complex to handle
> >>> in the common Aspeed SoC framework. We introduce a new ast2700 class
> >>> with instance_init and realize handlers.
> >>>
> >>> AST2700 is a 64-bit quad-core CPU and supports 8 watchdogs.
> >>> Update maximum ASPEED_CPUS_NUM to 4 and ASPEED_WDTS_NUM to 8.
> >>> In addition, update AspeedSocState to support scuio, sli, sliio and intc.
> >>>
> >>> Add TYPE_ASPEED27X0_SOC machine type.
> >>>
> >>> The SDMC controller is unlocked at SPL stage.
> >>> At present, emulation only supports booting from the U-Boot stage.
> >>> Set SDMC controller unlocked by default.
> >>>
> >>> In INTC, each interrupt of INT 128 to INT 136 combines 32 interrupts.
> >>> It connects GICINT IRQ GPIO-OUTPUT pins to GIC device with irq 128 to 136.
> >>> And, if a device irq is 128 to 136, its irq GPIO-OUTPUT pin is
> >>> connected to GICINT or-gates instead of GIC device.
> >>>
> >>> Signed-off-by: Troy Lee 
> >>> Signed-off-by: Jamin Lin 
> >>
> >> Before I forget, please see a little comment below regarding user
> >> creatable devices.
> >>
> >> The model looks fine. The interrupt controller part is more complex
> >> than the previous SoCs so I will come back to it later when I have more
> time.
> > Thanks for your kind support.
> >>> ---
> >>>hw/arm/aspeed_ast27x0.c | 554
> >> 
> >>>hw/arm/meson.build  |   1 +
> >>>include/hw/arm/aspeed_soc.h |  26 +-
> >>>3 files changed, 579 insertions(+), 2 deletions(-)
> >>>create mode 100644 hw/arm/aspeed_ast27x0.c
> >>>
> >>> diff --git a/hw/arm/aspeed_ast27x0.c b/hw/arm/aspeed_ast27x0.c new
> >>> file mode 100644 index 00..754c963230
> >>> --- /dev/null
> >>> +++ b/hw/arm/aspeed_ast27x0.c
> >>> @@ -0,0 +1,554 @@
> >>> +/*
> >>> + * ASPEED SoC 27x0 family
> >>> + *
> >>> + * Copyright (C) 2024 ASPEED Technology Inc.
> >>> + *
> >>> + * This code is licensed under the GPL version 2 or later.  See
> >>> + * the COPYING file in the top-level directory.
> >>> + *
> >>> + * Implementation extracted from the AST2600 and adapted for AST27x0.
> >>> + */
> >>> +
> >>> +#include "qemu/osdep.h"
> >>> +#include "qapi/error.h"
> >>> +#include "hw/misc/unimp.h"
> >>> +#include "hw/arm/aspeed_soc.h"
> >>> +#include "qemu/module.h"
> >>> +#include "qemu/error-report.h"
> >>> +#include "hw/i2c/aspeed_i2c.h"
> >>> +#include "net/net.h"
> >>> +#include "sysemu/sysemu.h"
> >>> +#include "hw/intc/arm_gicv3.h"
> >>> +#include "qapi/qmp/qlist.h"
> >>> +
> >>> +static const hwaddr aspeed_soc_ast2700_memmap[] = {
> >>> +[ASPEED_DEV_SPI_BOOT]  =  0x4,
> >>> +[ASPEED_DEV_SRAM]  =  0x1000,
> >>> +[ASPEED_DEV_SDMC]  =  0x12C0,
> >>> +[ASPEED_DEV_SCU]   =  0x12C02000,
> >>> +[ASPEED_DEV_SCUIO] =  0x14C02000,
> >>> +[ASPEED_DEV_UART0] =  0X14C33000,
> >>> +[ASPEED_DEV_UART1] =  0X14C33100,
> >>> +[ASPEED_DEV_UART2] =  0X14C33200,
> >>> +[ASPEED_DEV_UART3] =  0X14C33300,
> >>> +[ASPEED_DEV_UART4] =  0X12C1A000,
> >>> +[ASPEED_DEV_UART5] =  0X14C33400,
> >>> +[ASPEED_DEV_UART6] =  0X14C33500,
> >>> +[ASPEED_DEV_UART7] =  0X14C33600,
> >>> +[ASPEED_DEV_UART8] =  0X14C33700,
> >>> +[ASPEED_DEV_UART9] =  0X14C33800,
> >>> +[ASPEED_DEV_UART10]=  0X14C33900,
> >>> +[ASPEED_DEV_UART11]=  0X14C33A00,
> >>> +[ASPEED_DEV_UART12]=  0X14C33B00,
> >>> +[ASPEED_DEV_WDT]   =  0x14C37000,
> >>> +[ASPEED_DEV_VUART] =  0X14C3,
> >>> +[ASPEED_DEV_FMC]   =  0x1400,
> >>> +[ASPEED_DEV_SPI0]  =  0x1401,
> >>> +[ASPEED_DEV_SPI1]  =  0x1402,
> >>> +[ASPEED_DEV_SPI2]  =  0x1403,
> >>> +[ASPEED_DEV_SDRAM] =  0x4,
> >>> +[ASPEED_DEV_MII1]  =  0x1404,
> >>> +[ASPEED_DEV_MII2]  =  0x14040008,
> >>> +[ASPEED_DEV_MII3]  =  0x14040010,
> >>> +[ASPEED_DEV_ETH1]  =  0x1405,
> >>> +[ASPEED_DEV_ETH2]  =  0x1406,
> >>> +[ASPEED_DEV_ETH3]  =  0x1407,
> >>> +[ASPEED_DEV_EMMC]  =  0x1209,
> >>> +[ASPEED_DEV_INTC]  =  0x1210,
> >>> +[ASPEED_DEV_SLI]   =  0x12C17000,
> >>> +[ASPEED_DEV_SLIIO] =  0x14C1E000,
> >>> +[ASPEED_GIC_DIST]  =  0x1220,
> >>> +[ASPEED_GIC_REDIST]=  0x1228,
> >>> +};
> >>> +
> >>> +#define AST2700_MAX_IRQ 288
> >>> +
> >>> +/* Shared Peripheral Interrupt values below are offset by -32 from
> >>> +datasheet */ static const int aspeed_soc_ast2700_irqmap[] = {
> >>> +[ASPEED_DEV_UART0] = 132,
> >>> +[ASPEED_DEV_UART1] = 132,
> >>> +[ASPEED_DEV_UART2] = 132,
> >>> +[ASPEED_DEV_UART3] = 132,

Re: [PATCH v2 3/3] crypto: Allow building with GnuTLS but without Libtasn1

2024-05-17 Thread Daniel P . Berrangé

On Thu, May 02, 2024 at 11:56:42AM +0200, Philippe Mathieu-Daudé wrote:
> We only use Libtasn1 in unit tests. As noted in commit d47b83b118
> ("tests: add migration tests of TLS with x509 credentials"), having
> GnuTLS without Libtasn1 is a valid configuration, so do not require
> Libtasn1, to avoid:
> 
>   Dependency gnutls found: YES 3.7.1 (cached)
>   Run-time dependency libtasn1 found: NO (tried pkgconfig)
> 
>   ../meson.build:1914:10: ERROR: Dependency "libtasn1" not found, tried pkgconfig

Did you actually try to build without libtasn1 present?

If I remove /usr/lib64/pkgconfig/libtasn1.pc, then the prior
check for 'gnutls' itself will fail, as libtasn1 is declared
to be a dep of gnutls in its pkg-config file, regardless of
what QEMU asks for:

$ pkg-config --cflags --libs gnutls
Package libtasn1 was not found in the pkg-config search path.
Perhaps you should add the directory containing `libtasn1.pc'
to the PKG_CONFIG_PATH environment variable
Package 'libtasn1', required by 'gnutls', not found

I'm still willing to merge this, because from QEMU's POV,
libtasn1 isn't required.

> 
> Fixes: ba7ed407e6 ("configure, meson: convert libtasn1 detection to meson")
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  meson.build | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/meson.build b/meson.build
> index 5db2dbc12e..837a2bdb56 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -1912,6 +1912,7 @@ endif
>  tasn1 = not_found
>  if gnutls.found()
>tasn1 = dependency('libtasn1',
> + required: false,
>   method: 'pkg-config')
>  endif
>  keyutils = not_found
> -- 
> 2.41.0
> 

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [PATCH v2 3/3] crypto: Allow building with GnuTLS but without Libtasn1

2024-05-17 Thread Daniel P . Berrangé

On Thu, May 02, 2024 at 11:56:42AM +0200, Philippe Mathieu-Daudé wrote:
> We only use Libtasn1 in unit tests. As noted in commit d47b83b118
> ("tests: add migration tests of TLS with x509 credentials"), having
> GnuTLS without Libtasn1 is a valid configuration, so do not require
> Libtasn1, to avoid:
> 
>   Dependency gnutls found: YES 3.7.1 (cached)
>   Run-time dependency libtasn1 found: NO (tried pkgconfig)
> 
>   ../meson.build:1914:10: ERROR: Dependency "libtasn1" not found, tried pkgconfig
> 
> Fixes: ba7ed407e6 ("configure, meson: convert libtasn1 detection to meson")
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  meson.build | 1 +
>  1 file changed, 1 insertion(+)

Reviewed-by: Daniel P. Berrangé 


With regards,
Daniel

Re: [PATCH v2 2/3] crypto: Restrict pkix_asn1_tab[] to crypto-tls-x509-helpers.c

2024-05-17 Thread Daniel P . Berrangé

On Thu, May 02, 2024 at 11:56:41AM +0200, Philippe Mathieu-Daudé wrote:
> pkix_asn1_tab[] is only accessed by crypto-tls-x509-helpers.c,
> rename pkix_asn1_tab.c as pkix_asn1_tab.c.inc and include it once.
> 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  tests/unit/crypto-tls-x509-helpers.h| 3 ---
>  tests/unit/crypto-tls-x509-helpers.c| 6 +-
>  tests/unit/{pkix_asn1_tab.c => pkix_asn1_tab.c.inc} | 5 +
>  tests/qtest/meson.build | 3 +--
>  tests/unit/meson.build  | 6 +++---
>  5 files changed, 10 insertions(+), 13 deletions(-)
>  rename tests/unit/{pkix_asn1_tab.c => pkix_asn1_tab.c.inc} (99%)

Reviewed-by: Daniel P. Berrangé 


With regards,
Daniel

Re: [PATCH v2 1/3] crypto: Remove 'crypto-tls-x509-helpers.h' from crypto-tls-psk-helpers.c

2024-05-17 Thread Daniel P . Berrangé

On Thu, May 02, 2024 at 11:56:40AM +0200, Philippe Mathieu-Daudé wrote:
> crypto-tls-psk-helpers.c doesn't access the declarations
> of "crypto-tls-x509-helpers.h", remove the include line
> to avoid the following error when building with GNUTLS but without Libtasn1:
> 
>   In file included from tests/unit/crypto-tls-psk-helpers.c:23:
>   tests/unit/crypto-tls-x509-helpers.h:26:10: fatal error:
>   libtasn1.h: No such file or directory
>      26 | #include <libtasn1.h>
>         |          ^~~~~~~~~~~~
>   compilation terminated.
> 
> Fixes: e1a6dc91dd ("crypto: Implement TLS Pre-Shared Keys (PSK).")
> Suggested-by: Daniel P. Berrangé 
> Signed-off-by: Philippe Mathieu-Daudé 
> ---
>  tests/unit/crypto-tls-psk-helpers.c | 1 -
>  1 file changed, 1 deletion(-)

Reviewed-by: Daniel P. Berrangé 


With regards,
Daniel

Re: [PATCH v7 36/61] target/ppc/mmu_common.c: Remove local name for a constant

2024-05-17 Thread BALATON Zoltan


On Fri, 17 May 2024, Nicholas Piggin wrote:

On Mon May 13, 2024 at 9:28 AM AEST, BALATON Zoltan wrote:

The mmask local variable is a less descriptive local name for a
constant. Drop it and use the constant directly in the two places it
is needed.


Wow, lots more. I might take up to patch 34ish for first PR.


Yes, I think that's a good idea, as I said in my previous reply; I already
have more patches than are in this version. There's a lot to clean up in
this series, and I did not even attempt to fix the 6xx cases. I won't do
that because I don't know that code, so I just tried to separate these out
and do some trivial cleanup and improvement.


Regards,
BALATON Zoltan


Thanks,
Nick



Signed-off-by: BALATON Zoltan 
---
 target/ppc/mmu_common.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/target/ppc/mmu_common.c b/target/ppc/mmu_common.c
index 9e0bfbda67..5d0090014a 100644
--- a/target/ppc/mmu_common.c
+++ b/target/ppc/mmu_common.c
@@ -98,7 +98,7 @@ static int ppc6xx_tlb_pte_check(mmu_ctx_t *ctx, target_ulong pte0,
 target_ulong pte1, int h,
 MMUAccessType access_type)
 {
-target_ulong ptem, mmask;
+target_ulong ptem;
 int ret, pteh, ptev, pp;

 ret = -1;
@@ -108,12 +108,11 @@ static int ppc6xx_tlb_pte_check(mmu_ctx_t *ctx, target_ulong pte0,
 if (ptev && h == pteh) {
 /* Check vsid & api */
 ptem = pte0 & PTE_PTEM_MASK;
-mmask = PTE_CHECK_MASK;
 pp = pte1 & 0x0003;
 if (ptem == ctx->ptem) {
 if (ctx->raddr != (hwaddr)-1ULL) {
 /* all matches should have equal RPN, WIMG & PP */
-if ((ctx->raddr & mmask) != (pte1 & mmask)) {
+if ((ctx->raddr & PTE_CHECK_MASK) != (pte1 & PTE_CHECK_MASK)) {
 qemu_log_mask(CPU_LOG_MMU, "Bad RPN/WIMG/PP\n");
 return -3;
 }

Re: [PATCH v7 35/61] target/ppc: Remove pp_check() and reuse ppc_hash32_pp_prot()

2024-05-17 Thread BALATON Zoltan


On Fri, 17 May 2024, Nicholas Piggin wrote:

On Mon May 13, 2024 at 9:28 AM AEST, BALATON Zoltan wrote:

The ppc_hash32_pp_prot() function in mmu-hash32.c is the same as
pp_check() in mmu_common.c, merge these to remove duplicated code.
Define the common function as static inline otherwise exporting the
function from mmu-hash32.c would stop the compiler inlining it which
results in slightly lower performance.
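
For reference, the merged protection table can be exercised on its own
with the sketch below. It mirrors the function added to mmu-hash32.h in
this patch, with two assumptions made so that it compiles standalone: the
PAGE_* values are defined locally (0x1, 0x2, 0x4 here) and
g_assert_not_reached() is replaced by abort(). The name pp_prot() is used
only for this illustration.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Assumed values for this sketch; QEMU provides its own definitions. */
#define PAGE_READ  0x1
#define PAGE_WRITE 0x2
#define PAGE_EXEC  0x4

/* key: storage key bit, pp: 2-bit page protection, nx: no-execute bit. */
static inline int pp_prot(bool key, int pp, bool nx)
{
    int prot;

    if (key) {
        switch (pp) {
        case 0x0:
            prot = 0;
            break;
        case 0x1:
        case 0x3:
            prot = PAGE_READ;
            break;
        case 0x2:
            prot = PAGE_READ | PAGE_WRITE;
            break;
        default:
            abort();
        }
    } else {
        switch (pp) {
        case 0x0:
        case 0x1:
        case 0x2:
            prot = PAGE_READ | PAGE_WRITE;
            break;
        case 0x3:
            prot = PAGE_READ;
            break;
        default:
            abort();
        }
    }
    /* Execute permission is granted unless nx is set. */
    return nx ? prot : prot | PAGE_EXEC;
}
```
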



It's already hard to review patches that move code around, it's better
to keep the changes before/after the move unless really necessary.


I could try to split this further but the series was already quite long 
and this is not too complex and also there's git diff --color-moved so I 
thought this could be in one patch.



For mmu_common.c hunks,

Reviewed-by: Nicholas Piggin 


Hmm, I can't apply Rb to hunks so I think I have to ignore it for now and 
wait until you send Rb for whole patch.


Just to make it simpler could you please send a pull request for the 
patches that are already reviewed at the beginning of the series to reduce 
the number of patches I need to resend? I've already added some more and 
still have some plans to continue so moving the patches that are OK out of 
the way could help. Then I could just resend the patches starting from the 
first that's not yet reviewed. Thank you for taking time to review these.


Regards,
BALATON Zoltan


Thanks,
Nick


Signed-off-by: BALATON Zoltan 
---
 target/ppc/mmu-hash32.c | 45 -
 target/ppc/mmu-hash32.h | 36 +
 target/ppc/mmu_common.c | 44 ++--
 3 files changed, 38 insertions(+), 87 deletions(-)

diff --git a/target/ppc/mmu-hash32.c b/target/ppc/mmu-hash32.c
index 1e8f1df0f0..d5f2057eb1 100644
--- a/target/ppc/mmu-hash32.c
+++ b/target/ppc/mmu-hash32.c
@@ -37,51 +37,6 @@
 #  define LOG_BATS(...) do { } while (0)
 #endif

-static int ppc_hash32_pp_prot(int key, int pp, int nx)
-{
-int prot;
-
-if (key == 0) {
-switch (pp) {
-case 0x0:
-case 0x1:
-case 0x2:
-prot = PAGE_READ | PAGE_WRITE;
-break;
-
-case 0x3:
-prot = PAGE_READ;
-break;
-
-default:
-abort();
-}
-} else {
-switch (pp) {
-case 0x0:
-prot = 0;
-break;
-
-case 0x1:
-case 0x3:
-prot = PAGE_READ;
-break;
-
-case 0x2:
-prot = PAGE_READ | PAGE_WRITE;
-break;
-
-default:
-abort();
-}
-}
-if (nx == 0) {
-prot |= PAGE_EXEC;
-}
-
-return prot;
-}
-
 static int ppc_hash32_pte_prot(int mmu_idx,
target_ulong sr, ppc_hash_pte32_t pte)
 {
diff --git a/target/ppc/mmu-hash32.h b/target/ppc/mmu-hash32.h
index 7119a63d97..bf99161858 100644
--- a/target/ppc/mmu-hash32.h
+++ b/target/ppc/mmu-hash32.h
@@ -102,6 +102,42 @@ static inline void ppc_hash32_store_hpte1(PowerPCCPU *cpu,
 stl_phys(CPU(cpu)->as, base + pte_offset + HASH_PTE_SIZE_32 / 2, pte1);
 }

+static inline int ppc_hash32_pp_prot(bool key, int pp, bool nx)
+{
+int prot;
+
+if (key) {
+switch (pp) {
+case 0x0:
+prot = 0;
+break;
+case 0x1:
+case 0x3:
+prot = PAGE_READ;
+break;
+case 0x2:
+prot = PAGE_READ | PAGE_WRITE;
+break;
+default:
+g_assert_not_reached();
+}
+} else {
+switch (pp) {
+case 0x0:
+case 0x1:
+case 0x2:
+prot = PAGE_READ | PAGE_WRITE;
+break;
+case 0x3:
+prot = PAGE_READ;
+break;
+default:
+g_assert_not_reached();
+}
+}
+return nx ? prot : prot | PAGE_EXEC;
+}
+
 typedef struct {
 uint32_t pte0, pte1;
 } ppc_hash_pte32_t;
diff --git a/target/ppc/mmu_common.c b/target/ppc/mmu_common.c
index e1462a25dd..9e0bfbda67 100644
--- a/target/ppc/mmu_common.c
+++ b/target/ppc/mmu_common.c
@@ -77,44 +77,6 @@ void ppc_store_sdr1(CPUPPCState *env, target_ulong value)
 /*/
 /* PowerPC MMU emulation */

-static int pp_check(int key, int pp, int nx)
-{
-int access;
-
-/* Compute access rights */
-access = 0;
-if (key == 0) {
-switch (pp) {
-case 0x0:
-case 0x1:
-case 0x2:
-access |= PAGE_WRITE;
-/* fall through */
-case 0x3:
-access |= PAGE_READ;
-break;
-}
-} else {
-switch (pp) {
-case 0x0:
-access = 0;
-break;
-case 0x1:
-case 0x3:
-access = PAGE_READ;
-break;
-case 0x2:
-access = PAGE_READ | PAGE_WRITE;
-break;
-}
-

Re: [PATCH 0/2] Zynq 7000 SoC improvements

2024-05-17 Thread Sebastian Huber


Hello,

is the mailing list the right place for contributions like this?

On 07.05.24 15:03, Sebastian Huber wrote:

Add support for the cache controller and up to two Cortex-A9 MPCore.

Sebastian Huber (2):
   hw/arm/xilinx_zynq: Add cache controller
   hw/arm/xilinx_zynq: Support up to two CPU cores

  hw/arm/xilinx_zynq.c | 43 ---
  1 file changed, 28 insertions(+), 15 deletions(-)



--
embedded brains GmbH & Co. KG
Mr. Sebastian HUBER
Dornierstr. 4
82178 Puchheim
Germany
email: sebastian.hu...@embedded-brains.de
phone: +49-89-18 94 741 - 16
fax:   +49-89-18 94 741 - 08

Registration court: Amtsgericht München
Registration number: HRB 157899
Managing directors authorized to represent: Peter Rasmussen, Thomas Dörfler
Our privacy policy can be found here:
https://embedded-brains.de/datenschutzerklaerung/

Re: [PATCH] hw/intc/s390_flic: Fix crash that occurs when saving the machine state

2024-05-17 Thread Marc Hartmayer

On Fri, May 17, 2024 at 08:15 AM +0200, Thomas Huth  wrote:
> adapter_info_so_needed() treats its "opaque" parameter as a S390FLICState,
> but the function belongs to a VMStateDescription that is attached to a
> TYPE_VIRTIO_CCW_BUS device. This is currently causing a crash when the
> user tries to save or migrate the VM state. Fix it by using s390_get_flic()
> to get the correct device here instead.
>
> Reported-by: Marc Hartmayer 
> Fixes: 9d1b0f5bf5 ("s390_flic: add migration-enabled property")
> Signed-off-by: Thomas Huth 
> ---
>  hw/intc/s390_flic.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/hw/intc/s390_flic.c b/hw/intc/s390_flic.c
> index 7f93080087..6771645699 100644
> --- a/hw/intc/s390_flic.c
> +++ b/hw/intc/s390_flic.c
> @@ -459,7 +459,7 @@ type_init(qemu_s390_flic_register_types)
>  
>  static bool adapter_info_so_needed(void *opaque)
>  {
> -S390FLICState *fs = S390_FLIC_COMMON(opaque);
> +S390FLICState *fs = s390_get_flic();
>  
>  return fs->migration_enabled;
>  }
> -- 
> 2.45.0
>
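
The bug class fixed here can be illustrated with a small standalone
sketch. All names below (FlicState, CcwBusDevice, get_flic(),
adapter_info_needed()) are hypothetical stand-ins rather than the QEMU
types; the point is only that a callback must not reinterpret its opaque
pointer as a type other than the one it is actually attached to.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the interrupt controller state that owns the flag. */
typedef struct FlicState {
    bool migration_enabled;
} FlicState;

/* Stand-in for the unrelated device the callback is attached to. */
typedef struct CcwBusDevice {
    int devno;
} CcwBusDevice;

static FlicState the_flic = { .migration_enabled = true };

/* Stand-in for a global accessor in the spirit of s390_get_flic(). */
static FlicState *get_flic(void)
{
    return &the_flic;
}

/*
 * "needed" callback: opaque points at a CcwBusDevice here, so casting it
 * to FlicState would read unrelated memory. Look the state up instead.
 */
static bool adapter_info_needed(void *opaque)
{
    (void)opaque;
    return get_flic()->migration_enabled;
}
```
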

Tested-by: Marc Hartmayer 

Thanks.

-- 
Kind regards
   Marc Hartmayer

IBM Deutschland Research & Development GmbH
Chairman of the Supervisory Board: Wolfgang Wendt
Management: David Faller
Registered office: Böblingen
Registration court: Amtsgericht Stuttgart, HRB 243294

Re: [PATCH] gitlab-ci: Replace Docker with Kaniko

2024-05-17 Thread Daniel P . Berrangé

On Thu, May 16, 2024 at 07:24:04PM +0100, Daniel P. Berrangé wrote:
> On Thu, May 16, 2024 at 05:52:43PM +0100, Camilla Conte wrote:
> > Enables caching from the qemu-project repository.
> > 
> > Uses a dedicated "$NAME-cache" tag for caching, to address limitations.
> > See issue "when using --cache=true, kaniko fail to push cache layer [...]":
> > https://github.com/GoogleContainerTools/kaniko/issues/1459
> 
> After investigating, this is a result of a different design approach
> for caching in kaniko.
> 
> In docker, it can leverage any existing image as a cache source,
> reusing individual layers that were present. IOW, there's no
> difference between a cache and a final image, they're one and the
> same thing
> 
> In kaniko, the cache is a distinct object type. IIUC, it is not
> populated with the individual layers, instead it has a custom
> format for storing the cached content. Therefore the concept of
> storing the cache at the same location as the final image, is
> completely inappropriate - you can't store two completely different
> kinds of content at the same place.
> 
> > That is also why you can't just "git pull" to fetch the cache
> image(s) beforehand, and also why it doesn't look like you can
> use multiple cache sources with kaniko.
> 
> > None of this is inherently a bad thing, except when it comes
> > to data storage. By using Kaniko we would be, at minimum, doubling
> > the amount of data storage we consume in the gitlab registry.

Double is actually just the initial case. The cache is storing layers
using docker tags, whose names appear to be based on a hash of the "RUN"
command.

IOW, the first time we build a container we have double the usage.
When a dockerfile is updated changing a 'RUN' command, we now have
triple the storage usage for cache. Update the RUN command again,
and we now have quadruple the storage. etc.
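
As a rough back-of-the-envelope sketch of that growth, assuming every
cached variant of the layer is about the same size as the layer in the
final image and nothing is ever purged (both are simplifying assumptions,
and the function name is made up for this illustration):

```c
/*
 * Storage used relative to the final image alone: one copy in the image,
 * one cache entry from the first build, plus one more cache entry for
 * each later change to the RUN command.
 */
static unsigned storage_multiplier(unsigned run_changes)
{
    return 2u + run_changes;
}
```
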

Kaniko does not appear to purge cache entries itself, and will rely
on something else to do the cache purging.

GitLab has support for purging old docker tags, but I'm not an
admin on the QEMU project namespace, so can't tell if it can be
enabled or not ? Many older projects have this permanently disabled
due to historical compat issues in gitlab after they introduced the
feature.

With regards,
Daniel

Re: [PATCH v3 12/16] aspeed/soc: Add AST2700 support

2024-05-17 Thread Cédric Le Goater


On 4/19/24 09:58, Jamin Lin wrote:

Hi Cedric,

On 4/16/24 11:18, Jamin Lin wrote:

Initial definitions for a simple machine using an AST2700 SOC
(Cortex-a35 CPU).


AST2700 SOC and its interrupt controller are too complex to handle in
the common Aspeed SoC framework. We introduce a new ast2700 class with
instance_init and realize handlers.

AST2700 is a 64-bit quad-core CPU and supports 8 watchdogs.
Update maximum ASPEED_CPUS_NUM to 4 and ASPEED_WDTS_NUM to 8.
In addition, update AspeedSocState to support scuio, sli, sliio and intc.

Add TYPE_ASPEED27X0_SOC machine type.

The SDMC controller is unlocked at the SPL stage.
At present, only booting from the u-boot stage is emulated, so
set the SDMC controller unlocked by default.

In INTC, each interrupt from INT 128 to INT 136 combines 32 interrupts.
The GICINT IRQ GPIO-OUTPUT pins are connected to the GIC device with
irq 128 to 136. If a device irq is in the range 128 to 136, its irq
GPIO-OUTPUT pin is connected to the GICINT or-gates instead of the GIC
device.

Signed-off-by: Troy Lee 
Signed-off-by: Jamin Lin 


Before I forget, please see a little comment below regarding user creatable
devices.

The model looks fine. The interrupt controller part is more complex than the
previous SoCs so I will come back to it later when I have more time.

Thanks for your kind support.

---
   hw/arm/aspeed_ast27x0.c     | 554
   hw/arm/meson.build          |   1 +
   include/hw/arm/aspeed_soc.h |  26 +-
   3 files changed, 579 insertions(+), 2 deletions(-)
   create mode 100644 hw/arm/aspeed_ast27x0.c

diff --git a/hw/arm/aspeed_ast27x0.c b/hw/arm/aspeed_ast27x0.c
new file mode 100644
index 00..754c963230
--- /dev/null
+++ b/hw/arm/aspeed_ast27x0.c
@@ -0,0 +1,554 @@
+/*
+ * ASPEED SoC 27x0 family
+ *
+ * Copyright (C) 2024 ASPEED Technology Inc.
+ *
+ * This code is licensed under the GPL version 2 or later.  See
+ * the COPYING file in the top-level directory.
+ *
+ * Implementation extracted from the AST2600 and adapted for AST27x0.
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "hw/misc/unimp.h"
+#include "hw/arm/aspeed_soc.h"
+#include "qemu/module.h"
+#include "qemu/error-report.h"
+#include "hw/i2c/aspeed_i2c.h"
+#include "net/net.h"
+#include "sysemu/sysemu.h"
+#include "hw/intc/arm_gicv3.h"
+#include "qapi/qmp/qlist.h"
+
+static const hwaddr aspeed_soc_ast2700_memmap[] = {
+[ASPEED_DEV_SPI_BOOT]  =  0x4,
+[ASPEED_DEV_SRAM]  =  0x1000,
+[ASPEED_DEV_SDMC]  =  0x12C0,
+[ASPEED_DEV_SCU]   =  0x12C02000,
+[ASPEED_DEV_SCUIO] =  0x14C02000,
+[ASPEED_DEV_UART0] =  0X14C33000,
+[ASPEED_DEV_UART1] =  0X14C33100,
+[ASPEED_DEV_UART2] =  0X14C33200,
+[ASPEED_DEV_UART3] =  0X14C33300,
+[ASPEED_DEV_UART4] =  0X12C1A000,
+[ASPEED_DEV_UART5] =  0X14C33400,
+[ASPEED_DEV_UART6] =  0X14C33500,
+[ASPEED_DEV_UART7] =  0X14C33600,
+[ASPEED_DEV_UART8] =  0X14C33700,
+[ASPEED_DEV_UART9] =  0X14C33800,
+[ASPEED_DEV_UART10]=  0X14C33900,
+[ASPEED_DEV_UART11]=  0X14C33A00,
+[ASPEED_DEV_UART12]=  0X14C33B00,
+[ASPEED_DEV_WDT]   =  0x14C37000,
+[ASPEED_DEV_VUART] =  0X14C3,
+[ASPEED_DEV_FMC]   =  0x1400,
+[ASPEED_DEV_SPI0]  =  0x1401,
+[ASPEED_DEV_SPI1]  =  0x1402,
+[ASPEED_DEV_SPI2]  =  0x1403,
+[ASPEED_DEV_SDRAM] =  0x4,
+[ASPEED_DEV_MII1]  =  0x1404,
+[ASPEED_DEV_MII2]  =  0x14040008,
+[ASPEED_DEV_MII3]  =  0x14040010,
+[ASPEED_DEV_ETH1]  =  0x1405,
+[ASPEED_DEV_ETH2]  =  0x1406,
+[ASPEED_DEV_ETH3]  =  0x1407,
+[ASPEED_DEV_EMMC]  =  0x1209,
+[ASPEED_DEV_INTC]  =  0x1210,
+[ASPEED_DEV_SLI]   =  0x12C17000,
+[ASPEED_DEV_SLIIO] =  0x14C1E000,
+[ASPEED_GIC_DIST]  =  0x1220,
+[ASPEED_GIC_REDIST]=  0x1228,
+};
+
+#define AST2700_MAX_IRQ 288
+
+/* Shared Peripheral Interrupt values below are offset by -32 from datasheet */
+static const int aspeed_soc_ast2700_irqmap[] = {
+[ASPEED_DEV_UART0] = 132,
+[ASPEED_DEV_UART1] = 132,
+[ASPEED_DEV_UART2] = 132,
+[ASPEED_DEV_UART3] = 132,
+[ASPEED_DEV_UART4] = 8,
+[ASPEED_DEV_UART5] = 132,
+[ASPEED_DEV_UART6] = 132,
+[ASPEED_DEV_UART7] = 132,
+[ASPEED_DEV_UART8] = 132,
+[ASPEED_DEV_UART9] = 132,
+[ASPEED_DEV_UART10]= 132,
+[ASPEED_DEV_UART11]= 132,
+[ASPEED_DEV_UART12]= 132,
+[ASPEED_DEV_FMC]   = 131,
+[ASPEED_DEV_SDMC]  = 0,
+[ASPEED_DEV_SCU]   = 12,
+[ASPEED_DEV_ADC]   = 130,
+[ASPEED_DEV_XDMA]  = 5,
+[ASPEED_DEV_EMMC]  = 15,
+[ASPEED_DEV_GPIO]  = 11,
+[ASPEED_DEV_GPIO_1_8V] = 130,
+[ASPEED_DEV_RTC]   = 13,
+[ASPEED_DEV_TIMER1]= 16,
+[ASPEED_DEV_TIMER2]= 17,
+

Re: [PATCH v3 08/16] aspeed/smc: support 64 bits dma dram address

2024-05-17 Thread Cédric Le Goater


Hello Jamin

On 5/15/24 11:01, Jamin Lin wrote:

Hi Cedric,

Sorry for the late reply.

Hello Jamin,

To handle the DMA DRAM Side Address High register, we should reintroduce a
"dram-base" property which I removed a while ago. Something like:



diff --git a/include/hw/ssi/aspeed_smc.h b/include/hw/ssi/aspeed_smc.h
index 7f32e43ff6f3..6d8ef6bc968f 100644
--- a/include/hw/ssi/aspeed_smc.h
+++ b/include/hw/ssi/aspeed_smc.h
@@ -76,6 +76,7 @@ struct AspeedSMCState {
     AddressSpace flash_as;
     MemoryRegion *dram_mr;
     AddressSpace dram_as;
+    uint64_t dram_base;
 
     AddressSpace wdt2_as;
     MemoryRegion *wdt2_mr;
diff --git a/hw/arm/aspeed_ast27x0.c b/hw/arm/aspeed_ast27x0.c
index 38858e4fdec1..3417949ad8a3 100644
--- a/hw/arm/aspeed_ast27x0.c
+++ b/hw/arm/aspeed_ast27x0.c
@@ -500,6 +500,8 @@ static void aspeed_soc_ast2700_realize(DeviceState *dev, Error **errp)
     }
 
     /* FMC, The number of CS is set at the board level */
+    object_property_set_int(OBJECT(&s->fmc), "dram-base",
+                            sc->memmap[ASPEED_DEV_SDRAM], &error_abort);
     object_property_set_link(OBJECT(&s->fmc), "dram", OBJECT(s->dram_mr),
                              &error_abort);
     if (!sysbus_realize(SYS_BUS_DEVICE(&s->fmc), errp)) {
diff --git a/hw/ssi/aspeed_smc.c b/hw/ssi/aspeed_smc.c
index 3fa783578e9e..29ebfc0fd8c8 100644
--- a/hw/ssi/aspeed_smc.c
+++ b/hw/ssi/aspeed_smc.c
@@ -1372,6 +1372,7 @@ static const VMStateDescription vmstate_aspeed_smc = {
 
 static Property aspeed_smc_properties[] = {
     DEFINE_PROP_BOOL("inject-failure", AspeedSMCState, inject_failure, false),
+    DEFINE_PROP_UINT64("dram-base", AspeedSMCState, dram_base, 0),
     DEFINE_PROP_LINK("dram", AspeedSMCState, dram_mr,
                      TYPE_MEMORY_REGION, MemoryRegion *),
     DEFINE_PROP_LINK("wdt2", AspeedSMCState, wdt2_mr,



I appreciate your kind support and thanks for your suggestion.
Will add it.


See my aspeed-9.1 branch, I did some changes, mostly in the last patch.

* aspeed_smc_dma_len()

  - can use QEMU_ALIGN_UP(). simpler.

* aspeed_smc_dma_rw():

  - dram_addr ->  dma_dram_offset
  - There is no need to protect updates of the R_DMA_DRAM_ADDR_HIGH
register with aspeed_smc_has_dma_dram_addr_high() since it is
already protected with MMIO accesses. Skip the check and update
always.

* aspeed_smc_dma_dram_addr()

  - same as above.

You can merge the changes in the respective patches if you agree.
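
For readers following along, a rough sketch of how the two DMA DRAM address
registers could combine into one 64-bit address. The 0xf mask comes from
DMA_DRAM_ADDR_HIGH() in the patch; the function names, the dram_mask value
and the idea of subtracting "dram-base" to get an offset are assumptions
made for illustration, not the actual model code.

```python
DMA_DRAM_ADDR_HIGH_MASK = 0xF  # mirrors DMA_DRAM_ADDR_HIGH(val) in the patch


def dma_dram_addr(addr_low, addr_high, dram_mask):
    """Combine the low and high register values into one 64-bit address.

    dram_mask stands in for the per-SoC dma_dram_mask (assumed value).
    """
    low = addr_low & dram_mask
    high = addr_high & DMA_DRAM_ADDR_HIGH_MASK
    return (high << 32) | low


def dma_dram_offset(addr, dram_base):
    """Hypothetical helper: offset into the DRAM region, assuming the
    guest programs absolute addresses and the model knows dram_base."""
    if addr < dram_base:
        raise ValueError("address below dram-base")
    return addr - dram_base


addr = dma_dram_addr(0x80000000, 0x4, 0xFFFFFFFF)
print(hex(addr), hex(dma_dram_offset(addr, 0x400000000)))
```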

Still on the TODO list :

  - GIC review
  - aspeed/soc: fix incorrect dram size for AST2700
  




Thanks,

C.







With that, see below for more comments,

On 4/16/24 11:18, Jamin Lin wrote:

AST2700 supports a maximum DRAM size of 8 GiB and has a "DMA DRAM Side
Address High Part (0x7C)" register to support 64-bit DMA DRAM addresses.
Add helper routines to compute the DMA DRAM address, add new features,
and update the trace event to support 64-bit DRAM addresses.

Signed-off-by: Troy Lee 
Signed-off-by: Jamin Lin 
---
   hw/ssi/aspeed_smc.c | 66 +++--
   hw/ssi/trace-events |  2 +-
   2 files changed, 59 insertions(+), 9 deletions(-)

diff --git a/hw/ssi/aspeed_smc.c b/hw/ssi/aspeed_smc.c
index 71abc7a2d8..a67cac3d0f 100644
--- a/hw/ssi/aspeed_smc.c
+++ b/hw/ssi/aspeed_smc.c
@@ -132,6 +132,9 @@
 #define   FMC_WDT2_CTRL_BOOT_SOURCE  BIT(4) /* O: primary 1: alternate */
 #define   FMC_WDT2_CTRL_EN   BIT(0)
 
+/* DMA DRAM Side Address High Part (AST2700) */
+#define R_DMA_DRAM_ADDR_HIGH   (0x7c / 4)
+
 /* DMA Control/Status Register */
 #define R_DMA_CTRL(0x80 / 4)
 #define   DMA_CTRL_REQUEST  (1 << 31)
@@ -187,6 +190,7 @@
  *   0x1FF: 32M bytes
  */
 #define DMA_DRAM_ADDR(asc, val)   ((val) & (asc)->dma_dram_mask)
+#define DMA_DRAM_ADDR_HIGH(val)   ((val) & 0xf)
 #define DMA_FLASH_ADDR(asc, val)  ((val) & (asc)->dma_flash_mask)
 #define DMA_LENGTH(val) ((val) & 0x01FF)
 
@@ -207,6 +211,7 @@ static const AspeedSegments aspeed_2500_spi2_segments[];
 #define ASPEED_SMC_FEATURE_DMA   0x1
 #define ASPEED_SMC_FEATURE_DMA_GRANT 0x2
 #define ASPEED_SMC_FEATURE_WDT_CONTROL 0x4
+#define ASPEED_SMC_FEATURE_DMA_DRAM_ADDR_HIGH 0x08
 
 static inline bool aspeed_smc_has_dma(const AspeedSMCClass *asc)
 {
@@ -218,6 +223,11 @@ static inline bool aspeed_smc_has_wdt_control(const AspeedSMCClass *asc)
     return !!(asc->features & ASPEED_SMC_FEATURE_WDT_CONTROL);
 }
 
+static inline bool aspeed_smc_has_dma_dram_addr_high(const AspeedSMCClass *asc)
+{
+    return !!(asc->features & ASPEED_SMC_FEATURE_DMA_DRAM_ADDR_HIGH);
+}
+
 #define aspeed_smc_error(fmt, ...) \
     qemu_log_mask(LOG_GUEST_ERROR, "%s: " fmt "\n", __func__, ## __VA_ARGS__)
 
@@ -747,6 +757,9 @@ static uint64_t aspeed_smc_read(void *opaque, hwaddr addr, unsigned int size)
         (aspeed_smc_has_dma(asc) && addr == R_DMA_CTRL) ||
         (aspeed_smc_has_dma(asc) && addr == R_DMA_FLASH_ADDR)

[PATCH] hw/core/machine: move compatibility flags for VirtIO-net USO to machine 8.1

2024-05-17 Thread Fiona Ebner

Migration from an 8.2 or 9.0 binary to an 8.1 binary with machine
version 8.1 can fail with:

> kvm: Features 0x1c0010130afffa7 unsupported. Allowed features: 0x10179bfffe7
> kvm: Failed to load virtio-net:virtio
> kvm: error while loading state for instance 0x0 of device 
> ':00:12.0/virtio-net'
> kvm: load of migration failed: Operation not permitted

The series

53da8b5a99 virtio-net: Add support for USO features
9da1684954 virtio-net: Add USO flags to vhost support.
f03e0cf63b tap: Add check for USO features
2ab0ec3121 tap: Add USO support to tap device.

only landed in QEMU 8.2, so the compatibility flags should be part of
machine version 8.1.

Moving the flags unfortunately breaks forward migration with machine
version 8.1 from a binary without this patch to a binary with this
patch.
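
To illustrate why the placement matters, here is a simplified sketch of how
compat lists stack up. It assumes a versioned machine type applies the
compat list of its own version plus those of every newer version; only the
USO entries are modelled, and compat_props() and the version list are
illustrative names, not QEMU code.

```python
VERSIONS = ["9.0", "8.2", "8.1", "8.0"]  # newest first

USO_OFF = [
    ("virtio-net", "host_uso", "off"),
    ("virtio-net", "guest_uso4", "off"),
    ("virtio-net", "guest_uso6", "off"),
]

BEFORE_PATCH = {"8.0": USO_OFF}  # flags listed in hw_compat_8_0
AFTER_PATCH = {"8.1": USO_OFF}   # flags moved to hw_compat_8_1


def compat_props(machine_version, hw_compat):
    """Collect the compat properties seen by one machine version."""
    last = VERSIONS.index(machine_version)  # ValueError if unknown
    props = {}
    # Newer lists are applied first, so older lists can override them.
    for version in VERSIONS[: last + 1]:
        for device, name, value in hw_compat.get(version, []):
            props[(device, name)] = value
    return props


for label, table in (("before", BEFORE_PATCH), ("after", AFTER_PATCH)):
    uso = compat_props("8.1", table).get(("virtio-net", "host_uso"))
    print(label, "machine 8.1 host_uso:", uso)
```

Under that assumption, machine version 8.1 only gets USO disabled once the
entries live in hw_compat_8_1, while machine version 8.0 is unaffected by
the move.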

Fixes: 53da8b5a99 ("virtio-net: Add support for USO features")
Signed-off-by: Fiona Ebner 
---
 hw/core/machine.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/core/machine.c b/hw/core/machine.c
index c7ceb11501..95051b80db 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -50,15 +50,15 @@ GlobalProperty hw_compat_8_1[] = {
 { "ramfb", "x-migrate", "off" },
 { "vfio-pci-nohotplug", "x-ramfb-migrate", "off" },
 { "igb", "x-pcie-flr-init", "off" },
+{ TYPE_VIRTIO_NET, "host_uso", "off"},
+{ TYPE_VIRTIO_NET, "guest_uso4", "off"},
+{ TYPE_VIRTIO_NET, "guest_uso6", "off"},
 };
 const size_t hw_compat_8_1_len = G_N_ELEMENTS(hw_compat_8_1);
 
 GlobalProperty hw_compat_8_0[] = {
 { "migration", "multifd-flush-after-each-section", "on"},
 { TYPE_PCI_DEVICE, "x-pcie-ari-nextfn-1", "on" },
-{ TYPE_VIRTIO_NET, "host_uso", "off"},
-{ TYPE_VIRTIO_NET, "guest_uso4", "off"},
-{ TYPE_VIRTIO_NET, "guest_uso6", "off"},
 };
 const size_t hw_compat_8_0_len = G_N_ELEMENTS(hw_compat_8_0);
 
-- 
2.39.2

Re: [PATCH] gitlab-ci: Replace Docker with Kaniko

2024-05-17 Thread Daniel P . Berrangé

On Fri, May 17, 2024 at 08:24:44AM +0200, Thomas Huth wrote:
> On 16/05/2024 20.24, Daniel P. Berrangé wrote:
> > On Thu, May 16, 2024 at 05:52:43PM +0100, Camilla Conte wrote:
> > > Enables caching from the qemu-project repository.
> > > 
> > > Uses a dedicated "$NAME-cache" tag for caching, to address limitations.
> > > See issue "when using --cache=true, kaniko fail to push cache layer 
> > > [...]":
> > > https://github.com/GoogleContainerTools/kaniko/issues/1459
> ...
> > TL;DR: functionally this patch is capable of working. The key downside
> > is that it doubles our storage usage. I'm not convinced Kaniko offers
> > a compelling enough benefit to justify this penalty.
> 
> Will this patch fix the issues that we are currently seeing with the k8s
> runners not working in the upstream CI? If so, I think that would be enough
> benefit, wouldn't it?

Paolo said on IRC that he has reverted the changes to the runner which
caused us problems. Docker in Docker is still a documented & supported
option for GitLab AFAICT, so I would hope we can keep using it as
before.

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [PATCH 1/9] monitor: Honor QMP request for fd removal immediately

2024-05-17 Thread Daniel P . Berrangé

On Thu, May 16, 2024 at 07:00:11PM -0300, Fabiano Rosas wrote:
> Daniel P. Berrangé  writes:
> 
> > On Fri, Apr 26, 2024 at 11:20:34AM -0300, Fabiano Rosas wrote:
> >> We're enabling using the fdset interface to pass file descriptors for
> >> use in the migration code. Since migrations can happen more than once
> >> during the VMs lifetime, we need a way to remove an fd from the fdset
> >> at the end of migration.
> >> 
> >> The current code only removes an fd from the fdset if the VM is
> >> running. This causes a QMP call to "remove-fd" to not actually remove
> >> the fd if the VM happens to be stopped.
> >> 
> >> While the fd would eventually be removed when monitor_fdset_cleanup()
> >> is called again, the user request should be honored and the fd
> >> actually removed. Calling remove-fd + query-fdset shows a recently
> >> removed fd still present.
> >> 
> >> The runstate_is_running() check was introduced by commit ebe52b592d
> >> ("monitor: Prevent removing fd from set during init"), which by the
> >> shortlog indicates that they were trying to avoid removing a
> >> yet-unduplicated fd too early.
> >
> > IMHO that should be reverted. The justification says
> >
> >   "If an fd is added to an fd set via the command line, and it is not
> >referenced by another command line option (ie. -drive), then clean
> >it up after QEMU initialization is complete"
> >
> > which I think is pretty weak. Why should QEMU forcibly stop an app
> > from passing in an FD to be used by a QMP command issued just after
> > the VM starts running ?  While it could just use QMP to pass in the
> > FD set, the mgmt app might have its own reason for wanting QEMU to
> > own the passed FD from the very start of the process execve().
> 
> I don't think that's what that patch does. That description is
> misleading. I read it as:
> 
>"If an fd is added to an fd set via the command line, and it is not
> referenced by another command line option (ie. -drive), then clean
> it up ONLY after QEMU initialization is complete"
>   ^
> 
> By the subject ("monitor: Prevent removing fd from set during init") and
> the fact that this function is only called when the monitor connection
> closes, I believe the idea was to *save* the fds until after the VM
> starts running, i.e. some fd was being lost because
> monitor_fdset_cleanup() was being called before the dup().

I know that, but I'm saying QEMU should not be doing *any* generic cleanup
of passed in FDs at any point. 

A passed in FD should be taken by whatever part of the QEMU configuration
is told to use it when needed, and this takes responsibility for closing
it. If nothing is told to use the fdset /yet/, then it should stay in the
fdset untouched for later use.

If an application accidentally passes in an FD that it doesn't reference
in any configuration, that's simply an application bug to fix. QEMU does
not need to second-guess the app's intent and decide to arbitrarily close
it.

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

Re: [PATCH v7 52/61] target/ppc/mmu-hash32.c: Inline and remove ppc_hash32_pte_prot()

2024-05-17 Thread Nicholas Piggin

On Mon May 13, 2024 at 9:28 AM AEST, BALATON Zoltan wrote:
> This is used only once and can be inlined.

This reminds me, ppc_hash32_pp_prot() calculates prot from
pp and nx (which is not from pp but from segment) and from
key of course. It could be renamed to say ppc_hash32_prot().
Maybe do that when you split out the rearranging of that
function.
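
As a quick illustration of those three inputs, a sketch of the mapping in
table form. It follows the switch statements of ppc_hash32_pp_prot() as
posted in this series; the PAGE_* values and the Python function name are
placeholders for illustration only.

```python
PAGE_READ, PAGE_WRITE, PAGE_EXEC = 0x1, 0x2, 0x4  # illustrative values

RW = PAGE_READ | PAGE_WRITE

# pp -> protection, one table per key value.
PROT_KEY0 = {0: RW, 1: RW, 2: RW, 3: PAGE_READ}
PROT_KEY1 = {0: 0, 1: PAGE_READ, 2: RW, 3: PAGE_READ}


def hash32_prot(key, pp, nx):
    """Protection bits from the key, the PTE pp field and the segment nx bit."""
    if pp not in PROT_KEY0:
        raise ValueError("pp must be in the range 0..3")
    prot = (PROT_KEY1 if key else PROT_KEY0)[pp]
    # nx comes from the segment register, not from pp.
    return prot if nx else prot | PAGE_EXEC


print(hash32_prot(key=True, pp=1, nx=False))
```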

Thanks,
Nick

>
> Signed-off-by: BALATON Zoltan 
> ---
>  target/ppc/mmu-hash32.c | 19 ---
>  1 file changed, 4 insertions(+), 15 deletions(-)
>
> diff --git a/target/ppc/mmu-hash32.c b/target/ppc/mmu-hash32.c
> index 8e5e83f46a..9de42713b3 100644
> --- a/target/ppc/mmu-hash32.c
> +++ b/target/ppc/mmu-hash32.c
> @@ -37,17 +37,6 @@
>  #  define LOG_BATS(...) do { } while (0)
>  #endif
>  
> -static int ppc_hash32_pte_prot(int mmu_idx,
> -   target_ulong sr, ppc_hash_pte32_t pte)
> -{
> -unsigned pp, key;
> -
> -key = ppc_hash32_key(mmuidx_pr(mmu_idx), sr);
> -pp = pte.pte1 & HPTE32_R_PP;
> -
> -return ppc_hash32_pp_prot(key, pp, !!(sr & SR32_NX));
> -}
> -
>  static target_ulong hash32_bat_size(int mmu_idx,
>  target_ulong batu, target_ulong batl)
>  {
> @@ -341,10 +330,10 @@ bool ppc_hash32_xlate(PowerPCCPU *cpu, vaddr eaddr, 
> MMUAccessType access_type,
>  CPUState *cs = CPU(cpu);
>  CPUPPCState *env = &cpu->env;
>  target_ulong sr;
> -hwaddr pte_offset;
> +hwaddr pte_offset, raddr;
>  ppc_hash_pte32_t pte;
> +bool key;
>  int prot;
> -hwaddr raddr;
>  
>  /* There are no hash32 large pages. */
>  *psizep = TARGET_PAGE_BITS;
> @@ -426,8 +415,8 @@ bool ppc_hash32_xlate(PowerPCCPU *cpu, vaddr eaddr, 
> MMUAccessType access_type,
>  "found PTE at offset %08" HWADDR_PRIx "\n", pte_offset);
>  
>  /* 7. Check access permissions */
> -
> -prot = ppc_hash32_pte_prot(mmu_idx, sr, pte);
> +key = ppc_hash32_key(mmuidx_pr(mmu_idx), sr);
> +prot = ppc_hash32_pp_prot(key, pte.pte1 & HPTE32_R_PP, sr & SR32_NX);
>  
>  if (!check_prot_access_type(prot, access_type)) {
>  /* Access right violation */

Re: [PATCH] gitlab-ci: Replace Docker with Kaniko

2024-05-17 Thread Thomas Huth


On 16/05/2024 20.24, Daniel P. Berrangé wrote:
> On Thu, May 16, 2024 at 05:52:43PM +0100, Camilla Conte wrote:
> > Enables caching from the qemu-project repository.
> > 
> > Uses a dedicated "$NAME-cache" tag for caching, to address limitations.
> > See issue "when using --cache=true, kaniko fail to push cache layer [...]":
> > https://github.com/GoogleContainerTools/kaniko/issues/1459
...
> TL;DR: functionally this patch is capable of working. The key downside
> is that it doubles our storage usage. I'm not convinced Kaniko offers
> a compelling enough benefit to justify this penalty.


Will this patch fix the issues that we are currently seeing with the k8s 
runners not working in the upstream CI? If so, I think that would be enough 
benefit, wouldn't it?


 Thomas

Re: [PATCH v7 36/61] target/ppc/mmu_common.c: Remove local name for a constant

2024-05-17 Thread Nicholas Piggin

On Mon May 13, 2024 at 9:28 AM AEST, BALATON Zoltan wrote:
> The mmask local variable is a less descriptive local name for a
> constant. Drop it and use the constant directly in the two places it
> is needed.

Wow, lots more. I might take up to patch 34ish for first PR.

Thanks,
Nick

>
> Signed-off-by: BALATON Zoltan 
> ---
>  target/ppc/mmu_common.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/target/ppc/mmu_common.c b/target/ppc/mmu_common.c
> index 9e0bfbda67..5d0090014a 100644
> --- a/target/ppc/mmu_common.c
> +++ b/target/ppc/mmu_common.c
> @@ -98,7 +98,7 @@ static int ppc6xx_tlb_pte_check(mmu_ctx_t *ctx, 
> target_ulong pte0,
>  target_ulong pte1, int h,
>  MMUAccessType access_type)
>  {
> -target_ulong ptem, mmask;
> +target_ulong ptem;
>  int ret, pteh, ptev, pp;
>  
>  ret = -1;
> @@ -108,12 +108,11 @@ static int ppc6xx_tlb_pte_check(mmu_ctx_t *ctx, 
> target_ulong pte0,
>  if (ptev && h == pteh) {
>  /* Check vsid & api */
>  ptem = pte0 & PTE_PTEM_MASK;
> -mmask = PTE_CHECK_MASK;
>  pp = pte1 & 0x0003;
>  if (ptem == ctx->ptem) {
>  if (ctx->raddr != (hwaddr)-1ULL) {
>  /* all matches should have equal RPN, WIMG & PP */
> -if ((ctx->raddr & mmask) != (pte1 & mmask)) {
> +if ((ctx->raddr & PTE_CHECK_MASK) != (pte1 & 
> PTE_CHECK_MASK)) {
>  qemu_log_mask(CPU_LOG_MMU, "Bad RPN/WIMG/PP\n");
>  return -3;
>  }

Re: [PATCH] hw/intc/s390_flic: Fix crash that occurs when saving the machine state

2024-05-17 Thread Cédric Le Goater


On 5/17/24 08:15, Thomas Huth wrote:
> adapter_info_so_needed() treats its "opaque" parameter as a S390FLICState,
> but the function belongs to a VMStateDescription that is attached to a
> TYPE_VIRTIO_CCW_BUS device. This is currently causing a crash when the
> user tries to save or migrate the VM state. Fix it by using s390_get_flic()
> to get the correct device here instead.
> 
> Reported-by: Marc Hartmayer 
> Fixes: 9d1b0f5bf5 ("s390_flic: add migration-enabled property")
> Signed-off-by: Thomas Huth 

Reviewed-by: Cédric Le Goater 

Thanks,

C.

> ---
>  hw/intc/s390_flic.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/intc/s390_flic.c b/hw/intc/s390_flic.c
> index 7f93080087..6771645699 100644
> --- a/hw/intc/s390_flic.c
> +++ b/hw/intc/s390_flic.c
> @@ -459,7 +459,7 @@ type_init(qemu_s390_flic_register_types)
>  
>  static bool adapter_info_so_needed(void *opaque)
>  {
> -S390FLICState *fs = S390_FLIC_COMMON(opaque);
> +S390FLICState *fs = s390_get_flic();
>  
>  return fs->migration_enabled;
>  }

[PATCH] hw/intc/s390_flic: Fix crash that occurs when saving the machine state

2024-05-17 Thread Thomas Huth

adapter_info_so_needed() treats its "opaque" parameter as a S390FLICState,
but the function belongs to a VMStateDescription that is attached to a
TYPE_VIRTIO_CCW_BUS device. This is currently causing a crash when the
user tries to save or migrate the VM state. Fix it by using s390_get_flic()
to get the correct device here instead.

Reported-by: Marc Hartmayer 
Fixes: 9d1b0f5bf5 ("s390_flic: add migration-enabled property")
Signed-off-by: Thomas Huth 
---
 hw/intc/s390_flic.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/intc/s390_flic.c b/hw/intc/s390_flic.c
index 7f93080087..6771645699 100644
--- a/hw/intc/s390_flic.c
+++ b/hw/intc/s390_flic.c
@@ -459,7 +459,7 @@ type_init(qemu_s390_flic_register_types)
 
 static bool adapter_info_so_needed(void *opaque)
 {
-S390FLICState *fs = S390_FLIC_COMMON(opaque);
+S390FLICState *fs = s390_get_flic();
 
 return fs->migration_enabled;
 }
-- 
2.45.0

[PATCH 2/2] qapi: List block-core.json in qapi-schema.json

2024-05-17 Thread Zhao Liu

Currently, block-core.json is not explicitly listed in the
qapi-schema.json.

To make the dependencies clearer, list block-core.json in section 2.

Signed-off-by: Zhao Liu 
---
 qapi/qapi-schema.json | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/qapi/qapi-schema.json b/qapi/qapi-schema.json
index 57ea6bcb33e9..14196128c44e 100644
--- a/qapi/qapi-schema.json
+++ b/qapi/qapi-schema.json
@@ -77,6 +77,8 @@
 #
 # All their dependencies are listed in the section 1.
 
+# include common.json, crypto.json, job.json, sockets.json
+{ 'include': 'block-core.json' }
 # include sockets.json
 { 'include': 'char.json' }
 # include machine-common.json
@@ -96,19 +98,17 @@
 
 # Section 3. Files with 2-level dependencies.
 #
-# Their dependencies are either listed in the previous sections, or are
-# not listed but include files from the previous section. At least one
-# dependency is a 1-level dependency file.
+# All their dependencies are listed in the previous sections. At least one
+# dependency is from section 2.
 
-# include sockets.json (section 1), block-core.json (not listed, 1-level
-# dependencies)
+# include sockets.json (section 1), block-core.json (section 2)
 { 'include': 'block-export.json' }
-# include block-core.json (not listed, 1-level dependencies)
+# include block-core.json (section 2)
 { 'include': 'block.json' }
 # include authz.json (section 1), common.json (section 1), crypto.json
-# (section 1), block-core.json (not listed, 1-level dependencies)
+# (section 1), block-core.json (section 2)
 { 'include': 'qom.json' }
-# include block-core.json (not listed, 1-level dependencies)
+# include block-core.json (section 2)
 { 'include': 'transaction.json' }
 
 # Section 4. Files with 3-level dependencies.
-- 
2.34.1

[PATCH 1/2] qapi: Reorder and categorize json files in qapi-schema.json

2024-05-17 Thread Zhao Liu

Currently, the C code is generated sequentially in the order of the QAPI
json files in qapi-schema.json. This requires that the included file
must be listed first, before the file that includes it.

The current order of the files implicitly fulfills this requirement, but
the unclear dependency relationships make later dependency handling and
the addition of new files harder.

While dependencies can be better handled by adding a sorting algorithm
to scripts/qapi/gen.py, to simplify and visualize the current API JSON
dependencies, sort them manually and categorize by dependency hierarchy.

Based on this, the new files should be placed in the corresponding
sections according to the dependencies/dependency hierarchy.
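
For reference, a minimal sketch of how the sections could be derived
automatically. It assumes the include relationships are available as a
plain mapping; the sample entries are taken from the comments in this
series, and dependency_levels() and schema_order() are hypothetical
helpers, not part of scripts/qapi.

```python
INCLUDES = {
    "common.json": [],
    "sockets.json": [],
    "crypto.json": [],
    "job.json": [],
    "authz.json": [],
    "char.json": ["sockets.json"],
    "block-core.json": ["common.json", "crypto.json", "job.json",
                        "sockets.json"],
    "block-export.json": ["sockets.json", "block-core.json"],
    "qom.json": ["authz.json", "common.json", "crypto.json",
                 "block-core.json"],
    "qdev.json": ["qom.json"],
}


def dependency_levels(includes):
    """Map each file to its dependency depth (0 means no includes)."""
    levels = {}

    def level(name, stack=()):
        if name in stack:
            raise ValueError("include cycle at " + name)
        if name not in levels:
            deps = includes.get(name, [])
            depths = (level(dep, stack + (name,)) for dep in deps)
            levels[name] = 1 + max(depths, default=-1)
        return levels[name]

    for name in includes:
        level(name)
    return levels


def schema_order(includes):
    """List files so that every include comes before its users."""
    levels = dependency_levels(includes)
    return sorted(levels, key=lambda name: (levels[name], name))


levels = dependency_levels(INCLUDES)
for name in schema_order(INCLUDES):
    print("section", levels[name] + 1, name)
```

Sorting by depth and then by name keeps the output stable, which matches
the alphabetical order used within each section here.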

Signed-off-by: Zhao Liu 
---
 qapi/qapi-schema.json | 100 +-
 1 file changed, 69 insertions(+), 31 deletions(-)

diff --git a/qapi/qapi-schema.json b/qapi/qapi-schema.json
index 5e33da7228f2..57ea6bcb33e9 100644
--- a/qapi/qapi-schema.json
+++ b/qapi/qapi-schema.json
@@ -38,45 +38,83 @@
 
 # Documentation generated with qapi-gen.py is in source order, with
 # included sub-schemas inserted at the first include directive
-# (subsequent include directives have no effect).  To get a sane and
-# stable order, it's best to include each sub-schema just once, or
-# include it first right here.
+# (subsequent include directives have no effect). Please place the
+# file correctly in the following sections according to the
+# dependencies.
+#
+# To get a sane and stable order, it's best to include each sub-schema
+# just once, or include it first right here.
 
-{ 'include': 'error.json' }
+# Section 1. Files without dependencies.
+
+{ 'include': 'acpi.json' }
+{ 'include': 'audio.json' }
+{ 'include': 'authz.json' }
 { 'include': 'common.json' }
-{ 'include': 'sockets.json' }
-{ 'include': 'run-state.json' }
+{ 'include': 'compat.json' }
+{ 'include': 'control.json' }
 { 'include': 'crypto.json' }
-{ 'include': 'job.json' }
-{ 'include': 'block.json' }
-{ 'include': 'block-export.json' }
-{ 'include': 'char.json' }
+{ 'include': 'cryptodev.json' }
+{ 'include': 'cxl.json' }
 { 'include': 'dump.json' }
-{ 'include': 'net.json' }
 { 'include': 'ebpf.json' }
-{ 'include': 'rocker.json' }
-{ 'include': 'tpm.json' }
-{ 'include': 'ui.json' }
-{ 'include': 'authz.json' }
-{ 'include': 'migration.json' }
-{ 'include': 'transaction.json' }
-{ 'include': 'trace.json' }
-{ 'include': 'compat.json' }
-{ 'include': 'control.json' }
+{ 'include': 'error.json' }
 { 'include': 'introspect.json' }
-{ 'include': 'qom.json' }
-{ 'include': 'qdev.json' }
+{ 'include': 'job.json' }
 { 'include': 'machine-common.json' }
-{ 'include': 'machine.json' }
-{ 'include': 'machine-target.json' }
-{ 'include': 'replay.json' }
-{ 'include': 'yank.json' }
-{ 'include': 'misc.json' }
 { 'include': 'misc-target.json' }
-{ 'include': 'audio.json' }
-{ 'include': 'acpi.json' }
 { 'include': 'pci.json' }
+{ 'include': 'rocker.json' }
+{ 'include': 'run-state.json' }
+{ 'include': 'sockets.json' }
 { 'include': 'stats.json' }
+{ 'include': 'tpm.json' }
+{ 'include': 'trace.json' }
 { 'include': 'virtio.json' }
-{ 'include': 'cryptodev.json' }
-{ 'include': 'cxl.json' }
+{ 'include': 'yank.json' }
+
+# Section 2. Files with 1-level dependencies.
+#
+# All their dependencies are listed in the section 1.
+
+# include sockets.json
+{ 'include': 'char.json' }
+# include machine-common.json
+{ 'include': 'machine-target.json' }
+# include common.json, machine-common.json
+{ 'include': 'machine.json' }
+# include common.json, sockets.json
+{ 'include': 'migration.json' }
+# include common.json
+{ 'include': 'misc.json' }
+# include sockets.json
+{ 'include': 'net.json' }
+# include common.json
+{ 'include': 'replay.json' }
+# include common.json, sockets.json
+{ 'include': 'ui.json' }
+
+# Section 3. Files with 2-level dependencies.
+#
+# Their dependencies are either listed in the previous sections, or are
+# not listed but include files from the previous section. At least one
+# dependency is a 1-level dependency file.
+
+# include sockets.json (section 1), block-core.json (not listed, 1-level
+# dependencies)
+{ 'include': 'block-export.json' }
+# include block-core.json (not listed, 1-level dependencies)
+{ 'include': 'block.json' }
+# include authz.json (section 1), common.json (section 1), crypto.json
+# (section 1), block-core.json (not listed, 1-level dependencies)
+{ 'include': 'qom.json' }
+# include block-core.json (not listed, 1-level dependencies)
+{ 'include': 'transaction.json' }
+
+# Section 4. Files with 3-level dependencies.
+#
+# All their dependencies are listed in the previous sections. At least one
+# dependency is from section 3.
+
+# include qom.json (section 3)
+{ 'include': 'qdev.json' }
-- 
2.34.1

[PATCH 0/2] qapi/qapi-schema: Clarify the dependency relationship

2024-05-17 Thread Zhao Liu

Hi,

At present, the correctness of the dependencies between JSON files is
ensured only by the order in which they are listed. With many files mixed
together and no clear guideline for ordering them, the list is hard to
extend and maintain.

Therefore, I propose to manually categorize and sort the JSON files by
their dependencies/dependency hierarchy, to improve the readability and
maintainability of qapi-schema.json.

Your feedback is welcome!

Thanks and Best Regards,
Zhao
---
Zhao Liu (2):
  qapi: Reorder and categorize json files in qapi-schema.json
  qapi: List block-core.json in qapi-schema.json

 qapi/qapi-schema.json | 100 +-
 1 file changed, 69 insertions(+), 31 deletions(-)

-- 
2.34.1

Re: [PATCH v7 35/61] target/ppc: Remove pp_check() and reuse ppc_hash32_pp_prot()

2024-05-17 Thread Nicholas Piggin

On Mon May 13, 2024 at 9:28 AM AEST, BALATON Zoltan wrote:
> The ppc_hash32_pp_prot() function in mmu-hash32.c is the same as
> pp_check() in mmu_common.c, merge these to remove duplicated code.
> Define the common function as static inline otherwise exporting the
> function from mmu-hash32.c would stop the compiler inlining it which
> results in slightly lower performance.
>

It's already hard to review patches that move code around, so it's better
to keep the code unchanged across the move unless really necessary.

For mmu_common.c hunks,

Reviewed-by: Nicholas Piggin 

Thanks,
Nick

> Signed-off-by: BALATON Zoltan 
> ---
>  target/ppc/mmu-hash32.c | 45 -
>  target/ppc/mmu-hash32.h | 36 +
>  target/ppc/mmu_common.c | 44 ++--
>  3 files changed, 38 insertions(+), 87 deletions(-)
>
> diff --git a/target/ppc/mmu-hash32.c b/target/ppc/mmu-hash32.c
> index 1e8f1df0f0..d5f2057eb1 100644
> --- a/target/ppc/mmu-hash32.c
> +++ b/target/ppc/mmu-hash32.c
> @@ -37,51 +37,6 @@
>  #  define LOG_BATS(...) do { } while (0)
>  #endif
>  
> -static int ppc_hash32_pp_prot(int key, int pp, int nx)
> -{
> -int prot;
> -
> -if (key == 0) {
> -switch (pp) {
> -case 0x0:
> -case 0x1:
> -case 0x2:
> -prot = PAGE_READ | PAGE_WRITE;
> -break;
> -
> -case 0x3:
> -prot = PAGE_READ;
> -break;
> -
> -default:
> -abort();
> -}
> -} else {
> -switch (pp) {
> -case 0x0:
> -prot = 0;
> -break;
> -
> -case 0x1:
> -case 0x3:
> -prot = PAGE_READ;
> -break;
> -
> -case 0x2:
> -prot = PAGE_READ | PAGE_WRITE;
> -break;
> -
> -default:
> -abort();
> -}
> -}
> -if (nx == 0) {
> -prot |= PAGE_EXEC;
> -}
> -
> -return prot;
> -}
> -
>  static int ppc_hash32_pte_prot(int mmu_idx,
> target_ulong sr, ppc_hash_pte32_t pte)
>  {
> diff --git a/target/ppc/mmu-hash32.h b/target/ppc/mmu-hash32.h
> index 7119a63d97..bf99161858 100644
> --- a/target/ppc/mmu-hash32.h
> +++ b/target/ppc/mmu-hash32.h
> @@ -102,6 +102,42 @@ static inline void ppc_hash32_store_hpte1(PowerPCCPU 
> *cpu,
>  stl_phys(CPU(cpu)->as, base + pte_offset + HASH_PTE_SIZE_32 / 2, pte1);
>  }
>  
> +static inline int ppc_hash32_pp_prot(bool key, int pp, bool nx)
> +{
> +int prot;
> +
> +if (key) {
> +switch (pp) {
> +case 0x0:
> +prot = 0;
> +break;
> +case 0x1:
> +case 0x3:
> +prot = PAGE_READ;
> +break;
> +case 0x2:
> +prot = PAGE_READ | PAGE_WRITE;
> +break;
> +default:
> +g_assert_not_reached();
> +}
> +} else {
> +switch (pp) {
> +case 0x0:
> +case 0x1:
> +case 0x2:
> +prot = PAGE_READ | PAGE_WRITE;
> +break;
> +case 0x3:
> +prot = PAGE_READ;
> +break;
> +default:
> +g_assert_not_reached();
> +}
> +}
> +return nx ? prot : prot | PAGE_EXEC;
> +}
> +
>  typedef struct {
>  uint32_t pte0, pte1;
>  } ppc_hash_pte32_t;
> diff --git a/target/ppc/mmu_common.c b/target/ppc/mmu_common.c
> index e1462a25dd..9e0bfbda67 100644
> --- a/target/ppc/mmu_common.c
> +++ b/target/ppc/mmu_common.c
> @@ -77,44 +77,6 @@ void ppc_store_sdr1(CPUPPCState *env, target_ulong value)
>  
> /*/
>  /* PowerPC MMU emulation */
>  
> -static int pp_check(int key, int pp, int nx)
> -{
> -int access;
> -
> -/* Compute access rights */
> -access = 0;
> -if (key == 0) {
> -switch (pp) {
> -case 0x0:
> -case 0x1:
> -case 0x2:
> -access |= PAGE_WRITE;
> -/* fall through */
> -case 0x3:
> -access |= PAGE_READ;
> -break;
> -}
> -} else {
> -switch (pp) {
> -case 0x0:
> -access = 0;
> -break;
> -case 0x1:
> -case 0x3:
> -access = PAGE_READ;
> -break;
> -case 0x2:
> -access = PAGE_READ | PAGE_WRITE;
> -break;
> -}
> -}
> -if (nx == 0) {
> -access |= PAGE_EXEC;
> -}
> -
> -return access;
> -}
> -
>  int ppc6xx_tlb_getnum(CPUPPCState *env, target_ulong eaddr,
>  int way, int is_code)
>  {
> @@ -137,7 +99,7 @@ static int ppc6xx_tlb_pte_check(mmu_ctx_t *ctx, target_ulong pte0,
>  MMUAccessType access_type)
>  {
>  target_ulong ptem, mmask;
> -int access, ret, pteh, ptev, pp;
> +int ret, pteh, ptev, pp;
>  
>  ret = -1;

Re: [PATCH v7 33/61] target/ppc: Add a function to check for page protection bit

2024-05-17 Thread Nicholas Piggin

On Mon May 13, 2024 at 9:28 AM AEST, BALATON Zoltan wrote:
> Checking if a page protection bit is set for a given access type is a
> common operation. Add a function to avoid repeating the same check in
> multiple places. As this relies on the access type and page protection
> bit values having a certain relation, also add an assert to ensure that
> this assumption holds.
>
> Signed-off-by: BALATON Zoltan 
> ---
>  target/ppc/cpu_init.c|  5 +
>  target/ppc/internal.h| 23 +--
>  target/ppc/mmu-hash32.c  |  6 +++---
>  target/ppc/mmu-hash64.c  |  2 +-
>  target/ppc/mmu-radix64.c |  2 +-
>  target/ppc/mmu_common.c  | 26 +-
>  6 files changed, 28 insertions(+), 36 deletions(-)
>
> diff --git a/target/ppc/cpu_init.c b/target/ppc/cpu_init.c
> index 92c71b2a09..d3b92d9f0e 100644
> --- a/target/ppc/cpu_init.c
> +++ b/target/ppc/cpu_init.c
> @@ -7388,6 +7388,11 @@ static void ppc_cpu_class_init(ObjectClass *oc, void *data)
>  #ifndef CONFIG_USER_ONLY
>  cc->sysemu_ops = &ppc_sysemu_ops;
>  INTERRUPT_STATS_PROVIDER_CLASS(oc)->get_statistics = ppc_get_irq_stats;
> +
> +/* check_prot_access_type relies on MMU access and PAGE bits relations */
> +qemu_build_assert(MMU_DATA_LOAD == 0 && MMU_DATA_STORE == 1 &&
> +  MMU_INST_FETCH == 2 && PAGE_READ == 1 &&
> +  PAGE_WRITE == 2 && PAGE_EXEC == 4);
>  #endif
>  
>  cc->gdb_num_core_regs = 71;
> diff --git a/target/ppc/internal.h b/target/ppc/internal.h
> index 4a90dd2584..20fb2ec593 100644
> --- a/target/ppc/internal.h
> +++ b/target/ppc/internal.h
> @@ -234,27 +234,14 @@ void destroy_ppc_opcodes(PowerPCCPU *cpu);
>  void ppc_gdb_init(CPUState *cs, PowerPCCPUClass *ppc);
>  const gchar *ppc_gdb_arch_name(CPUState *cs);
>  
> -/**
> - * prot_for_access_type:
> - * @access_type: Access type
> - *
> - * Return the protection bit required for the given access type.
> - */
> -static inline int prot_for_access_type(MMUAccessType access_type)
> +#ifndef CONFIG_USER_ONLY
> +
> +/* Check if permission bit required for the access_type is set in prot */
> +static inline int check_prot_access_type(int prot, MMUAccessType access_type)
>  {
> -switch (access_type) {
> -case MMU_INST_FETCH:
> -return PAGE_EXEC;
> -case MMU_DATA_LOAD:
> -return PAGE_READ;
> -case MMU_DATA_STORE:
> -return PAGE_WRITE;
> -}
> -g_assert_not_reached();
> +return prot & (1 << access_type);

I checked and sadly gcc is not able to figure this out on its own yet,
so we'll go with it. Nice improvement.
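
To spell out why the shift works: with the values pinned by the build
assert, 1 << access_type is exactly the PAGE_* bit that the old switch
returned. The sketch below is only an illustration. The enum values are
stand-ins copied from the assert, not QEMU's headers, and forms_agree()
is a hypothetical helper.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in definitions (assumption): values copied from the build assert. */
enum { PAGE_READ = 1, PAGE_WRITE = 2, PAGE_EXEC = 4 };
typedef enum {
    MMU_DATA_LOAD = 0,
    MMU_DATA_STORE = 1,
    MMU_INST_FETCH = 2,
} MMUAccessType;

/* C11 counterpart of the qemu_build_assert() in the patch. */
_Static_assert(MMU_DATA_LOAD == 0 && MMU_DATA_STORE == 1 &&
               MMU_INST_FETCH == 2 && PAGE_READ == 1 &&
               PAGE_WRITE == 2 && PAGE_EXEC == 4,
               "access types must map onto PAGE bits by shifting");

/* Old form: look up the required bit with a switch. */
static inline int prot_for_access_type(MMUAccessType access_type)
{
    switch (access_type) {
    case MMU_INST_FETCH:
        return PAGE_EXEC;
    case MMU_DATA_LOAD:
        return PAGE_READ;
    case MMU_DATA_STORE:
        return PAGE_WRITE;
    }
    return 0; /* not reached for valid access types */
}

/* New form: the access type is the bit position of the required bit. */
static inline int check_prot_access_type(int prot, MMUAccessType access_type)
{
    return prot & (1 << access_type);
}

/* Hypothetical helper: compare both forms for every prot/access pair. */
static bool forms_agree(void)
{
    for (int prot = 0; prot <= (PAGE_READ | PAGE_WRITE | PAGE_EXEC); prot++) {
        for (int t = MMU_DATA_LOAD; t <= MMU_INST_FETCH; t++) {
            int old_way = prot & prot_for_access_type((MMUAccessType)t);
            int new_way = check_prot_access_type(prot, (MMUAccessType)t);
            if (old_way != new_way) {
                return false;
            }
        }
    }
    return true;
}
```

Note that the result is the masked bit, not a 0/1 boolean, which is why
passing all three permission bits as prot gives back the required bit
itself.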

Reviewed-by: Nicholas Piggin 

>  }
>  
> -#ifndef CONFIG_USER_ONLY
> -
>  /* PowerPC MMU emulation */
>  
>  bool ppc_xlate(PowerPCCPU *cpu, vaddr eaddr, MMUAccessType access_type,
> diff --git a/target/ppc/mmu-hash32.c b/target/ppc/mmu-hash32.c
> index 3abaf16e78..1e8f1df0f0 100644
> --- a/target/ppc/mmu-hash32.c
> +++ b/target/ppc/mmu-hash32.c
> @@ -252,7 +252,7 @@ static bool ppc_hash32_direct_store(PowerPCCPU *cpu, target_ulong sr,
>  }
>  
>  *prot = key ? PAGE_READ | PAGE_WRITE : PAGE_READ;
> -if (*prot & prot_for_access_type(access_type)) {
> +if (check_prot_access_type(*prot, access_type)) {
>  *raddr = eaddr;
>  return true;
>  }
> @@ -403,7 +403,7 @@ bool ppc_hash32_xlate(PowerPCCPU *cpu, vaddr eaddr, MMUAccessType access_type,
>  if (env->nb_BATs != 0) {
>  raddr = ppc_hash32_bat_lookup(cpu, eaddr, access_type, protp, 
> mmu_idx);
>  if (raddr != -1) {
> -if (prot_for_access_type(access_type) & ~*protp) {
> +if (!check_prot_access_type(*protp, access_type)) {
>  if (guest_visible) {
>  if (access_type == MMU_INST_FETCH) {
>  cs->exception_index = POWERPC_EXCP_ISI;
> @@ -471,7 +471,7 @@ bool ppc_hash32_xlate(PowerPCCPU *cpu, vaddr eaddr, MMUAccessType access_type,
>  
>  prot = ppc_hash32_pte_prot(mmu_idx, sr, pte);
>  
> -if (prot_for_access_type(access_type) & ~prot) {
> +if (!check_prot_access_type(prot, access_type)) {
>  /* Access right violation */
>  qemu_log_mask(CPU_LOG_MMU, "PTE access rejected\n");
>  if (guest_visible) {
> diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
> index 0966422a55..d9626f6aab 100644
> --- a/target/ppc/mmu-hash64.c
> +++ b/target/ppc/mmu-hash64.c
> @@ -1097,7 +1097,7 @@ bool ppc_hash64_xlate(PowerPCCPU *cpu, vaddr eaddr, MMUAccessType access_type,
>  amr_prot = ppc_hash64_amr_prot(cpu, pte);
>  prot = exec_prot & pp_prot & amr_prot;
>  
> -need_prot = prot_for_access_type(access_type);
> +need_prot = check_prot_access_type(PAGE_RWX, access_type);
>  if (need_prot & ~prot) {
>  /* Access right violation */
>  qemu_log_mask(CPU_LOG_MMU, "PTE access rejected\n");
> diff --git a/target/ppc/mmu-radix64.c b/target/ppc/mmu-radix64.c
> index 395ce3b782..2c5ade5cea 100644
> ---
