Re: [PATCH v2 3/8] machine: Print supported CPU models instead of typenames

2023-07-26 Thread Gavin Shan



On 7/27/23 09:08, Richard Henderson wrote:

On 7/25/23 17:32, Gavin Shan wrote:

-static const char *q800_machine_valid_cpu_types[] = {
+static const char * const q800_machine_valid_cpu_types[] = {
  M68K_CPU_TYPE_NAME("m68040"),
  NULL
  };
+static const char * const q800_machine_valid_cpu_models[] = {
+    "m68040",
+    NULL
+};


I really don't like this replication.



Right, it's going to be a lot of replication, but it gives much flexibility.
There are 21 targets and we don't have a fixed pattern for the mapping between
CPU model name and CPU typename. I'm summarizing the patterns in use below.

  1 All CPU model names are mapped to a fixed CPU typename;
  2 CPU model name is the same as the CPU typename;
  3 CPU model name is an alias of the CPU typename;
  4 CPU model name is a prefix of the CPU typename;

  Target      Categories   suffix-of-CPU-typename
  ----------  ----------   ----------------------
  alpha       -234         -alpha-cpu
  arm         ---4         -arm-cpu
  avr         -2--
  cris        --34         -cris-cpu
  hexagon     ---4         -hexagon-cpu
  hppa        1---
  i386        ---4         -i386-cpu
  loongarch   -2-4         -loongarch-cpu
  m68k        ---4         -m68k-cpu
  microblaze  1---
  mips        ---4         -mips64-cpu  -mips-cpu
  nios2       1---
  openrisc    ---4         -or1k-cpu
  ppc         --34         -powerpc64-cpu  -powerpc-cpu
  riscv       ---4         -riscv-cpu
  rx          -2-4         -rx-cpu
  s390x       ---4         -s390x-cpu
  sh4         --34         -superh-cpu
  sparc       -2--
  tricore     ---4         -tricore-cpu
  xtensa      ---4         -xtensa-cpu

There are several options, listed below. Please let me know which one is best,
or suggest something else.

(a) Keep what we have and use mc->valid_{cpu_types, cpu_models}[] to track
the valid CPU typenames and CPU model names.

(b) Introduce CPUClass::model_name_by_typename(). Every target has its own
implementation to convert a CPU typename to a CPU model name. The CPU model
name is parsed from mc->valid_cpu_types[i] (a rough sketch follows below).

char *CPUClass::model_name_by_typename(const char *typename);

(c) As we discussed before, use mc->valid_cpu_type_suffix and mc->valid_cpu_models,
because the CPU type check is currently needed by the arm/m68k/riscv targets,
where we do have a fixed pattern to convert CPU model names to CPU typenames:
the CPU typename is comprised of the CPU model name and a suffix. However, it
won't work when the CPU type check is required by another target whose pattern
is different.
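
A rough sketch of option (b), for illustration only -- a glib-based example
with an assumed "-m68k-cpu" suffix, not an actual API proposal:

    #include <glib.h>
    #include <string.h>

    /* Hypothetical per-target hook: strip the target's QOM typename suffix. */
    static char *m68k_model_name_by_typename(const char *typename)
    {
        const char *suffix = "-m68k-cpu";   /* assumed m68k suffix */

        if (g_str_has_suffix(typename, suffix)) {
            /* "m68040-m68k-cpu" -> "m68040" */
            return g_strndup(typename, strlen(typename) - strlen(suffix));
        }
        return g_strdup(typename);
    }

With such a hook, the machine code could derive the printable model names
from mc->valid_cpu_types[] instead of duplicating them in a second array.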

Thanks,
Gavin




Re: [External] Re: [PATCH] memory: avoid updating ioeventfds for some address_space

2023-07-26 Thread hongmianquan




On 2023/7/27 1:45 AM, Peter Xu wrote:

On Tue, Jul 25, 2023 at 07:20:37PM +0800, hongmianquan wrote:

When updating ioeventfds, we need to iterate over all address spaces,
but some address spaces do not register eventfd_add|del callbacks in
memory_listener_register() and do nothing when updating ioeventfds,
so we can skip them in address_space_update_ioeventfds().

The overhead of memory_region_transaction_commit() can be significantly
reduced. For example, a VM with 8 vhost net devices, each with 64 vectors,
spends 20% less time in memory_region_transaction_commit().
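
For readers following the idea, a rough sketch of such a check might look
like the following; the helper name is made up and the listener fields are
assumptions based on memory_listener_register(), not the actual patch:

    #include "qemu/osdep.h"
    #include "exec/memory.h"

    /* Sketch: does any listener of this AS care about ioeventfd updates? */
    static bool address_space_has_ioeventfd_listeners(AddressSpace *as)
    {
        MemoryListener *listener;

        QTAILQ_FOREACH(listener, &as->listeners, link_as) {
            if (listener->eventfd_add || listener->eventfd_del) {
                return true;
            }
        }
        return false;
    }

address_space_update_ioeventfds() could then return early when this is
false, which is where the saving in memory_region_transaction_commit()
would come from.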

Signed-off-by: hongmianquan 


Reviewed-by: Peter Xu 

Should be for 8.2, though.  Please always copy Paolo for memory related
patches.  I hope Paolo can see this.

Thanks, I hope so. Also, I'm not quite sure what 'Should be for 8.2' 
means. Does it imply that there will be changes to this logic after 
version 8.2?




Re: [PATCH 0/8] Adds CPU hot-plug support to Loongarch

2023-07-26 Thread lixianglai

Hi Gavin and Salil,

On 7/27/23 8:57 AM, Gavin Shan wrote:

Hi Xianglai,

On 7/20/23 17:15, xianglai li wrote:

Hello everyone, we referred to the implementation of ARM CPU
Hot-Plug to add GED-based CPU Hot-Plug support to Loongarch.

The first 4 patches are changes to the QEMU common code,
including adding GED support for CPU Hot-Plug, updating
the ACPI table creation process, and adding the
qdev_disconnect_gpio_out_named and cpu_address_space_destroy
interfaces to release resources when a CPU is unplugged.

The last four patches are Loongarch architecture-related.
The modifications include the definition of the hook
functions related to CPU Hot-(Un)Plug, the allocation
and release of CPU resources on Hot-(Un)Plug, updates
to the ACPI table creation process, and finally the
custom switch for CPU Hot-Plug.



[...]

It seems the ARM64 hotplug design has been heavily referenced, but its
authors were not copied, if you were referring to the following repository.
There will be conflicts between those two patchsets as far as I can see,
and I'm not sure how they can be resolved. In theory, one patchset needs
to be rebased on the other, since they share a large amount of code.

  https://github.com/salil-mehta/qemu.git
  (branch: virt-cpuhp-armv8/rfc-v1-port11052023.dev-1)

I took a quick scan of this series. Loongarch doesn't have ARM64's
constraint due to vGIC3, where all vCPUs' file descriptors need to be in
place. It means the possible but not yet present vCPUs need to be visible
to KVM. Without this constraint, the implementation is simplified a lot.


We have great respect and gratitude for Salil and his team for their work
and contributions to CPU hotplug, which greatly reduced the difficulty of
developing CPU hotplug for Loongarch.

So we plan to rebase the next version of the patchset on their code.


Hi Salil,

I estimate that it will take quite some time for your patch series to be
merged. If allowed, could you merge the common-code changes of your series
into the community first, so that our code can be rebased and merged?




Besides, we found that vCPU hotplug isn't working for TCG when Salil's
series was tested. I'm not sure if we have the same issue with this series,
or whether TCG isn't a concern to us at all?


At present, QEMU only supports TCG for Loongarch, and our CPU hotplug
testing is also carried out under QEMU TCG; CPU hotplug works normally in
those tests.

Of course, we are also very happy to see you test CPU hotplug on Loongarch,
which can find more problems and help us improve our patches.



Thanks,
Gavin


Thanks,

xianglai li




Re: [PATCH] migrate/multifd: fix coredump when the multifd thread cleanup

2023-07-26 Thread chenyuhui (A)



On 2023/7/26 0:53, Peter Xu wrote:
> On Tue, Jul 25, 2023 at 04:43:28PM +0800, chenyuhui (A) wrote:
>> @Peter Xu @Fabiano Rosas
>> Kindly ping on this.
> 
> Ah I see what's missing - please copy maintainer (Juan) for any migration
> patches, especially multifd ones..  I'm doing that for this one, but I'd
> suggest you repost with a whole patch and information put into commit msg.
> 
> Thanks.
> 
Hi Juan,

This is a patch for migration, please take a look.

From: Yuhui Chen 
Date: Tue, 25 Jul 2023 10:45:48 +0800
Subject: [PATCH] migrate/multifd: fix coredump when the multifd thread cleanup

There is a coredump while trying to destroy a mutex when
p->running is false but p->mutex is not unlocked.
Make sure all mutexes have been released before destroying them.

Signed-off-by: Yuhui Chen 
---
 migration/multifd.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/migration/multifd.c b/migration/multifd.c
index 3387d8277f..3a085bf3ec 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -521,9 +521,7 @@ void multifd_save_cleanup(void)
     for (i = 0; i < migrate_multifd_channels(); i++) {
         MultiFDSendParams *p = &multifd_send_state->params[i];
 
-        if (p->running) {
-            qemu_thread_join(&p->thread);
-        }
+        qemu_thread_join(&p->thread);
     }
     for (i = 0; i < migrate_multifd_channels(); i++) {
         MultiFDSendParams *p = &multifd_send_state->params[i];
-- 
2.21.0.windows.1

>>
>> On 2023/6/27 9:11, chenyuhui (A) wrote:
>>>
>>> On 2023/6/26 21:16, chenyuhui (A) wrote:

 On 2023/6/21 22:22, Fabiano Rosas wrote:
> Jianguo Zhang via  writes:
>
>> From: Yuhui Chen 
>>
>> There is a coredump while trying to destroy mutex when
>> p->running is false but p->mutex is not unlock.
>> Make sure all mutexes has been released before destroy them.
>>
>> Signed-off-by: Yuhui Chen 
>> ---
>>  migration/multifd.c | 6 ++
>>  1 file changed, 2 insertions(+), 4 deletions(-)
>>
>> diff --git a/migration/multifd.c b/migration/multifd.c
>> index b7ad7002e0..7dcdb2d3a0 100644
>> --- a/migration/multifd.c
>> +++ b/migration/multifd.c
>> @@ -523,9 +523,7 @@ void multifd_save_cleanup(void)
>>  for (i = 0; i < migrate_multifd_channels(); i++) {
>>  MultiFDSendParams *p = &multifd_send_state->params[i];
>>  
>> -if (p->running) {
>
> The need for this flag is dubious IMO. Commit 10351fbad1
> ("migration/multifd: Join all multifd threads in order to avoid leaks")
> already moved the other join outside of it. If we figure out another way
> to deal with the sem_sync lockup we could probably remove this
> altogether.


 I've seen commit 10351fbad1, and it seems to have the same
 problem in multifd_save_cleanup().

 So maybe my patch only needs to modify multifd_save_cleanup().

 __


 On 2023/6/21 21:24, Peter Xu wrote:
> On Wed, Jun 21, 2023 at 04:18:26PM +0800, Jianguo Zhang via wrote:
>> From: Yuhui Chen
>>
>> There is a coredump while trying to destroy mutex when
>> p->running is false but p->mutex is not unlock.
>> Make sure all mutexes has been released before destroy them.
>
> It'll be nice to add a backtrace of the coredump here, and also copy
> maintainer (Juan Quintela, copied now).
>

 The following is the coredump, and my code is based on
 https://github.com/qemu/qemu.git tag v6.2.0.

>>> (gdb) bt
>>> #0  0xabe3b2b8 in  () at /usr/lib64/libc.so.6
>>> #1  0xabdf6d7c in raise () at /usr/lib64/libc.so.6
>>> #2  0xabde4d2c in abort () at /usr/lib64/libc.so.6
>>> #3  0xc67fcc10 in error_exit (err=, 
>>> msg=msg@entry=0xc6dc52b8 <__func__.33> "qemu_mutex_destroy") at 
>>> ../util/qemu-thread-posix.c:38
>>> #4  0xc67fce38 in qemu_mutex_destroy 
>>> (mutex=mutex@entry=0xfa1a4250) at ../util/qemu-thread-posix.c:71
>>> #5  0xc6055688 in multifd_save_cleanup () at 
>>> ../migration/multifd.c:555
>>> #6  0xc6050198 in migrate_fd_cleanup (s=s@entry=0xf7518800) at 
>>> ../migration/migration.c:1808
>>> #7  0xc6050384 in migrate_fd_cleanup_bh (opaque=0xf7518800) at 
>>> ../migration/migration.c:1850
>>> #8  0xc680d790 in aio_bh_call (bh=0xa0004c40) at 
>>> ../util/async.c:141
>>> #9  aio_bh_poll (ctx=ctx@entry=0xf73285a0) at ../util/async.c:169
>>> #10 0xc67f9e18 in aio_dispatch (ctx=0xf73285a0) at 
>>> ../util/aio-posix.c:381
>>> #11 0xc680d414 in aio_ctx_dispatch (source=, 
>>> callback=, user_data=) at ../util/async.c:311
>>> #12 0xac44cf88 in g_main_context_dispatch () at 
>>> /usr/lib64/libglib-2.0.so.0
>>> #13 0xc6819214 in glib_pollfds_poll () at ../util/main-loop.c:232
>>> #14 os_host_main_loop_wait (timeout=73500) at ../util/main-loop.c:255
>>> #15 

Re: [PATCH v7 0/4] hw/ufs: Add Universal Flash Storage (UFS) support

2023-07-26 Thread Jeuk Kim

On 7/27/2023 4:36 AM, Stefan Hajnoczi wrote:

On Wed, Jul 26, 2023 at 02:30:49PM +0900, Jeuk Kim wrote:

Since v6:
- Add tests/qtest/ufs-test.c to test ufs initialisation and I/O
- Add struct UtpTaskReqDesc to include/block/ufs.h
- Fix ufs_log2() logic
- Fix ufs-lu to use 4K as default block size to match the ufs spec

Since I created a new file, tests/qtest/ufs-test.c, I added Laurent Vivier to 
the cc list.
Thank you.


Modulo the comments about the test case:

Reviewed-by: Stefan Hajnoczi 


Thank you very much for your detailed reviews so far.
I will resend the patch reflecting your comments on the test case.

Thanks again!
Jeuk



Re: [PATCH] target/i386: Fix reporting of CPU dies when nr_cores=nr_threads=1

2023-07-26 Thread Xiaoyao Li

On 7/24/2023 2:59 AM, 小太 wrote:

When QEMU is started with `-smp D,sockets=1,dies=D,cores=1,threads=1` (that
is, 1 socket with D dies but each die contains just a single thread), both
Linux and Windows guests incorrectly interpret the system as having D
sockets with 1 die each.

Ultimately this is caused by various CPUID leaves not being die-aware in
their "threads per socket" calculations, so this patch fixes that.

These changes reference the AMD PPR for the Family 19h Model 01h (Milan)
and Family 17h Model 01h (Naples) manuals:
  - CPUID_Fn00000001_EBX[23:16]: Number of threads in the processor
    (Core::X86::Cpuid::SizeId[NC] + 1)
  - CPUID_Fn0000000B_EBX_x01[15:0]: Number of logical cores in processor
    socket (not present until Rome)
  - CPUID_Fn80000001_ECX[1]: Multi core product
    (Core::X86::Cpuid::SizeId[NC] != 0)
  - CPUID_Fn80000008_ECX[7:0]: The number of threads in the package - 1
    (Core::X86::Cpuid::SizeId[NC])

Note there are two remaining occurrences that I didn't touch:
  - CPUID_Fn8000001E_ECX[10:8]: Always 0 (1 node per processor) for Milan.
    But for Naples, it can also be 2 or 4 nodes, where each node is defined
    as one or two CCXes (CCDs?). Milan also has multiple CCXes, so clearly
    the definition of a node differs from model to model, so I've left it
    untouched. (QEMU seems to use the Naples definition.)
  - MSR_CORE_THREAD_COUNT: This MSR doesn't exist on Milan or Naples


Is this patch specific to AMD CPUs? What's the situation for Intel CPUs?




Re: [PATCH 0/8] Adds CPU hot-plug support to Loongarch

2023-07-26 Thread Gavin Shan

Hi Xianglai,

On 7/20/23 17:15, xianglai li wrote:

Hello everyone, we referred to the implementation of ARM CPU
Hot-Plug to add GED-based CPU Hot-Plug support to Loongarch.

The first 4 patches are changes to the QEMU common code,
including adding GED support for CPU Hot-Plug, updating
the ACPI table creation process, and adding the
qdev_disconnect_gpio_out_named and cpu_address_space_destroy
interfaces to release resources when a CPU is unplugged.

The last four patches are Loongarch architecture-related.
The modifications include the definition of the hook
functions related to CPU Hot-(Un)Plug, the allocation
and release of CPU resources on Hot-(Un)Plug, updates
to the ACPI table creation process, and finally the
custom switch for CPU Hot-Plug.



[...]

It seems the ARM64 hotplug design has been heavily referenced, but its
authors were not copied, if you were referring to the following repository.
There will be conflicts between those two patchsets as far as I can see,
and I'm not sure how they can be resolved. In theory, one patchset needs
to be rebased on the other, since they share a large amount of code.

  https://github.com/salil-mehta/qemu.git
  (branch: virt-cpuhp-armv8/rfc-v1-port11052023.dev-1)

I took a quick scan of this series. Loongarch doesn't have ARM64's
constraint due to vGIC3, where all vCPUs' file descriptors need to be in
place. It means the possible but not yet present vCPUs need to be visible
to KVM. Without this constraint, the implementation is simplified a lot.

Besides, we found that vCPU hotplug isn't working for TCG when Salil's
series was tested. I'm not sure if we have the same issue with this series,
or whether TCG isn't a concern to us at all?

Thanks,
Gavin




[PATCH] tests/migration-test: Remove arch_target

2023-07-26 Thread Peter Xu
It is always NULL, so drop it.

Cc: Juan Quintela 
Fixes: Coverity CID 1518101
Signed-off-by: Peter Xu 
---
 tests/qtest/migration-test.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 62d3f37021..7ce1379ba8 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -701,7 +701,6 @@ static int test_migrate_start(QTestState **from, QTestState 
**to,
   const char *uri, MigrateStart *args)
 {
 g_autofree gchar *arch_source = NULL;
-g_autofree gchar *arch_target = NULL;
 /* options for source and target */
 g_autofree gchar *arch_opts = NULL;
 g_autofree gchar *cmd_source = NULL;
@@ -810,12 +809,11 @@ static int test_migrate_start(QTestState **from, 
QTestState **to,
  "-m %s "
  "-serial file:%s/dest_serial "
  "-incoming %s "
- "%s %s %s %s %s",
+ "%s %s %s %s",
  args->use_dirty_ring ?
  ",dirty-ring-size=4096" : "",
  memory_size, tmpfs, uri,
  arch_opts ? arch_opts : "",
- arch_target ? arch_target : "",
  shmem_opts,
  args->opts_target ? args->opts_target : "",
  ignore_stderr);
-- 
2.41.0




Re: how to build qemu 8.1 - keycodemapdb?

2023-07-26 Thread Michael Roth
Quoting Daniel P. Berrangé (2023-07-26 04:18:37)
> On Wed, Jul 26, 2023 at 12:05:41PM +0300, Michael Tokarev wrote:
> > 26.07.2023 11:50, Daniel P. Berrangé wrote:
> > ..
> > > > make-release.sh apparently does the right thing. But the published
> > > > tarball does not include the 3 required sub-projects anyway.
> > > > 
> > > > Is it about how the release is made?  What is used to make the
> > > > actual release tarball, is it not make-release.sh?
> > > 
> > > make-release is what I expect to be used for making release
> > > tarballs.
> > 
> > When I run ./scripts/make-release 8.1.0-rc1 , the resulting tarball
> > includes the necessary submodules in subprojects/.
> > 
> > It is more: it includes 2 copies of berkeley-softfloat & berkeley-testfloat,
> > one in subprojects/ and one in roms/edk2/ArmPkg/Library/ArmSoftFloatLib/ .
> > 
> > But the tarballs published on qemu.org does not include these.
> > 
> > So I conclude the tarballs were not created using make-release.sh.
> 
> I filed an issue for this and marked it as a release blocker.
> 
>   https://gitlab.com/qemu-project/qemu/-/issues/1791
> 
> rc0 was broken in the same way too.

Sorry for the breakage. I've updated the issue with the resolution and
re-uploaded fixed QEMU 8.1.0-rc1 tarballs to qemu.org.

I also hit a secondary issue where the new meson-wrapped subprojects now
require meson >= 0.55 when generating tarballs (since make-release doesn't
make use of the newer one provided locally by QEMU). Upgrading from Ubuntu
20.04 to 22.04 fixed this for me.

Hopefully things are good now.

-Mike

> 
> With regards,
> Daniel
> -- 
> |: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o-https://fstop138.berrange.com :|
> |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
>



Re: [PATCH v2 3/8] machine: Print supported CPU models instead of typenames

2023-07-26 Thread Richard Henderson

On 7/25/23 17:32, Gavin Shan wrote:

-static const char *q800_machine_valid_cpu_types[] = {
+static const char * const q800_machine_valid_cpu_types[] = {
  M68K_CPU_TYPE_NAME("m68040"),
  NULL
  };
  
+static const char * const q800_machine_valid_cpu_models[] = {

+"m68040",
+NULL
+};


I really don't like this replication.

r~



Re: [PATCH v2 2/8] machine: Introduce helper is_cpu_type_supported()

2023-07-26 Thread Richard Henderson

On 7/25/23 17:31, Gavin Shan wrote:

The logic of checking whether the specified CPU type is supported in
machine_run_board_init() is independent enough. Factor it out into the
helper is_cpu_type_supported(). With this, machine_run_board_init()
looks a bit cleaner. Since we're here, @machine_class is renamed to
@mc to avoid multi-line spanning of code. The comments are tweaked
a bit as well.

No functional change intended.

Signed-off-by: Gavin Shan 
---
  hw/core/machine.c | 82 +--
  1 file changed, 44 insertions(+), 38 deletions(-)

diff --git a/hw/core/machine.c b/hw/core/machine.c
index d7e7f8f120..fe110e9b0a 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -1349,12 +1349,50 @@ out:
  return r;
  }
  
+static void is_cpu_type_supported(MachineState *machine, Error **errp)


Return type bool, false on failure.


+/* Check if the CPU type is supported */
+is_cpu_type_supported(machine, &local_err);
+if (local_err) {
+error_propagate(errp, local_err);


Fold call into if, and no need for local_error.
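
Putting the two comments together, presumably something along these lines
(just an untested sketch of the suggested shape, not the actual patch):

    static bool is_cpu_type_supported(MachineState *machine, Error **errp)
    {
        MachineClass *mc = MACHINE_GET_CLASS(machine);
        int i;

        if (!machine->cpu_type || !mc->valid_cpu_types) {
            return true;
        }
        for (i = 0; mc->valid_cpu_types[i]; i++) {
            if (object_class_dynamic_cast(object_class_by_name(machine->cpu_type),
                                          mc->valid_cpu_types[i])) {
                return true;
            }
        }
        error_setg(errp, "Invalid CPU type: %s", machine->cpu_type);
        return false;
    }

    /* at the call site, no local Error or error_propagate() needed */
    if (!is_cpu_type_supported(machine, errp)) {
        return;
    }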


r~



[PATCH v2 0/2] Fix MCE handling on AMD hosts

2023-07-26 Thread John Allen
In the event that a guest process attempts to access memory that has
been poisoned in response to a deferred uncorrected MCE, an AMD system
will currently generate a SIGBUS error, which will result in the entire
guest being shut down. Ideally, we only want to kill the guest process
that accessed the poisoned memory in this case.

This support has been included in qemu for Intel hosts for a long time,
but there are a couple of changes needed for AMD hosts. First, we will
need to expose the SUCCOR cpuid bit to guests. Second, we need to modify
the MCE injection code to avoid Intel specific behavior when we are
running on an AMD host.

v2:
  - Add "succor" feature word.
  - Add case to kvm_arch_get_supported_cpuid for the SUCCOR feature.

John Allen (2):
  i386: Add support for SUCCOR feature
  i386: Fix MCE support for AMD hosts

 target/i386/cpu.c | 18 +-
 target/i386/cpu.h |  4 
 target/i386/helper.c  |  4 
 target/i386/kvm/kvm.c | 19 +--
 4 files changed, 38 insertions(+), 7 deletions(-)

-- 
2.39.3




[PATCH for-8.1] accel/tcg: Clear tcg_ctx->gen_tb on buffer overflow

2023-07-26 Thread Richard Henderson
On overflow of code_gen_buffer, we unlock the guest pages we had been
translating, but failed to clear gen_tb.  On restart, if we cannot
allocate a TB, we exit to the main loop to perform the flush of all
TBs as soon as possible.  With garbage in gen_tb, we hit an assert:

../src/accel/tcg/tb-maint.c:348:page_unlock__debug: \
assertion failed: (page_is_locked(pd))

Fixes: deba78709ae8 ("accel/tcg: Always lock pages before translation")
Signed-off-by: Richard Henderson 
---
 accel/tcg/translate-all.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index a1782db5dd..b2d4e22c17 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -374,6 +374,7 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
   "Restarting code generation for "
   "code_gen_buffer overflow\n");
 tb_unlock_pages(tb);
+tcg_ctx->gen_tb = NULL;
 goto buffer_overflow;
 
 case -2:
-- 
2.34.1




[PATCH v2 1/2] i386: Add support for SUCCOR feature

2023-07-26 Thread John Allen
Add cpuid bit definition for the SUCCOR feature. This cpuid bit is required to
be exposed to guests to allow them to handle machine check exceptions on AMD
hosts.

Reported-by: William Roche 
Signed-off-by: John Allen 
---
v2:
  - Add "succor" feature word.
  - Add case to kvm_arch_get_supported_cpuid for the SUCCOR feature.
---
 target/i386/cpu.c | 18 +-
 target/i386/cpu.h |  4 
 target/i386/kvm/kvm.c |  2 ++
 3 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 06009b80e8..9708785255 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -872,6 +872,22 @@ FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
 },
 .tcg_features = TCG_7_1_EAX_FEATURES,
 },
+[FEAT_8000_0007_EBX] = {
+.type = CPUID_FEATURE_WORD,
+.feat_names = {
+NULL, "succor", NULL, NULL,
+NULL, NULL, NULL, NULL,
+NULL, NULL, NULL, NULL,
+NULL, NULL, NULL, NULL,
+NULL, NULL, NULL, NULL,
+NULL, NULL, NULL, NULL,
+NULL, NULL, NULL, NULL,
+NULL, NULL, NULL, NULL,
+},
+.cpuid = { .eax = 0x80000007, .reg = R_EBX, },
+.tcg_features = 0,
+.unmigratable_flags = 0,
+},
 [FEAT_8000_0007_EDX] = {
 .type = CPUID_FEATURE_WORD,
 .feat_names = {
@@ -5874,7 +5890,7 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, 
uint32_t count,
 break;
 case 0x80000007:
 *eax = 0;
-*ebx = 0;
+*ebx = env->features[FEAT_8000_0007_EBX];
 *ecx = 0;
 *edx = env->features[FEAT_8000_0007_EDX];
 break;
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 3edaad7688..fccb9b5a97 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -555,6 +555,7 @@ typedef enum FeatureWord {
 FEAT_7_1_EAX,   /* CPUID[EAX=7,ECX=1].EAX */
 FEAT_8000_0001_EDX, /* CPUID[8000_0001].EDX */
 FEAT_8000_0001_ECX, /* CPUID[8000_0001].ECX */
+FEAT_8000_0007_EBX, /* CPUID[8000_0007].EBX */
 FEAT_8000_0007_EDX, /* CPUID[8000_0007].EDX */
 FEAT_8000_0008_EBX, /* CPUID[8000_0008].EBX */
 FEAT_C000_0001_EDX, /* CPUID[C000_0001].EDX */
@@ -856,6 +857,9 @@ typedef uint64_t FeatureWordArray[FEATURE_WORDS];
 /* Packets which contain IP payload have LIP values */
 #define CPUID_14_0_ECX_LIP  (1U << 31)
 
+/* RAS Features */
+#define CPUID_8000_0007_EBX_SUCCOR  (1U << 1)
+
 /* CLZERO instruction */
 #define CPUID_8000_0008_EBX_CLZERO  (1U << 0)
 /* Always save/restore FP error pointers */
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index f25837f63f..4b62138459 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -417,6 +417,8 @@ uint32_t kvm_arch_get_supported_cpuid(KVMState *s, uint32_t 
function,
  */
 cpuid_1_edx = kvm_arch_get_supported_cpuid(s, 1, 0, R_EDX);
 ret |= cpuid_1_edx & CPUID_EXT2_AMD_ALIASES;
+} else if (function == 0x80000007 && reg == R_EBX) {
+ret |= CPUID_8000_0007_EBX_SUCCOR;
 } else if (function == KVM_CPUID_FEATURES && reg == R_EAX) {
 /* kvm_pv_unhalt is reported by GET_SUPPORTED_CPUID, but it can't
  * be enabled without the in-kernel irqchip
-- 
2.39.3




[PATCH v2 2/2] i386: Fix MCE support for AMD hosts

2023-07-26 Thread John Allen
For the most part, AMD hosts can use the same MCE injection code as Intel,
but there are instances where the qemu implementation is Intel specific. First,
MCE delivery works differently on AMD and does not support broadcast. Second,
kvm_mce_inject generates MCEs that include a number of Intel specific status
bits. Modify kvm_mce_inject to properly generate MCEs on AMD platforms.

Reported-by: William Roche 
Signed-off-by: John Allen 
---
 target/i386/helper.c  |  4 
 target/i386/kvm/kvm.c | 17 +++--
 2 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/target/i386/helper.c b/target/i386/helper.c
index 533b29cb91..a6523858e0 100644
--- a/target/i386/helper.c
+++ b/target/i386/helper.c
@@ -76,6 +76,10 @@ int cpu_x86_support_mca_broadcast(CPUX86State *env)
 int family = 0;
 int model = 0;
 
+if (IS_AMD_CPU(env)) {
+return 0;
+}
+
 cpu_x86_version(env, &family, &model);
 if ((family == 6 && model >= 14) || family > 6) {
 return 1;
diff --git a/target/i386/kvm/kvm.c b/target/i386/kvm/kvm.c
index 4b62138459..87a50c8aaf 100644
--- a/target/i386/kvm/kvm.c
+++ b/target/i386/kvm/kvm.c
@@ -532,16 +532,21 @@ static void kvm_mce_inject(X86CPU *cpu, hwaddr paddr, int 
code)
 CPUState *cs = CPU(cpu);
 CPUX86State *env = &cpu->env;
 uint64_t status = MCI_STATUS_VAL | MCI_STATUS_UC | MCI_STATUS_EN |
-  MCI_STATUS_MISCV | MCI_STATUS_ADDRV | MCI_STATUS_S;
+  MCI_STATUS_MISCV | MCI_STATUS_ADDRV;
 uint64_t mcg_status = MCG_STATUS_MCIP;
 int flags = 0;
 
-if (code == BUS_MCEERR_AR) {
-status |= MCI_STATUS_AR | 0x134;
-mcg_status |= MCG_STATUS_EIPV;
+if (!IS_AMD_CPU(env)) {
+status |= MCI_STATUS_S;
+if (code == BUS_MCEERR_AR) {
+status |= MCI_STATUS_AR | 0x134;
+mcg_status |= MCG_STATUS_EIPV;
+} else {
+status |= 0xc0;
+mcg_status |= MCG_STATUS_RIPV;
+}
 } else {
-status |= 0xc0;
-mcg_status |= MCG_STATUS_RIPV;
+mcg_status |= MCG_STATUS_EIPV | MCG_STATUS_RIPV;
 }
 
 flags = cpu_x86_support_mca_broadcast(env) ? MCE_INJECT_BROADCAST : 0;
-- 
2.39.3




Implementing "tee" in python asyncio

2023-07-26 Thread John Snow
Hi folks,

I'm currently wondering how to take a StreamReader as found on
https://docs.python.org/3/library/asyncio-subprocess.html#asyncio.subprocess.Process
and to consume the data while optionally re-streaming it to a
secondary consumer.

What I'd like to do is create a StreamWatcher class that consumes
console data while optionally logging to python logging and/or a file;
but re-buffers the data into an async stream where an additional
consumer is free to use the "standard asyncio API" to consume console
data at their leisure in a way that's unsurprising.

What I'd like this *for* is to be able to do aggressive logging of
stdout/stderr and console data without denying tests the ability to
consume the data as they see fit for their testing purposes. I want to
have my cake and eat it too, and we don't do a good job of managing
this consistently across the board.

I am wondering if there's any way around creating a literal socketpair
and suffering the creation of a full four StreamReader/StreamWriter
instances (one pair per socket...) and then just hanging on to the
"unused" reader/writer per each. It seems kind of foolishly excessive.
It also seems like it might be a pain in the butt if I want
cross-platform compatibility with windows for the machine appliance.

Anyone got any bright ideas?

--js




Re: [PATCH V1 2/3] migration: fix suspended runstate

2023-07-26 Thread Peter Xu
On Fri, Jun 30, 2023 at 09:50:41AM -0400, Steven Sistare wrote:
> On 6/26/2023 2:27 PM, Peter Xu wrote:
> > On Fri, Jun 23, 2023 at 02:25:05PM -0400, Steven Sistare wrote:
> >> On 6/21/2023 4:28 PM, Peter Xu wrote:
> >>> On Wed, Jun 21, 2023 at 03:15:42PM -0400, Steven Sistare wrote:
>  On 6/20/2023 5:46 PM, Peter Xu wrote:
> > On Thu, Jun 15, 2023 at 01:26:39PM -0700, Steve Sistare wrote:
> >> Migration of a guest in the suspended state is broken.  The incoming
> >> migration code automatically tries to wake the guest, which IMO is
> >> wrong -- the guest should end migration in the same state it started.
> >> Further, the wakeup is done by calling qemu_system_wakeup_request(), 
> >> which
> >> bypasses vm_start().  The guest appears to be in the running state, but
> >> it is not.
> >>
> >> To fix, leave the guest in the suspended state, but call
> >> qemu_system_start_on_wakeup_request() so the guest is properly resumed
> >> later, when the client sends a system_wakeup command.
> >>
> >> Signed-off-by: Steve Sistare 
> >> ---
> >>  migration/migration.c | 11 ---
> >>  softmmu/runstate.c|  1 +
> >>  2 files changed, 5 insertions(+), 7 deletions(-)
> >>
> >> diff --git a/migration/migration.c b/migration/migration.c
> >> index 17b4b47..851fe6d 100644
> >> --- a/migration/migration.c
> >> +++ b/migration/migration.c
> >> @@ -496,6 +496,10 @@ static void process_incoming_migration_bh(void 
> >> *opaque)
> >>  vm_start();
> >>  } else {
> >>  runstate_set(global_state_get_runstate());
> >> +if (runstate_check(RUN_STATE_SUSPENDED)) {
> >> +/* Force vm_start to be called later. */
> >> +qemu_system_start_on_wakeup_request();
> >> +}
> >
> > Is this really needed, along with patch 1?
> >
> > I have a very limited knowledge on suspension, so I'm prone to making
> > mistakes..
> >
> > But from what I read this, qemu_system_wakeup_request() (existing one, 
> > not
> > after patch 1 applied) will setup wakeup_reason and kick the main thread
> > using qemu_notify_event().  Then IIUC the e.g. vcpu wakeups will be 
> > done in
> > the main thread later on after qemu_wakeup_requested() returns true.
> 
>  Correct, here:
> 
>  if (qemu_wakeup_requested()) {
>  pause_all_vcpus();
>  qemu_system_wakeup();
 notifier_list_notify(&wakeup_notifiers, &wakeup_reason);
>  wakeup_reason = QEMU_WAKEUP_REASON_NONE;
>  resume_all_vcpus();
>  qapi_event_send_wakeup();
>  }
> 
>  However, that is not sufficient, because vm_start() was never called on 
>  the incoming
>  side.  vm_start calls the vm state notifiers for RUN_STATE_RUNNING, 
>  among other things.
> 
> 
>  Without my fixes, it "works" because the outgoing migration 
>  automatically wakes a suspended
>  guest, which sets the state to running, which is saved in global state:
> 
>  void migration_completion(MigrationState *s)
>  qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
>  global_state_store()
> 
>  Then the incoming migration calls vm_start here:
> 
>  migration/migration.c
>  if (!global_state_received() ||
>  global_state_get_runstate() == RUN_STATE_RUNNING) {
>  if (autostart) {
>  vm_start();
> 
>  vm_start must be called for correctness.
> >>>
> >>> I see.  Though I had a feeling that this is still not the right way to do,
> >>> at least not as clean.
> >>>
> >>> One question is, would above work for postcopy when VM is suspended during
> >>> the switchover?
> >>
> >> Good catch, that is broken.
> >> I added qemu_system_start_on_wakeup_request to 
> >> loadvm_postcopy_handle_run_bh
> >> and now it works.
> >>
> >> if (global_state_get_runstate() == RUN_STATE_RUNNING) {
> >> if (autostart) {
> >> vm_start();
> >> } else {
> >> runstate_set(RUN_STATE_PAUSED);
> >> }
> >> } else {
> >> runstate_set(global_state_get_runstate());
> >> if (runstate_check(RUN_STATE_SUSPENDED)) {
> >> qemu_system_start_on_wakeup_request();
> >> }
> >> }
> >>
> >>> I think I see your point that vm_start() (mostly vm_prepare_start())
> >>> contains a bunch of operations that maybe we must have before starting the
> >>> VM, but then.. should we just make that vm_start() unconditional when
> >>> loading VM completes?  I just don't see anything won't need it (besides
> >>> -S), even COLO.
> >>>
> >>> So I'm wondering about something like this:
> >>>
> >>> ===8<===
> >>> --- a/migration/migration.c
> >>> +++ b/migration/migration.c
> >>> @@ -481,19 +481,28 @@ static void 

[PATCH QEMU 2/2] MAINTAINERS: Add Hyman Huang to dirty-limit feature

2023-07-26 Thread ~hyman
From: Hyman Huang(黄勇) 

Signed-off-by: Hyman Huang(黄勇) 
---
 MAINTAINERS | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 12e59b6b27..d72fd63a8e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3437,6 +3437,12 @@ F: hw/core/clock-vmstate.c
 F: hw/core/qdev-clock.c
 F: docs/devel/clocks.rst
 
+Dirty-limit feature
+M: Hyman Huang 
+S: Maintained
+F: softmmu/dirtylimit.c
+F: include/sysemu/dirtylimit.h
+
 Usermode Emulation
 --
 Overall usermode emulation
-- 
2.38.5



[PATCH QEMU 1/2] qapi: Reformat and craft the migration doc comments

2023-07-26 Thread ~hyman
From: Hyman Huang(黄勇) 

Reformat migration doc comments to conform to current conventions
as commit a937b6aa739 (qapi: Reformat doc comments to conform to
current conventions).

Also, craft the dirty-limit capability comment.

Signed-off-by: Hyman Huang(黄勇) 
---
 qapi/migration.json | 66 +
 1 file changed, 31 insertions(+), 35 deletions(-)

diff --git a/qapi/migration.json b/qapi/migration.json
index 6b49593d2f..5d5649c885 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -258,17 +258,17 @@
 # blocked.  Present and non-empty when migration is blocked.
 # (since 6.0)
 #
-# @dirty-limit-throttle-time-per-round: Maximum throttle time (in 
microseconds) of virtual
-#   CPUs each dirty ring full round, which 
shows how
-#   MigrationCapability dirty-limit 
affects the guest
-#   during live migration. (since 8.1)
-#
-# @dirty-limit-ring-full-time: Estimated average dirty ring full time (in 
microseconds)
-#  each dirty ring full round, note that the value 
equals
-#  dirty ring memory size divided by average dirty 
page rate
-#  of virtual CPU, which can be used to observe 
the average
-#  memory load of virtual CPU indirectly. Note 
that zero
-#  means guest doesn't dirty memory (since 8.1)
+# @dirty-limit-throttle-time-per-round: Maximum throttle time
+# (in microseconds) of virtual CPUs each dirty ring full round,
+# which shows how MigrationCapability dirty-limit affects the
+# guest during live migration.  (Since 8.1)
+#
+# @dirty-limit-ring-full-time: Estimated average dirty ring full
+# time (in microseconds) each dirty ring full round. The value
+# equals dirty ring memory size divided by average dirty page
+# rate of the virtual CPU, which can be used to observe the
+# average memory load of the virtual CPU indirectly. Note that
+# zero means guest doesn't dirty memory.  (Since 8.1)
 #
 # Since: 0.14
 ##
@@ -519,15 +519,11 @@
 # are present.  'return-path' capability must be enabled to use
 # it.  (since 8.1)
 #
-# @dirty-limit: If enabled, migration will use the dirty-limit algo to
-#   throttle down guest instead of auto-converge algo.
-#   Throttle algo only works when vCPU's dirtyrate greater
-#   than 'vcpu-dirty-limit', read processes in guest os
-#   aren't penalized any more, so this algo can improve
-#   performance of vCPU during live migration. This is an
-#   optional performance feature and should not affect the
-#   correctness of the existing auto-converge algo.
-#   (since 8.1)
+# @dirty-limit: If enabled, migration will throttle vCPUs as needed to
+# keep their dirty page rate within @vcpu-dirty-limit.  This can
+# improve responsiveness of large guests during live migration,
+# and can result in more stable read performance.  Requires KVM
+# with accelerator property "dirty-ring-size" set.  (Since 8.1)
 #
 # Features:
 #
@@ -822,17 +818,17 @@
 # Nodes are mapped to their block device name if there is one, and
 # to their node name otherwise.  (Since 5.2)
 #
-# @x-vcpu-dirty-limit-period: Periodic time (in milliseconds) of dirty limit 
during
-# live migration. Should be in the range 1 to 
1000ms,
-# defaults to 1000ms. (Since 8.1)
+# @x-vcpu-dirty-limit-period: Periodic time (in milliseconds) of dirty
+# limit during  live migration. Should be in the range 1 to 1000ms,
+# defaults to 1000ms.  (Since 8.1)
 #
 # @vcpu-dirty-limit: Dirtyrate limit (MB/s) during live migration.
-#Defaults to 1. (Since 8.1)
+# Defaults to 1.  (Since 8.1)
 #
 # Features:
 #
 # @unstable: Members @x-checkpoint-delay and @x-vcpu-dirty-limit-period
-#are experimental.
+# are experimental.
 #
 # Since: 2.4
 ##
@@ -988,17 +984,17 @@
 # Nodes are mapped to their block device name if there is one, and
 # to their node name otherwise.  (Since 5.2)
 #
-# @x-vcpu-dirty-limit-period: Periodic time (in milliseconds) of dirty limit 
during
-# live migration. Should be in the range 1 to 
1000ms,
-# defaults to 1000ms. (Since 8.1)
+# @x-vcpu-dirty-limit-period: Periodic time (in milliseconds) of dirty
+# limit during live migration. Should be in the range 1 to 1000ms,
+# defaults to 1000ms.  (Since 8.1)
 #
 # @vcpu-dirty-limit: Dirtyrate limit (MB/s) during live migration.
-#Defaults to 1. (Since 8.1)
+# Defaults to 1.  (Since 8.1)
 #
 # Features:
 #
 # @unstable: Members @x-checkpoint-delay and @x-vcpu-dirty-limit-period
-#are experimental.
+# 

[PATCH QEMU 0/2] migration: craft the doc comments

2023-07-26 Thread ~hyman
Hi, Markus,

This patchset aims to reformat the migration doc comments as
commit a937b6aa739 did. Meanwhile, it adds me to the
dirty-limit feature maintainer list.

Please review, thanks.

Yong


Hyman Huang(黄勇) (2):
  qapi: Reformat and craft the migration doc comments
  MAINTAINERS: Add Hyman Huang to dirty-limit feature

 MAINTAINERS |  6 +
 qapi/migration.json | 66 +
 2 files changed, 37 insertions(+), 35 deletions(-)

-- 
2.38.5



Re: [PATCH v7 0/4] hw/ufs: Add Universal Flash Storage (UFS) support

2023-07-26 Thread Stefan Hajnoczi
On Wed, Jul 26, 2023 at 02:30:49PM +0900, Jeuk Kim wrote:
> Since v6:
> - Add tests/qtest/ufs-test.c to test ufs initialisation and I/O
> - Add struct UtpTaskReqDesc to include/block/ufs.h
> - Fix ufs_log2() logic
> - Fix ufs-lu to use 4K as default block size to match the ufs spec
> 
> Since I created a new file, tests/qtest/ufs-test.c, I added Laurent Vivier to 
> the cc list.
> Thank you.

Modulo the comments about the test case:

Reviewed-by: Stefan Hajnoczi 




Re: [PATCH v7 4/4] tests/qtest: Introduce tests for UFS

2023-07-26 Thread Stefan Hajnoczi
On Wed, Jul 26, 2023 at 02:30:53PM +0900, Jeuk Kim wrote:
> +static void *ufs_blk_test_setup(GString *cmd_line, void *arg)
> +{
> +char *tmp_path = drive_create();
> +
> +g_string_append_printf(cmd_line,
> +   " -drive if=none,id=drv1,file=%s,"
> +   "format=raw,auto-read-only=off "

The newer -blockdev syntax can be used:

-blockdev file,filename=%s,node-name=drv1

> +static void ufs_register_nodes(void)
> +{
> +QOSGraphEdgeOptions edge_opts = {
> +.before_cmd_line = "-drive id=drv0,if=none,file=null-co://,"
> +   "file.read-zeroes=on,format=raw",

-blockdev null-co,node-name=drv0,read-zeroes=on




Re: [PATCH] memory: avoid updating ioeventfds for some address_space

2023-07-26 Thread Peter Xu
On Tue, Jul 25, 2023 at 07:20:37PM +0800, hongmianquan wrote:
> When updating ioeventfds, we need to iterate all address spaces,
> but some address spaces do not register eventfd_add|del call when
> memory_listener_register() and they do nothing when updating ioeventfds.
> So we can skip these AS in address_space_update_ioeventfds().
> 
> The overhead of memory_region_transaction_commit() can be significantly
> reduced. For example, a VM with 8 vhost net devices and each one has
> 64 vectors, can reduce the time spent on memory_region_transaction_commit by 
> 20%.
> 
> Signed-off-by: hongmianquan 

Reviewed-by: Peter Xu 

Should be for 8.2, though.  Please always copy Paolo for memory related
patches.  I hope Paolo can see this.

-- 
Peter Xu




[PATCH 6/6] target/ppc: Migrate DECR SPR

2023-07-26 Thread Nicholas Piggin
TCG does not maintain the DEC register in the SPR array, so it does not get
migrated. TCG also needs to re-start the decrementer timer on the
destination machine.

Load and store the decrementer into the SPR when migrating. This works
for the level-triggered (book3s) decrementer, and should be compatible
with existing KVM machines that do keep the DEC value there.

This fixes lost decrementer interrupts on migration, which can cause
hangs, as well as other problems including record-replay bugs.

Signed-off-by: Nicholas Piggin 
---
 target/ppc/machine.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/target/ppc/machine.c b/target/ppc/machine.c
index 134b16c625..ebb37e07d0 100644
--- a/target/ppc/machine.c
+++ b/target/ppc/machine.c
@@ -209,6 +209,14 @@ static int cpu_pre_save(void *opaque)
 /* Used to retain migration compatibility for pre 6.0 for 601 machines. */
 env->hflags_compat_nmsr = 0;
 
+if (tcg_enabled()) {
+/*
+ * TCG does not maintain the DECR spr (unlike KVM) so have to save
+ * it here.
+ */
+env->spr[SPR_DECR] = cpu_ppc_load_decr(env);
+}
+
 return 0;
 }
 
@@ -314,6 +322,12 @@ static int cpu_post_load(void *opaque, int version_id)
 post_load_update_msr(env);
 
 if (tcg_enabled()) {
+/*
+ * TCG needs to re-start the decrementer timer and/or raise the
+ * interrupt. This works for level-triggered decrementer. Edge
+ * triggered types (including HDEC) would need to carry more state.
+ */
+cpu_ppc_store_decr(env, env->spr[SPR_DECR]);
 pmu_mmcr01_updated(env);
 }
 
-- 
2.40.1




[PATCH 5/7] tests/avocado: boot ppc64 pseries replay-record test to Linux VFS mount

2023-07-26 Thread Nicholas Piggin
The ppc64 record-replay test is now able to replay the full kernel boot,
so try enabling it.

Cc: Pavel Dovgalyuk 
Signed-off-by: Nicholas Piggin 
---
 tests/avocado/replay_kernel.py | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/tests/avocado/replay_kernel.py b/tests/avocado/replay_kernel.py
index 79c607b0e7..a18610542e 100644
--- a/tests/avocado/replay_kernel.py
+++ b/tests/avocado/replay_kernel.py
@@ -255,8 +255,7 @@ def test_ppc64_pseries(self):
 kernel_path = self.fetch_asset(kernel_url, asset_hash=kernel_hash)
 
 kernel_command_line = self.KERNEL_COMMON_COMMAND_LINE + 'console=hvc0'
-# icount is not good enough for PPC64 for complete boot yet
-console_pattern = 'Kernel command line: %s' % kernel_command_line
+console_pattern = 'VFS: Cannot open root device'
 self.run_rr(kernel_path, kernel_command_line, console_pattern)
 
 def test_ppc64_powernv(self):
-- 
2.40.1




[PATCH 3/6] target/ppc: Fix pending HDEC when entering PM state

2023-07-26 Thread Nicholas Piggin
HDEC is defined to not wake from PM state. There is a check in the HDEC
timer to avoid setting the interrupt if we are in a PM state, but no
check on PM entry to lower HDEC if it has already fired. This can cause an
HDECR wakeup and a QEMU abort with an unsupported exception in Power Save
mode.

Fixes: 4b236b621bf ("ppc: Initial HDEC support")
Signed-off-by: Nicholas Piggin 
---
 target/ppc/excp_helper.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/target/ppc/excp_helper.c b/target/ppc/excp_helper.c
index 003805b202..9aa8e46566 100644
--- a/target/ppc/excp_helper.c
+++ b/target/ppc/excp_helper.c
@@ -2685,6 +2685,12 @@ void helper_pminsn(CPUPPCState *env, uint32_t insn)
 env->resume_as_sreset = (insn != PPC_PM_STOP) ||
 (env->spr[SPR_PSSCR] & PSSCR_EC);
 
+/* HDECR is not to wake from PM state, it may have already fired */
+if (env->resume_as_sreset) {
+PowerPCCPU *cpu = env_archcpu(env);
+ppc_set_irq(cpu, PPC_INTERRUPT_HDECR, 0);
+}
+
 ppc_maybe_interrupt(env);
 }
 #endif /* defined(TARGET_PPC64) */
-- 
2.40.1




Re: [PATCH] gitlab: remove duplication between msys jobs

2023-07-26 Thread Thomas Huth

On 26/07/2023 18.19, Daniel P. Berrangé wrote:

Although they share a common parent, the two msys jobs still have
massive duplication in their script definitions that can easily be
collapsed.

Signed-off-by: Daniel P. Berrangé 
---
  .gitlab-ci.d/windows.yml | 132 +++
  1 file changed, 49 insertions(+), 83 deletions(-)


We originally had different sets of packages in the 32-bit and 64-bit jobs, 
to distribute the load between the two jobs ... but it got unified in commit 
14547e0877f3522. Now considering that we are facing timeouts again, we 
should maybe rather revert that commit instead of unifying the lists forever?


Anyway, before we unify the compiler package name suffix between the two 
jobs, I really would like to see whether the mingw Clang builds QEMU faster 
in the 64-bit job ... but so far I failed to convince meson to accept the 
Clang from the mingw package ... does anybody know how to use Clang with 
MSYS2 properly?


 Thomas




[PATCH 4/7] spapr: Fix record-replay machine reset consuming too many events

2023-07-26 Thread Nicholas Piggin
spapr_machine_reset gets a random number to populate the device-tree
rng seed with. When loading a snapshot for record-replay, the machine
is reset again, and that tries to consume the random event record
again, crashing due to an inconsistent record.

Fix this by saving the seed used to populate the device tree, and
skipping the rng on snapshot load.

Cc: Pavel Dovgalyuk 
Signed-off-by: Nicholas Piggin 
---
 hw/ppc/spapr.c | 12 +---
 include/hw/ppc/spapr.h |  1 +
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 7d84244f03..ecfbdb0030 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1022,7 +1022,6 @@ static void spapr_dt_chosen(SpaprMachineState *spapr, 
void *fdt, bool reset)
 {
 MachineState *machine = MACHINE(spapr);
 SpaprMachineClass *smc = SPAPR_MACHINE_GET_CLASS(machine);
-uint8_t rng_seed[32];
 int chosen;
 
 _FDT(chosen = fdt_add_subnode(fdt, 0, "chosen"));
@@ -1100,8 +1099,7 @@ static void spapr_dt_chosen(SpaprMachineState *spapr, 
void *fdt, bool reset)
 spapr_dt_ov5_platform_support(spapr, fdt, chosen);
 }
 
-qemu_guest_getrandom_nofail(rng_seed, sizeof(rng_seed));
-_FDT(fdt_setprop(fdt, chosen, "rng-seed", rng_seed, sizeof(rng_seed)));
+_FDT(fdt_setprop(fdt, chosen, "rng-seed", spapr->fdt_rng_seed, 32));
 
 _FDT(spapr_dt_ovec(fdt, chosen, spapr->ov5_cas, "ibm,architecture-vec-5"));
 }
@@ -1654,6 +1652,14 @@ static void spapr_machine_reset(MachineState *machine, 
ShutdownCause reason)
 void *fdt;
 int rc;
 
+if (reason != SHUTDOWN_CAUSE_SNAPSHOT_LOAD) {
+/*
+ * Record-replay snapshot load must not consume random, this was
+ * already replayed from initial machine reset.
+ */
+qemu_guest_getrandom_nofail(spapr->fdt_rng_seed, 32);
+}
+
 pef_kvm_reset(machine->cgs, &error_fatal);
 spapr_caps_apply(spapr);
 
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index f47e8419a5..f4bd204d86 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -204,6 +204,7 @@ struct SpaprMachineState {
 uint32_t fdt_size;
 uint32_t fdt_initial_size;
 void *fdt_blob;
+uint8_t fdt_rng_seed[32];
 long kernel_size;
 bool kernel_le;
 uint64_t kernel_addr;
-- 
2.40.1




[PATCH 2/7] target/ppc: Fix timebase reset with record-replay

2023-07-26 Thread Nicholas Piggin
Timebase save uses a random number for a legacy vmstate field, which
makes rr snapshot loading unbalanced. The easiest way to deal with this
is just to skip the rng if record-replay is active.

Reviewed-by: Pavel Dovgalyuk 
Signed-off-by: Nicholas Piggin 
---
 hw/ppc/ppc.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/hw/ppc/ppc.c b/hw/ppc/ppc.c
index cd1993e9c1..2476c4c4d3 100644
--- a/hw/ppc/ppc.c
+++ b/hw/ppc/ppc.c
@@ -32,6 +32,7 @@
 #include "qemu/main-loop.h"
 #include "qemu/error-report.h"
 #include "sysemu/kvm.h"
+#include "sysemu/replay.h"
 #include "sysemu/runstate.h"
 #include "kvm_ppc.h"
 #include "migration/vmstate.h"
@@ -937,8 +938,14 @@ static void timebase_save(PPCTimebase *tb)
 return;
 }
 
-/* not used anymore, we keep it for compatibility */
-tb->time_of_the_day_ns = qemu_clock_get_ns(QEMU_CLOCK_HOST);
+if (replay_mode == REPLAY_MODE_NONE) {
+/* not used anymore, we keep it for compatibility */
+tb->time_of_the_day_ns = qemu_clock_get_ns(QEMU_CLOCK_HOST);
+} else {
+/* simpler for record-replay to avoid this event, compat not needed */
+tb->time_of_the_day_ns = 0;
+}
+
 /*
  * tb_offset is only expected to be changed by QEMU so
  * there is no need to update it from KVM here
-- 
2.40.1




Re: [PATCH v6 4/5] intel_iommu: allow Extended Interrupt Mode when using userspace APIC

2023-07-26 Thread Peter Xu
On Fri, Jul 21, 2023 at 10:35:01PM +0700, Bui Quang Minh wrote:
> On 7/21/23 03:47, Peter Xu wrote:
> > On Mon, Jul 17, 2023 at 11:29:56PM +0700, Bui Quang Minh wrote:
> > > On 7/17/23 17:47, Joao Martins wrote:
> > > > +Peter, +Jason (intel-iommu maintainer/reviewer)
> > 
> > Thanks for copying me, Joan.
> > 
> > > > 
> > > > On 15/07/2023 16:22, Bui Quang Minh wrote:
> > > > > As userspace APIC now supports x2APIC, intel interrupt remapping
> > > > > hardware can be set to EIM mode when userspace local APIC is used.
> > > > > 
> > > > > Reviewed-by: Michael S. Tsirkin 
> > > > > Signed-off-by: Bui Quang Minh 
> > > > > ---
> > > > >hw/i386/intel_iommu.c | 11 ---
> > > > >1 file changed, 11 deletions(-)
> > > > > 
> > > > > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> > > > > index dcc334060c..5e576f6059 100644
> > > > > --- a/hw/i386/intel_iommu.c
> > > > > +++ b/hw/i386/intel_iommu.c
> > > > > @@ -4043,17 +4043,6 @@ static bool vtd_decide_config(IntelIOMMUState 
> > > > > *s, Error **errp)
> > > > >  && x86_iommu_ir_supported(x86_iommu) ?
> > > > >  ON_OFF_AUTO_ON : 
> > > > > ON_OFF_AUTO_OFF;
> > > > >}
> > > > > -if (s->intr_eim == ON_OFF_AUTO_ON && !s->buggy_eim) {
> > > > > -if (!kvm_irqchip_is_split()) {
> > > > > -error_setg(errp, "eim=on requires 
> > > > > accel=kvm,kernel-irqchip=split");
> > > > > -return false;
> > > > > -}
> > > > > -if (!kvm_enable_x2apic()) {
> > > > > -error_setg(errp, "eim=on requires support on the KVM 
> > > > > side"
> > > > > - "(X2APIC_API, first shipped in v4.7)");
> > > > > -return false;
> > > > > -}
> > > > > -}
> > > > Given commit 20ca47429e ('Revert "intel_iommu: Fix irqchip / X2APIC
> > > > configuration checks"'), won't we regress behaviour again  for the 
> > > > accel=kvm
> > > > case by dropping the kvm_enable_x2apic() call here?
> > > > 
> > > > Perhaps if we support userspace APIC with TCG the check just needs to 
> > > > be redone
> > > > to instead avoid always requiring kvm e.g.:
> > > > 
> > > > if (kvm_irqchip_in_kernel()) {
> > > >   error_setg(errp, "eim=on requires accel=kvm,kernel-irqchip=split"
> > > >  "(X2APIC_API, first shipped in v4.7)");
> > > > }
> > > > 
> > > > if (kvm_irqchip_is_split() && !kvm_enable_x2apic()) {
> > > >   error_setg(errp, "eim=on requires support on the KVM side"
> > > >  "(X2APIC_API, first shipped in v4.7)");
> > > >   return false;
> > > > }
> > > 
> > > Thank you for your review. I think the check for kvm_irqchip_in_kernel() 
> > > is
> > > not correct, AFAIK, kvm_irqchip_is_split() == true also means
> > > kvm_irqchip_in_kernel() == true on x86. To check if kernel-irqchip = on, 
> > > we
> > > need to do something like in x86_iommu_realize
> > > 
> > >  bool irq_all_kernel = kvm_irqchip_in_kernel() &&
> > > !kvm_irqchip_is_split();
> > > 
> > > The original check for !kvm_irqchip_is_split means emulated/userspace 
> > > APIC.
> > > It's because to reach that check x86_iommu_ir_supported(...) == true and
> > > x86_iommu_ir_supported(...) == true is not supported when kernel-irqchip =
> > > on (there is a check for this in x86_iommu_realize)
> > > 
> > > So I think we need to change the check to
> > > 
> > >  if (s->intr_eim == ON_OFF_AUTO_ON && !s->buggy_eim) {
> > >  if (kvm_irqchip_is_split() && !kvm_enable_x2apic()) {
> > >  error_setg(errp, "eim=on requires support on the KVM side"
> > >   "(X2APIC_API, first shipped in v4.7)");
> > >  return false;
> > >  }
> > >  }
> > > 
> > > Is it OK?
> > 
> > Mostly to me, except that we may also want to keep failing if all irq chips
> > are in kernel?
> 
> Yes, that behavior does not change after this patch. x86_iommu_realize in
> the parent type TYPE_X86_IOMMU_DEVICE fails when interrupt remapping is on
> and all irq chips are in kernel already.
> 
> static void x86_iommu_realize(DeviceState *dev, Error **errp)
> {
> /* ... */
> /* Both Intel and AMD IOMMU IR only support "kernel-irqchip
> {off|split}" */
> if (x86_iommu_ir_supported(x86_iommu) && irq_all_kernel) {
> error_setg(errp, "Interrupt Remapping cannot work with "
>  "kernel-irqchip=on, please use 'split|off'.");
> return;
> }
> /* ... */
> }
> 
> 
> So in case we reach here in with the interrupt remapping is on and decide
> whether eim is on or not, it cannot be that irq chips are all in kernel.

Ah that's okay then, thanks.

-- 
Peter Xu




[PATCH 4/6] hw/ppc: Avoid decrementer rounding errors

2023-07-26 Thread Nicholas Piggin
The decrementer register contains a relative time in timebase units.
When writing to DECR this is converted and stored as an absolute value
in nanosecond units; reading DECR converts back to a relative timebase
value.

The tb<->ns conversion of the relative part can cause rounding, such that
a value written to the decrementer can read back as a different value, with
time held constant. This is a particular problem for deterministic icount
and record-replay traces.

Fix this by storing the absolute value in timebase units rather than
nanoseconds. The math before:
  store:  decr_next = now_ns + decr * ns_per_sec / tb_per_sec
  load:decr = (decr_next - now_ns) * tb_per_sec / ns_per_sec
  load(store): decr = decr * ns_per_sec / tb_per_sec * tb_per_sec /
  ns_per_sec

After:
  store:  decr_next = now_ns * tb_per_sec / ns_per_sec + decr
  load:decr = decr_next - now_ns * tb_per_sec / ns_per_sec
  load(store): decr = decr
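
As a concrete illustration of the rounding, a small standalone demo (using
plain 64-bit integer division instead of muldiv64(), and an assumed 512 MHz
timebase purely for the example):

    #include <stdint.h>
    #include <inttypes.h>
    #include <stdio.h>

    #define NS_PER_SEC 1000000000ULL
    #define TB_PER_SEC  512000000ULL  /* assumed timebase frequency */

    int main(void)
    {
        uint64_t now_ns = 0;   /* time held constant */
        uint64_t decr = 1000;  /* value written to DECR, in timebase ticks */

        /* before: relative value converted tb->ns on store, ns->tb on load */
        uint64_t next_ns = now_ns + decr * NS_PER_SEC / TB_PER_SEC;
        uint64_t old_read = (next_ns - now_ns) * TB_PER_SEC / NS_PER_SEC;

        /* after: absolute value kept in timebase units */
        uint64_t next_tb = now_ns * TB_PER_SEC / NS_PER_SEC + decr;
        uint64_t new_read = next_tb - now_ns * TB_PER_SEC / NS_PER_SEC;

        /* prints 999 for the old scheme and 1000 for the new one */
        printf("old: %" PRIu64 "  new: %" PRIu64 "\n", old_read, new_read);
        return 0;
    }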

Fixes: 9fddaa0c0cab ("PowerPC merge: real time TB and decrementer - faster and 
simpler exception handling (Jocelyn Mayer)")
Signed-off-by: Nicholas Piggin 
---
 hw/ppc/ppc.c | 41 -
 1 file changed, 24 insertions(+), 17 deletions(-)

diff --git a/hw/ppc/ppc.c b/hw/ppc/ppc.c
index 0e0a3d93c3..fa60f76dd4 100644
--- a/hw/ppc/ppc.c
+++ b/hw/ppc/ppc.c
@@ -686,16 +686,17 @@ bool ppc_decr_clear_on_delivery(CPUPPCState *env)
 static inline int64_t _cpu_ppc_load_decr(CPUPPCState *env, uint64_t next)
 {
 ppc_tb_t *tb_env = env->tb_env;
-int64_t decr, diff;
+uint64_t now, n;
+int64_t decr;
 
-diff = next - qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
-if (diff >= 0) {
-decr = muldiv64(diff, tb_env->decr_freq, NANOSECONDS_PER_SECOND);
-} else if (tb_env->flags & PPC_TIMER_BOOKE) {
+now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
+n = muldiv64(now, tb_env->decr_freq, NANOSECONDS_PER_SECOND);
+if (next > n && tb_env->flags & PPC_TIMER_BOOKE) {
 decr = 0;
-}  else {
-decr = -muldiv64(-diff, tb_env->decr_freq, NANOSECONDS_PER_SECOND);
+} else {
+decr = next - n;
 }
+
 trace_ppc_decr_load(decr);
 
 return decr;
@@ -834,11 +835,11 @@ static void __cpu_ppc_store_decr(PowerPCCPU *cpu, 
uint64_t *nextp,
 
 /* Calculate the next timer event */
 now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
-next = now + muldiv64(value, NANOSECONDS_PER_SECOND, tb_env->decr_freq);
-*nextp = next;
+next = muldiv64(now, tb_env->decr_freq, NANOSECONDS_PER_SECOND) + value;
+*nextp = next; /* nextp is in timebase units */
 
 /* Adjust timer */
-timer_mod(timer, next);
+timer_mod(timer, muldiv64(next, NANOSECONDS_PER_SECOND, 
tb_env->decr_freq));
 }
 
 static inline void _cpu_ppc_store_decr(PowerPCCPU *cpu, target_ulong decr,
@@ -1153,14 +1154,20 @@ static void start_stop_pit (CPUPPCState *env, ppc_tb_t 
*tb_env, int is_excp)
 } else {
 trace_ppc4xx_pit_start(ppc40x_timer->pit_reload);
 now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
-next = now + muldiv64(ppc40x_timer->pit_reload,
-  NANOSECONDS_PER_SECOND, tb_env->decr_freq);
-if (is_excp)
-next += tb_env->decr_next - now;
-if (next == now)
-next++;
+
+if (is_excp) {
+tb_env->decr_next += ppc40x_timer->pit_reload;
+} else {
+tb_env->decr_next = muldiv64(now, tb_env->decr_freq,
+ NANOSECONDS_PER_SECOND)
++ ppc40x_timer->pit_reload;
+}
+next = muldiv64(tb_env->decr_next, NANOSECONDS_PER_SECOND,
+tb_env->decr_freq);
+if (next <= now) {
+next = now + 1;
+}
 timer_mod(tb_env->decr_timer, next);
-tb_env->decr_next = next;
 }
 }
 
-- 
2.40.1




Re: [PATCH v1] migration: refactor migration_completion

2023-07-26 Thread Peter Xu
On Fri, Jul 21, 2023 at 11:14:55AM +, Wang, Wei W wrote:
> On Friday, July 21, 2023 4:38 AM, Peter Xu wrote:
> > Looks good to me, after addressing Isaku's comments.
> > 
> > The current_active_state is very unfortunate, along with most of the calls 
> > to
> > migrate_set_state() - I bet most of the code will definitely go wrong if 
> > that
> > cmpxchg didn't succeed inside of migrate_set_state(), IOW in most cases we
> > simply always want:
> 
> Can you share examples where it could be wrong?
> (If it has bugs, we need to fix)

Nop.  What I meant is most of the cases we want to set the state without
caring much about the old state, so at least we can have a helper like
below and simply call migrate_set_state(s, STATE) where we don't care old
state.

> 
> > 
> >   migrate_set_state(&s->state, s->state, XXX);
> > 
> > Not sure whether one pre-requisite patch is good to have so we can rename
> > migrate_set_state() to something like __migrate_set_state(), then:
> > 
> >   migrate_set_state(s, XXX) {
> > __migrate_set_state(&s->state, s->state, XXX);
> >   }
> > 
> > I don't even know whether there's any call site that will need
> > __migrate_set_state() for real..
> > 
> 
> Seems this would break the use of "MIGRATION_STATUS_CANCELLING".
> For example, 
> - In migration_maybe_pause:
> migrate_set_state(&s->state, MIGRATION_STATUS_PRE_SWITCHOVER,
> new_state);
> If the current s->state isn't MIGRATION_STATUS_PRE_SWITCHOVER (could
> be MIGRATION_STATUS_CANCELLING),  then s->state won’t be updated to
> new_state.
> - Then, in migration_completion, the following update to s->state won't 
> succeed:
>migrate_set_state(&s->state, current_active_state,
>   MIGRATION_STATUS_COMPLETED);
> 
> - Finally, when reaching migration_iteration_finish(), s->state is
> MIGRATION_STATUS_CANCELLING, instead of MIGRATION_STATUS_COMPLETED.

The whole state changes are just flaky to me in general, even with the help
of old_state cmpxchg.

E.g., I'm wondering whether below race can happen, assuming we're starting
with ACTIVE state and just about to complete migration:

  main threadmigration thread
  ---
   
migration_maybe_pause(current_active_state==ACTIVE)
 if (s->state != 
MIGRATION_STATUS_CANCELLING)
   --> true, keep setting state
   qemu_mutex_unlock_iothread();
qemu_mutex_lock_iothread();
migrate_fd_cancel()
  if (old_state == MIGRATION_STATUS_PRE_SWITCHOVER)
--> false, not posting to pause_sem
  set state to MIGRATION_STATUS_CANCELLING
  migrate_set_state(&s->state, 
*current_active_state,

MIGRATION_STATUS_PRE_SWITCHOVER);
--> false, cmpxchg fail
  qemu_sem_wait(&s->pause_sem);
--> hang death?

Thanks,

-- 
Peter Xu




[PATCH 1/6] target/ppc: Implement ASDR register for ISA v3.0 for HPT

2023-07-26 Thread Nicholas Piggin
The ASDR register was introduced in ISA v3.0. It has not been
implemented for HPT. With HPT, ASDR is the format of the slbmte RS
operand (containing VSID), which matches the ppc_slb_t field.

Fixes: 3367c62f522b ("target/ppc: Support for POWER9 native hash")
Signed-off-by: Nicholas Piggin 
---
 target/ppc/mmu-hash64.c | 27 ---
 1 file changed, 16 insertions(+), 11 deletions(-)

diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
index 900f906990..a0c90df3ce 100644
--- a/target/ppc/mmu-hash64.c
+++ b/target/ppc/mmu-hash64.c
@@ -770,7 +770,8 @@ static bool ppc_hash64_use_vrma(CPUPPCState *env)
 }
 }
 
-static void ppc_hash64_set_isi(CPUState *cs, int mmu_idx, uint64_t error_code)
+static void ppc_hash64_set_isi(CPUState *cs, int mmu_idx, uint64_t slb_vsid,
+   uint64_t error_code)
 {
 CPUPPCState *env = &POWERPC_CPU(cs)->env;
 bool vpm;
@@ -782,13 +783,15 @@ static void ppc_hash64_set_isi(CPUState *cs, int mmu_idx, 
uint64_t error_code)
 }
 if (vpm && !mmuidx_hv(mmu_idx)) {
 cs->exception_index = POWERPC_EXCP_HISI;
+env->spr[SPR_ASDR] = slb_vsid;
 } else {
 cs->exception_index = POWERPC_EXCP_ISI;
 }
 env->error_code = error_code;
 }
 
-static void ppc_hash64_set_dsi(CPUState *cs, int mmu_idx, uint64_t dar, 
uint64_t dsisr)
+static void ppc_hash64_set_dsi(CPUState *cs, int mmu_idx, uint64_t slb_vsid,
+   uint64_t dar, uint64_t dsisr)
 {
 CPUPPCState *env = &POWERPC_CPU(cs)->env;
 bool vpm;
@@ -802,6 +805,7 @@ static void ppc_hash64_set_dsi(CPUState *cs, int mmu_idx, 
uint64_t dar, uint64_t
 cs->exception_index = POWERPC_EXCP_HDSI;
 env->spr[SPR_HDAR] = dar;
 env->spr[SPR_HDSISR] = dsisr;
+env->spr[SPR_ASDR] = slb_vsid;
 } else {
 cs->exception_index = POWERPC_EXCP_DSI;
 env->spr[SPR_DAR] = dar;
@@ -963,13 +967,13 @@ bool ppc_hash64_xlate(PowerPCCPU *cpu, vaddr eaddr, 
MMUAccessType access_type,
 }
 switch (access_type) {
 case MMU_INST_FETCH:
-ppc_hash64_set_isi(cs, mmu_idx, SRR1_PROTFAULT);
+ppc_hash64_set_isi(cs, mmu_idx, 0, SRR1_PROTFAULT);
 break;
 case MMU_DATA_LOAD:
-ppc_hash64_set_dsi(cs, mmu_idx, eaddr, DSISR_PROTFAULT);
+ppc_hash64_set_dsi(cs, mmu_idx, 0, eaddr, DSISR_PROTFAULT);
 break;
 case MMU_DATA_STORE:
-ppc_hash64_set_dsi(cs, mmu_idx, eaddr,
+ppc_hash64_set_dsi(cs, mmu_idx, 0, eaddr,
DSISR_PROTFAULT | DSISR_ISSTORE);
 break;
 default:
@@ -1022,7 +1026,7 @@ bool ppc_hash64_xlate(PowerPCCPU *cpu, vaddr eaddr, 
MMUAccessType access_type,
 /* 3. Check for segment level no-execute violation */
 if (access_type == MMU_INST_FETCH && (slb->vsid & SLB_VSID_N)) {
 if (guest_visible) {
-ppc_hash64_set_isi(cs, mmu_idx, SRR1_NOEXEC_GUARD);
+ppc_hash64_set_isi(cs, mmu_idx, slb->vsid, SRR1_NOEXEC_GUARD);
 }
 return false;
 }
@@ -1035,13 +1039,14 @@ bool ppc_hash64_xlate(PowerPCCPU *cpu, vaddr eaddr, 
MMUAccessType access_type,
 }
 switch (access_type) {
 case MMU_INST_FETCH:
-ppc_hash64_set_isi(cs, mmu_idx, SRR1_NOPTE);
+ppc_hash64_set_isi(cs, mmu_idx, slb->vsid, SRR1_NOPTE);
 break;
 case MMU_DATA_LOAD:
-ppc_hash64_set_dsi(cs, mmu_idx, eaddr, DSISR_NOPTE);
+ppc_hash64_set_dsi(cs, mmu_idx, slb->vsid, eaddr, DSISR_NOPTE);
 break;
 case MMU_DATA_STORE:
-ppc_hash64_set_dsi(cs, mmu_idx, eaddr, DSISR_NOPTE | 
DSISR_ISSTORE);
+ppc_hash64_set_dsi(cs, mmu_idx, slb->vsid, eaddr,
+   DSISR_NOPTE | DSISR_ISSTORE);
 break;
 default:
 g_assert_not_reached();
@@ -1075,7 +1080,7 @@ bool ppc_hash64_xlate(PowerPCCPU *cpu, vaddr eaddr, 
MMUAccessType access_type,
 if (PAGE_EXEC & ~amr_prot) {
 srr1 |= SRR1_IAMR; /* Access violates virt pg class key prot */
 }
-ppc_hash64_set_isi(cs, mmu_idx, srr1);
+ppc_hash64_set_isi(cs, mmu_idx, slb->vsid, srr1);
 } else {
 int dsisr = 0;
 if (need_prot & ~pp_prot) {
@@ -1087,7 +1092,7 @@ bool ppc_hash64_xlate(PowerPCCPU *cpu, vaddr eaddr, 
MMUAccessType access_type,
 if (need_prot & ~amr_prot) {
 dsisr |= DSISR_AMR;
 }
-ppc_hash64_set_dsi(cs, mmu_idx, eaddr, dsisr);
+ppc_hash64_set_dsi(cs, mmu_idx, slb->vsid, eaddr, dsisr);
 }
 return false;
 }
-- 
2.40.1




[PATCH 2/6] target/ppc: Fix VRMA page size for ISA v3.0

2023-07-26 Thread Nicholas Piggin
Until v2.07s, the VRMA page size (L||LP) was encoded in LPCR[VRMASD].
In v3.0 that moved to the partition table PS field.

The powernv machine can now run KVM HPT guests on POWER9/10 CPUs with
this fix and the patch to add ASDR.

Fixes: 3367c62f522b ("target/ppc: Support for POWER9 native hash")
Signed-off-by: Nicholas Piggin 
---
 target/ppc/mmu-hash64.c | 41 +++--
 1 file changed, 35 insertions(+), 6 deletions(-)

diff --git a/target/ppc/mmu-hash64.c b/target/ppc/mmu-hash64.c
index a0c90df3ce..7f8db0 100644
--- a/target/ppc/mmu-hash64.c
+++ b/target/ppc/mmu-hash64.c
@@ -874,12 +874,41 @@ static target_ulong rmls_limit(PowerPCCPU *cpu)
 return rma_sizes[rmls];
 }
 
-static int build_vrma_slbe(PowerPCCPU *cpu, ppc_slb_t *slb)
+/* Return the LLP in SLB_VSID format */
+static uint64_t get_vrma_llp(PowerPCCPU *cpu)
 {
 CPUPPCState *env = &cpu->env;
-target_ulong lpcr = env->spr[SPR_LPCR];
-uint32_t vrmasd = (lpcr & LPCR_VRMASD) >> LPCR_VRMASD_SHIFT;
-target_ulong vsid = SLB_VSID_VRMA | ((vrmasd << 4) & SLB_VSID_LLP_MASK);
+uint64_t llp;
+
+if (env->mmu_model == POWERPC_MMU_3_00) {
+ppc_v3_pate_t pate;
+uint64_t ps;
+
+/*
+ * ISA v3.0 removes the LPCR[VRMASD] field and puts the VRMA base
+ * page size (L||LP equivalent) in the PS field in the HPT partition
+ * table entry.
+ */
+if (!ppc64_v3_get_pate(cpu, cpu->env.spr[SPR_LPIDR], &pate)) {
+error_report("Bad VRMA with no partition table entry");
+return 0;
+}
+ps = pate.dw0 >> (63 - 58);
+llp = ((ps & 0x4) << (63 - 55 - 2)) | ((ps & 0x3) << (63 - 59));
+
+} else {
+uint64_t lpcr = env->spr[SPR_LPCR];
+target_ulong vrmasd = (lpcr & LPCR_VRMASD) >> LPCR_VRMASD_SHIFT;
+
+llp = (vrmasd << 4) & SLB_VSID_LLP_MASK;
+}
+
+return llp;
+}
+
+static int build_vrma_slbe(PowerPCCPU *cpu, ppc_slb_t *slb)
+{
+target_ulong vsid = SLB_VSID_VRMA | get_vrma_llp(cpu);
 int i;
 
 for (i = 0; i < PPC_PAGE_SIZES_MAX_SZ; i++) {
@@ -897,8 +926,8 @@ static int build_vrma_slbe(PowerPCCPU *cpu, ppc_slb_t *slb)
 }
 }
 
-error_report("Bad page size encoding in LPCR[VRMASD]; LPCR=0x"
- TARGET_FMT_lx, lpcr);
+error_report("Bad VRMA page size encoding 0x" TARGET_FMT_lx,
+ get_vrma_llp(cpu));
 
 return -1;
 }
-- 
2.40.1




[PATCH 2/2] virtio-gpu: reset gfx resources in main thread

2023-07-26 Thread marcandre . lureau
From: Marc-André Lureau 

Calling OpenGL from different threads can have bad consequences if not
carefully reviewed. It's not generally supported. In my case, I was
debugging a crash in glDeleteTextures from OPENGL32.DLL, where I asked
qemu for gl=es, and thus ANGLE implementation was expected. libepoxy did
resolution of the global pointer for glGenTexture to the GLES version
from the main thread. But it resolved glDeleteTextures to the GL
version, because it was done from a different thread without correct
context. Oops.

Let's stick to the main thread for GL calls by using a BH.

Note: I didn't use atomics for reset_finished check, assuming the BQL
will provide enough of sync, but I might be wrong.

Signed-off-by: Marc-André Lureau 
---
 include/hw/virtio/virtio-gpu.h |  3 +++
 hw/display/virtio-gpu.c| 38 +++---
 2 files changed, 34 insertions(+), 7 deletions(-)

diff --git a/include/hw/virtio/virtio-gpu.h b/include/hw/virtio/virtio-gpu.h
index 05bee09e1a..390c4642b8 100644
--- a/include/hw/virtio/virtio-gpu.h
+++ b/include/hw/virtio/virtio-gpu.h
@@ -169,6 +169,9 @@ struct VirtIOGPU {
 
 QEMUBH *ctrl_bh;
 QEMUBH *cursor_bh;
+QEMUBH *reset_bh;
+QemuCond reset_cond;
+bool reset_finished;
 
 QTAILQ_HEAD(, virtio_gpu_simple_resource) reslist;
 QTAILQ_HEAD(, virtio_gpu_ctrl_command) cmdq;
diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c
index b1f5d392bb..bbd5c6561a 100644
--- a/hw/display/virtio-gpu.c
+++ b/hw/display/virtio-gpu.c
@@ -14,6 +14,7 @@
 #include "qemu/osdep.h"
 #include "qemu/units.h"
 #include "qemu/iov.h"
+#include "sysemu/cpus.h"
 #include "ui/console.h"
 #include "trace.h"
 #include "sysemu/dma.h"
@@ -41,6 +42,7 @@ virtio_gpu_find_check_resource(VirtIOGPU *g, uint32_t 
resource_id,
 
 static void virtio_gpu_cleanup_mapping(VirtIOGPU *g,
struct virtio_gpu_simple_resource *res);
+static void virtio_gpu_reset_bh(void *opaque);
 
 void virtio_gpu_update_cursor_data(VirtIOGPU *g,
struct virtio_gpu_scanout *s,
@@ -1387,6 +1389,8 @@ void virtio_gpu_device_realize(DeviceState *qdev, Error 
**errp)
 &g->mem_reentrancy_guard);
 g->cursor_bh = qemu_bh_new_guarded(virtio_gpu_cursor_bh, g,
&g->mem_reentrancy_guard);
+g->reset_bh = qemu_bh_new(virtio_gpu_reset_bh, g);
+qemu_cond_init(&g->reset_cond);
 QTAILQ_INIT(&g->reslist);
 QTAILQ_INIT(&g->cmdq);
 QTAILQ_INIT(&g->fenceq);
@@ -1398,20 +1402,44 @@ static void virtio_gpu_device_unrealize(DeviceState 
*qdev)
 
 g_clear_pointer(&g->ctrl_bh, qemu_bh_delete);
 g_clear_pointer(&g->cursor_bh, qemu_bh_delete);
+g_clear_pointer(&g->reset_bh, qemu_bh_delete);
+qemu_cond_destroy(&g->reset_cond);
 virtio_gpu_base_device_unrealize(qdev);
 }
 
-void virtio_gpu_reset(VirtIODevice *vdev)
+static void virtio_gpu_reset_bh(void *opaque)
 {
-VirtIOGPU *g = VIRTIO_GPU(vdev);
+VirtIOGPU *g = VIRTIO_GPU(opaque);
 struct virtio_gpu_simple_resource *res, *tmp;
-struct virtio_gpu_ctrl_command *cmd;
 int i = 0;
 
 QTAILQ_FOREACH_SAFE(res, &g->reslist, next, tmp) {
 virtio_gpu_resource_destroy(g, res);
 }
 
+for (i = 0; i < g->parent_obj.conf.max_outputs; i++) {
+dpy_gfx_replace_surface(g->parent_obj.scanout[i].con, NULL);
+}
+
+g->reset_finished = true;
+qemu_cond_signal(&g->reset_cond);
+}
+
+void virtio_gpu_reset(VirtIODevice *vdev)
+{
+VirtIOGPU *g = VIRTIO_GPU(vdev);
+struct virtio_gpu_ctrl_command *cmd;
+
+if (qemu_in_vcpu_thread()) {
+g->reset_finished = false;
+qemu_bh_schedule(g->reset_bh);
+while (!g->reset_finished) {
+qemu_cond_wait_iothread(&g->reset_cond);
+}
+} else {
+virtio_gpu_reset_bh(g);
+}
+
 while (!QTAILQ_EMPTY(&g->cmdq)) {
 cmd = QTAILQ_FIRST(&g->cmdq);
 QTAILQ_REMOVE(&g->cmdq, cmd, next);
@@ -1425,10 +1453,6 @@ void virtio_gpu_reset(VirtIODevice *vdev)
 g_free(cmd);
 }
 
-for (i = 0; i < g->parent_obj.conf.max_outputs; i++) {
-dpy_gfx_replace_surface(g->parent_obj.scanout[i].con, NULL);
-}
-
 virtio_gpu_base_reset(VIRTIO_GPU_BASE(vdev));
 }
 
-- 
2.41.0




[PATCH 5/6] hw/ppc: Always store the decrementer value

2023-07-26 Thread Nicholas Piggin
When writing a value to the decrementer that raises an exception, the
irq is raised, but the value is not stored so the store doesn't appear
to have changed the register when it is read again.

Always store the write value to the register.

Fixes: e81a982aa53 ("PPC: Clean up DECR implementation")
Signed-off-by: Nicholas Piggin 
---
 hw/ppc/ppc.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/hw/ppc/ppc.c b/hw/ppc/ppc.c
index fa60f76dd4..cd1993e9c1 100644
--- a/hw/ppc/ppc.c
+++ b/hw/ppc/ppc.c
@@ -812,6 +812,11 @@ static void __cpu_ppc_store_decr(PowerPCCPU *cpu, uint64_t 
*nextp,
 return;
 }
 
+/* Calculate the next timer event */
+now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
+next = muldiv64(now, tb_env->decr_freq, NANOSECONDS_PER_SECOND) + value;
+*nextp = next; /* nextp is in timebase units */
+
 /*
  * Going from 1 -> 0 or 0 -> -1 is the event to generate a DEC interrupt.
  *
@@ -833,11 +838,6 @@ static void __cpu_ppc_store_decr(PowerPCCPU *cpu, uint64_t 
*nextp,
 (*lower_excp)(cpu);
 }
 
-/* Calculate the next timer event */
-now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
-next = muldiv64(now, tb_env->decr_freq, NANOSECONDS_PER_SECOND) + value;
-*nextp = next; /* nextp is in timebase units */
-
 /* Adjust timer */
 timer_mod(timer, muldiv64(next, NANOSECONDS_PER_SECOND, 
tb_env->decr_freq));
 }
-- 
2.40.1




[PATCH 1/7] target/ppc: Fix CPU reservation migration for record-replay

2023-07-26 Thread Nicholas Piggin
ppc only migrates reserve_addr, so the destination machine can get a
valid reservation with an incorrect reservation value of 0. Prior to
commit 392d328abe753 ("target/ppc: Ensure stcx size matches larx"),
this could permit a stcx. to incorrectly succeed. That commit
inadvertently fixed that bug because the target machine starts with an
impossible reservation size of 0, so any stcx. will fail.

This behaviour is permitted by the ISA because reservation loss may
have implementation-dependent cause. What's more, with KVM machines it
is impossible to save or reasonably restore reservation state. However if
the vmstate is being used for record-replay, the reservation must be
saved and restored exactly in order for execution from snapshot to
match the record.

This patch deprecates the existing incomplete reserve_addr vmstate,
and adds a new vmstate subsection with complete reservation state.
The new vmstate is needed only when record-replay mode is active.

Acked-by: Pavel Dovgalyuk 
Signed-off-by: Nicholas Piggin 
---
 target/ppc/cpu.h   |  2 ++
 target/ppc/machine.c   | 26 --
 target/ppc/translate.c |  4 
 3 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index 25fac9577a..27532d8f81 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -1121,7 +1121,9 @@ struct CPUArchState {
 target_ulong reserve_addr;   /* Reservation address */
 target_ulong reserve_length; /* Reservation larx op size (bytes) */
 target_ulong reserve_val;/* Reservation value */
+#if defined(TARGET_PPC64)
 target_ulong reserve_val2;
+#endif
 
 /* These are used in supervisor mode only */
 target_ulong msr;  /* machine state register */
diff --git a/target/ppc/machine.c b/target/ppc/machine.c
index ebb37e07d0..9f956b972c 100644
--- a/target/ppc/machine.c
+++ b/target/ppc/machine.c
@@ -10,6 +10,7 @@
 #include "qemu/main-loop.h"
 #include "kvm_ppc.h"
 #include "power8-pmu.h"
+#include "sysemu/replay.h"
 
 static void post_load_update_msr(CPUPPCState *env)
 {
@@ -685,6 +686,27 @@ static const VMStateDescription vmstate_compat = {
 }
 };
 
+static bool reservation_needed(void *opaque)
+{
+return (replay_mode != REPLAY_MODE_NONE);
+}
+
+static const VMStateDescription vmstate_reservation = {
+.name = "cpu/reservation",
+.version_id = 1,
+.minimum_version_id = 1,
+.needed = reservation_needed,
+.fields = (VMStateField[]) {
+VMSTATE_UINTTL(env.reserve_addr, PowerPCCPU),
+VMSTATE_UINTTL(env.reserve_length, PowerPCCPU),
+VMSTATE_UINTTL(env.reserve_val, PowerPCCPU),
+#if defined(TARGET_PPC64)
+VMSTATE_UINTTL(env.reserve_val2, PowerPCCPU),
+#endif
+VMSTATE_END_OF_LIST()
+}
+};
+
 const VMStateDescription vmstate_ppc_cpu = {
 .name = "cpu",
 .version_id = 5,
@@ -706,8 +728,7 @@ const VMStateDescription vmstate_ppc_cpu = {
 VMSTATE_UINTTL_ARRAY(env.spr, PowerPCCPU, 1024),
 VMSTATE_UINT64(env.spe_acc, PowerPCCPU),
 
-/* Reservation */
-VMSTATE_UINTTL(env.reserve_addr, PowerPCCPU),
+VMSTATE_UNUSED(sizeof(target_ulong)), /* was env.reserve_addr */
 
 /* Supervisor mode architected state */
 VMSTATE_UINTTL(env.msr, PowerPCCPU),
@@ -736,6 +757,7 @@ const VMStateDescription vmstate_ppc_cpu = {
 &vmstate_tlbemb,
 &vmstate_tlbmas,
 &vmstate_compat,
+&vmstate_reservation,
 NULL
 }
 };
diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index e6a0709066..b88c00b4b0 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -77,7 +77,9 @@ static TCGv cpu_xer, cpu_so, cpu_ov, cpu_ca, cpu_ov32, 
cpu_ca32;
 static TCGv cpu_reserve;
 static TCGv cpu_reserve_length;
 static TCGv cpu_reserve_val;
+#if defined(TARGET_PPC64)
 static TCGv cpu_reserve_val2;
+#endif
 static TCGv cpu_fpscr;
 static TCGv_i32 cpu_access_type;
 
@@ -151,9 +153,11 @@ void ppc_translate_init(void)
 cpu_reserve_val = tcg_global_mem_new(cpu_env,
  offsetof(CPUPPCState, reserve_val),
  "reserve_val");
+#if defined(TARGET_PPC64)
 cpu_reserve_val2 = tcg_global_mem_new(cpu_env,
   offsetof(CPUPPCState, reserve_val2),
   "reserve_val2");
+#endif
 
 cpu_fpscr = tcg_global_mem_new(cpu_env,
offsetof(CPUPPCState, fpscr), "fpscr");
-- 
2.40.1




[PATCH 0/2] virtio-gpu: reset gfx resources in main thread

2023-07-26 Thread marcandre . lureau
From: Marc-André Lureau 

Hi,

See the second patch for details.
thanks

Marc-André Lureau (2):
  virtio-gpu: free BHs, by implementing unrealize
  virtio-gpu: reset gfx resources in main thread

 include/hw/virtio/virtio-gpu.h |  4 +++
 hw/display/virtio-gpu-base.c   |  2 +-
 hw/display/virtio-gpu.c| 48 +-
 3 files changed, 46 insertions(+), 8 deletions(-)

-- 
2.41.0




[PATCH 3/7] spapr: Fix machine reset deadlock from replay-record

2023-07-26 Thread Nicholas Piggin
When the machine is reset to load a new snapshot while being debugged
with replay-record, it is done from another thread, so the CPU does
not run the register setting operations. Set CPU registers directly in
machine reset.

Cc: Pavel Dovgalyuk 
Signed-off-by: Nicholas Piggin 
---
 hw/ppc/spapr.c | 20 ++--
 include/hw/ppc/spapr.h |  1 +
 target/ppc/compat.c| 19 +++
 target/ppc/cpu.h   |  1 +
 4 files changed, 39 insertions(+), 2 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 1c8b8d57a7..7d84244f03 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1322,6 +1322,22 @@ void spapr_set_all_lpcrs(target_ulong value, 
target_ulong mask)
 }
 }
 
+/* May be used when the machine is not running */
+void spapr_init_all_lpcrs(target_ulong value, target_ulong mask)
+{
+CPUState *cs;
+CPU_FOREACH(cs) {
+PowerPCCPU *cpu = POWERPC_CPU(cs);
+CPUPPCState *env = >env;
+target_ulong lpcr;
+
+lpcr = env->spr[SPR_LPCR];
+lpcr &= ~(LPCR_HR | LPCR_UPRT);
+ppc_store_lpcr(cpu, lpcr);
+}
+}
+
+
 static bool spapr_get_pate(PPCVirtualHypervisor *vhyp, PowerPCCPU *cpu,
target_ulong lpid, ppc_v3_pate_t *entry)
 {
@@ -1583,7 +1599,7 @@ int spapr_reallocate_hpt(SpaprMachineState *spapr, int 
shift, Error **errp)
 }
 /* We're setting up a hash table, so that means we're not radix */
 spapr->patb_entry = 0;
-spapr_set_all_lpcrs(0, LPCR_HR | LPCR_UPRT);
+spapr_init_all_lpcrs(0, LPCR_HR | LPCR_UPRT);
 return 0;
 }
 
@@ -1661,7 +1677,7 @@ static void spapr_machine_reset(MachineState *machine, 
ShutdownCause reason)
 spapr_ovec_cleanup(spapr->ov5_cas);
 spapr->ov5_cas = spapr_ovec_new();
 
-ppc_set_compat_all(spapr->max_compat_pvr, &error_fatal);
+ppc_init_compat_all(spapr->max_compat_pvr, &error_fatal);
 
 /*
  * This is fixing some of the default configuration of the XIVE
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 538b2dfb89..f47e8419a5 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -1012,6 +1012,7 @@ bool spapr_check_pagesize(SpaprMachineState *spapr, 
hwaddr pagesize,
 #define SPAPR_OV5_XIVE_BOTH 0x80 /* Only to advertise on the platform */
 
 void spapr_set_all_lpcrs(target_ulong value, target_ulong mask);
+void spapr_init_all_lpcrs(target_ulong value, target_ulong mask);
 hwaddr spapr_get_rtas_addr(void);
 bool spapr_memory_hot_unplug_supported(SpaprMachineState *spapr);
 
diff --git a/target/ppc/compat.c b/target/ppc/compat.c
index 7949a24f5a..ebef2cccec 100644
--- a/target/ppc/compat.c
+++ b/target/ppc/compat.c
@@ -229,6 +229,25 @@ int ppc_set_compat_all(uint32_t compat_pvr, Error **errp)
 return 0;
 }
 
+/* To be used when the machine is not running */
+int ppc_init_compat_all(uint32_t compat_pvr, Error **errp)
+{
+CPUState *cs;
+
+CPU_FOREACH(cs) {
+PowerPCCPU *cpu = POWERPC_CPU(cs);
+int ret;
+
+ret = ppc_set_compat(cpu, compat_pvr, errp);
+
+if (ret < 0) {
+return ret;
+}
+}
+
+return 0;
+}
+
 int ppc_compat_max_vthreads(PowerPCCPU *cpu)
 {
 const CompatInfo *compat = compat_by_pvr(cpu->compat_pvr);
diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index 27532d8f81..40be0c362a 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -1497,6 +1497,7 @@ int ppc_set_compat(PowerPCCPU *cpu, uint32_t compat_pvr, 
Error **errp);
 
 #if !defined(CONFIG_USER_ONLY)
 int ppc_set_compat_all(uint32_t compat_pvr, Error **errp);
+int ppc_init_compat_all(uint32_t compat_pvr, Error **errp);
 #endif
 int ppc_compat_max_vthreads(PowerPCCPU *cpu);
 void ppc_compat_add_property(Object *obj, const char *name,
-- 
2.40.1




Re: [PATCH v10 09/10] migration: Implement MigrateChannelList to hmp migration flow.

2023-07-26 Thread Daniel P . Berrangé
On Wed, Jul 26, 2023 at 10:08:05PM +0530, Het Gala wrote:
> 
> On 26/07/23 8:25 pm, Daniel P. Berrangé wrote:
> > On Wed, Jul 26, 2023 at 02:18:32PM +, Het Gala wrote:
> > > Integrate MigrateChannelList with all transport backends
> > > (socket, exec and rdma) for both src and dest migration
> > > endpoints for hmp migration.
> > > 
> > > Suggested-by: Aravind Retnakaran 
> > > Signed-off-by: Het Gala 
> > > ---
> > >   migration/migration-hmp-cmds.c | 15 +--
> > >   migration/migration.c  |  5 ++---
> > >   migration/migration.h  |  3 ++-
> > >   3 files changed, 17 insertions(+), 6 deletions(-)
> > Reviewed-by: Daniel P. Berrangé 
> > 
> Daniel, I have got reviewed-by / Acked-by from Markus (patch 1 & 6) and you
> (remaining patches). What should be the next steps here ? Do I need to send
> a new patchset / wait for more maintainers for their reviews ? Please
> advice.

At this point it is for Juan (as the migration primary maintainer) to
either queue it, or send you more review comments.

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




[PATCH 0/7] ppc: record-replay fixes and enablement

2023-07-26 Thread Nicholas Piggin
Here is a series that gets ppc pseries and powernv machines into
better shape for record-replay, maybe for 8.2. It's likely got a
few deficiencies but it does run test cases and helped find bugs
in migration already. It requires previous decrementer fixes to
work well.

I think we can get away without patch 1 for 8.1, because we
inadvertently fixed regular (non-rr) migration of reservation
with  commit 392d328abe753. But opinions welcome.

For record/replay and avocado test reviewers, I would mainly
be interested in opinions about patch 6. I tried not to affect
existing archs much.

Thanks,
Nick

Nicholas Piggin (7):
  target/ppc: Fix CPU reservation migration for record-replay
  target/ppc: Fix timebase reset with record-replay
  spapr: Fix machine reset deadlock from replay-record
  spapr: Fix record-replay machine reset consuming too many events
  tests/avocado: boot ppc64 pseries replay-record test to Linux VFS
mount
  tests/avocado: reverse-debugging cope with re-executing breakpoints
  tests/avocado: ppc64 reverse debugging tests for pseries and powernv

 hw/ppc/ppc.c   | 11 --
 hw/ppc/spapr.c | 32 +++---
 include/hw/ppc/spapr.h |  2 ++
 target/ppc/compat.c| 19 +++
 target/ppc/cpu.h   |  3 ++
 target/ppc/machine.c   | 26 --
 target/ppc/translate.c |  4 +++
 tests/avocado/replay_kernel.py |  3 +-
 tests/avocado/reverse_debugging.py | 54 +++---
 9 files changed, 139 insertions(+), 15 deletions(-)

-- 
2.40.1




Re: [PATCH] gitlab: remove duplication between msys jobs

2023-07-26 Thread Marc-André Lureau
Hi

On Wed, Jul 26, 2023 at 10:21 PM Thomas Huth  wrote:
>
> On 26/07/2023 18.19, Daniel P. Berrangé wrote:
> > Although they share a common parent, the two msys jobs still have
> > massive duplication in their script definitions that can easily be
> > collapsed.
> >
> > Signed-off-by: Daniel P. Berrangé 
> > ---
> >   .gitlab-ci.d/windows.yml | 132 +++
> >   1 file changed, 49 insertions(+), 83 deletions(-)
>
> We originally had different sets of packages in the 32-bit and 64-bit jobs,
> to distribute the load between the two jobs ... but it got unified in commit
> 14547e0877f3522. Now considering that we are facing timeouts again, we
> should maybe rather revert that commit instead of unifying the lists forever?
>
> Anyway, before we unify the compiler package name suffix between the two
> jobs, I really would like to see whether the mingw Clang builds QEMU faster
> in the 64-bit job ... but so far I failed to convince meson to accept the
> Clang from the mingw package ... does anybody know how to use Clang with
> MSYS2 properly?

I checked it this week (because of bug #1782), and it compiled
successfully. Although I think we may have some issues with clang on
windows, as it doesn't pack structs the expected way. See also:
https://discourse.llvm.org/t/how-to-undo-the-effect-of-mms-bitfields/72271.

It may be a good idea to add some extra static checks about our packed
struct padding expectations.
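
A minimal sketch of the kind of static check meant here (the struct and its
expected size are made up for illustration): fail the build at compile time
if the compiler packs the struct differently than expected:

    #include <stdint.h>

    struct __attribute__((packed)) example_hdr {
        uint8_t  type;
        uint32_t len;
    };

    /* build fails if the compiler does not pack this to 5 bytes */
    _Static_assert(sizeof(struct example_hdr) == 5,
                   "example_hdr layout differs from the expected packing");

    int main(void)
    {
        return 0;
    }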

Eh, it didn't feel much faster to compile with clang :)




[PATCH 0/6] ppc fixes possibly for 8.1

2023-07-26 Thread Nicholas Piggin
Sorry for the delay following up on the fixes, I got sucked down
the decrementer rabbit hole that took longer than expected.

Question about what is suitable for merge at this time and what
should be stable. The first 3 have caused crashes or hangs running
Linux and other software. Second 3 fix some issues with dec that
could cause problems, especially with migration. But they affect
more machines in more complex ways than the first 3.

No changes to the first 3 already posted except to add Fixes:
tags.

Thanks,
Nick

Nicholas Piggin (6):
  target/ppc: Implement ASDR register for ISA v3.0 for HPT
  target/ppc: Fix VRMA page size for ISA v3.0
  target/ppc: Fix pending HDEC when entering PM state
  hw/ppc: Avoid decrementer rounding errors
  hw/ppc: Always store the decrementer value
  target/ppc: Migrate DECR SPR

 hw/ppc/ppc.c | 47 +++
 target/ppc/excp_helper.c |  6 
 target/ppc/machine.c | 14 +
 target/ppc/mmu-hash64.c  | 68 ++--
 4 files changed, 98 insertions(+), 37 deletions(-)

-- 
2.40.1




[PATCH for-8.1] target/arm: Fix MemOp for STGP

2023-07-26 Thread Richard Henderson
When converting to decodetree, the code to rebuild mop for the pair
only made it into trans_STP and not into trans_STGP.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1790
Fixes: 8c212eb6594 ("target/arm: Convert load/store-pair to decodetree")
Signed-off-by: Richard Henderson 
---
 target/arm/tcg/translate-a64.c | 21 ++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index ef0c47407a..5fa1257d32 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -3004,6 +3004,9 @@ static bool trans_STGP(DisasContext *s, arg_ldstpair *a)
 MemOp mop;
 TCGv_i128 tmp;
 
+/* STGP only comes in one size. */
+tcg_debug_assert(a->sz == MO_64);
+
 if (!dc_isar_feature(aa64_mte_insn_reg, s)) {
 return false;
 }
@@ -3029,13 +3032,25 @@ static bool trans_STGP(DisasContext *s, arg_ldstpair *a)
 gen_helper_stg(cpu_env, dirty_addr, dirty_addr);
 }
 
-mop = finalize_memop(s, a->sz);
-clean_addr = gen_mte_checkN(s, dirty_addr, true, false, 2 << a->sz, mop);
+mop = finalize_memop(s, MO_64);
+clean_addr = gen_mte_checkN(s, dirty_addr, true, false, 2 << MO_64, mop);
 
 tcg_rt = cpu_reg(s, a->rt);
 tcg_rt2 = cpu_reg(s, a->rt2);
 
-assert(a->sz == 3);
+/*
+ * STGP is defined as two 8-byte memory operations and one tag operation.
+ * We implement it as one single 16-byte memory operation for convenience.
+ * Rebuild mop as for STP.
+ * TODO: The atomicity with LSE2 is stronger than required.
+ * Need a form of MO_ATOM_WITHIN16_PAIR that never requires
+ * 16-byte atomicity.
+ */
+mop = MO_128;
+if (s->align_mem) {
+mop |= MO_ALIGN_8;
+}
+mop = finalize_memop_pair(s, mop);
 
 tmp = tcg_temp_new_i128();
 if (s->be_data == MO_LE) {
-- 
2.34.1




Re: [PATCH 1/1] qemu-nbd: regression with arguments passing into nbd_client_thread()

2023-07-26 Thread Eric Blake
On Wed, Jul 26, 2023 at 04:52:47PM +0200, Denis V. Lunev wrote:
> Unfortunately
> commit 03b67621445d601c9cdc7dfe25812e9f19b81488
> Author: Denis V. Lunev 
> Date:   Mon Jul 17 16:55:40 2023 +0200
> qemu-nbd: pass structure into nbd_client_thread instead of plain char*
> has introduced a regression. struct NbdClientOpts resides on the stack
> inside an 'if' block. This specifically means that this stack space can be
> reused once execution leaves that block of code.
> 
> This means that parameters passed into nbd_client_thread could be
> overwritten at any moment.
> 
> The patch moves the data to the namespace of main() function effectively
> preserving it for the whole process lifetime.
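
To make the lifetime issue concrete, here is a minimal standalone sketch
(not the qemu-nbd code; the names are illustrative) of handing a thread a
pointer to a block-scoped struct:

    #include <pthread.h>
    #include <stdio.h>

    struct opts { int verbose; };

    static void *worker(void *arg)
    {
        struct opts *o = arg;       /* may point at reused stack memory */
        printf("verbose=%d\n", o->verbose);
        return NULL;
    }

    int main(void)
    {
        pthread_t t;

        if (1) {
            struct opts o = { .verbose = 1 };   /* scoped to this block */
            pthread_create(&t, NULL, worker, &o);
        }   /* 'o' is out of scope here, but the thread may still read it */
        pthread_join(t, NULL);
        return 0;
    }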
> 
> Signed-off-by: Denis V. Lunev 
> CC: Eric Blake 
> CC: Vladimir Sementsov-Ogievskiy 
> CC: 
> ---
>  qemu-nbd.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/qemu-nbd.c b/qemu-nbd.c
> index 5b2757920c..7a15085ade 100644
> --- a/qemu-nbd.c
> +++ b/qemu-nbd.c
> @@ -589,6 +589,7 @@ int main(int argc, char **argv)
>  const char *pid_file_name = NULL;
>  const char *selinux_label = NULL;
>  BlockExportOptions *export_opts;
> +struct NbdClientOpts opts;
>  
>  #ifdef CONFIG_POSIX
>  os_setup_early_signal_handling();
> @@ -1145,7 +1146,7 @@ int main(int argc, char **argv)
>  if (device) {
>  #if HAVE_NBD_DEVICE
>  int ret;
> -struct NbdClientOpts opts = {
> +opts = (struct NbdClientOpts) {

Does this cause a compiler warning for an unused variable when
HAVE_NBD_DEVICE is not set?  If so, the solution is to also wrap the
declaration in the same #if.  I'll see if I can figure out the CI
enough to prove (or disprove) my theory on a BSD machine which likely
lacks HAVE_NBD_DEVICE.
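
A minimal sketch of the suggested wrapping (the struct member is made up;
qemu-nbd's real struct has more fields):

    #include <stdio.h>

    #define HAVE_NBD_DEVICE 0    /* pretend we are on a platform without it */

    struct NbdClientOpts { int verbose; };

    int main(void)
    {
    #if HAVE_NBD_DEVICE
        struct NbdClientOpts opts;    /* declared only where it is used */

        opts = (struct NbdClientOpts) { .verbose = 1 };
        printf("verbose=%d\n", opts.verbose);
    #endif
        return 0;
    }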

With that addressed, I'm fine with:

Reviewed-by: Eric Blake 

and I will queue it through my NBD tree in time for 8.1-rc2.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org




[PATCH 7/7] tests/avocado: ppc64 reverse debugging tests for pseries and powernv

2023-07-26 Thread Nicholas Piggin
These machines run reverse-debugging well enough to pass basic tests.
Wire them up.

Cc: Pavel Dovgalyuk 
Signed-off-by: Nicholas Piggin 
---
 tests/avocado/reverse_debugging.py | 29 +
 1 file changed, 29 insertions(+)

diff --git a/tests/avocado/reverse_debugging.py 
b/tests/avocado/reverse_debugging.py
index 7d1a478df1..fc47874eda 100644
--- a/tests/avocado/reverse_debugging.py
+++ b/tests/avocado/reverse_debugging.py
@@ -233,3 +233,32 @@ def test_aarch64_virt(self):
 
 self.reverse_debugging(
 args=('-kernel', kernel_path))
+
+class ReverseDebugging_ppc64(ReverseDebugging):
+"""
+:avocado: tags=accel:tcg
+"""
+
+REG_PC = 0x40
+
+# unidentified gitlab timeout problem
+@skipIf(os.getenv('GITLAB_CI'), 'Running on GitLab')
+def test_ppc64_pseries(self):
+"""
+:avocado: tags=arch:ppc64
+:avocado: tags=machine:pseries
+"""
+# SLOF branches back to its entry point, which causes this test
+# to take the 'hit a breakpoint again' path. That's not a problem,
+# just slightly different than the other machines.
+self.endian_is_le = False
+self.reverse_debugging()
+
+@skipIf(os.getenv('GITLAB_CI'), 'Running on GitLab')
+def test_ppc64_powernv(self):
+"""
+:avocado: tags=arch:ppc64
+:avocado: tags=machine:powernv
+"""
+self.endian_is_le = False
+self.reverse_debugging()
-- 
2.40.1




[PATCH 6/7] tests/avocado: reverse-debugging cope with re-executing breakpoints

2023-07-26 Thread Nicholas Piggin
The reverse-debugging test creates a trace, then replays it and:

1. Steps the first 10 instructions and records their addresses.
2. Steps backward and verifies their addresses match.
3. Runs to (near) the end of the trace.
4. Sets breakpoints on the first 10 instructions.
5. Continues backward and verifies execution stops at the last
   breakpoint.

Step 5 breaks if any of the other 9 breakpoints are re-executed in the
trace after the 10th instruction is run, because those will be
unexpectedly hit when reverse continuing. This situation does arise
with the ppc pseries machine, the SLOF bios branches to its own entry
point.

Permit this breakpoint re-execution by switching steps 4 and 5, so that
the trace will be run to the end *or* the next breakpoint hit.
Reversing from there to the 10th instruction will not hit another
breakpoint, by definition.

Another step is added between steps 2 and 3, which steps forward over
the first 10 instructions and verifies their addresses, to support this.

Cc: Pavel Dovgalyuk 
Signed-off-by: Nicholas Piggin 
---
 tests/avocado/reverse_debugging.py | 25 +
 1 file changed, 21 insertions(+), 4 deletions(-)

diff --git a/tests/avocado/reverse_debugging.py 
b/tests/avocado/reverse_debugging.py
index 680c314cfc..7d1a478df1 100644
--- a/tests/avocado/reverse_debugging.py
+++ b/tests/avocado/reverse_debugging.py
@@ -150,16 +150,33 @@ def reverse_debugging(self, shift=7, args=None):
 self.check_pc(g, addr)
 logger.info('found position %x' % addr)
 
-logger.info('seeking to the end (icount %s)' % (last_icount - 1))
-vm.qmp('replay-break', icount=last_icount - 1)
-# continue - will return after pausing
-g.cmd(b'c', b'T02thread:01;')
+# visit the recorded instruction in forward order
+logger.info('stepping forward')
+for addr in steps:
+self.check_pc(g, addr)
+self.gdb_step(g)
+logger.info('found position %x' % addr)
 
+# set breakpoints for the instructions just stepped over
 logger.info('setting breakpoints')
 for addr in steps:
 # hardware breakpoint at addr with len=1
 g.cmd(b'Z1,%x,1' % addr, b'OK')
 
+# this may hit a breakpoint if first instructions are executed
+# again
+logger.info('continuing execution')
+vm.qmp('replay-break', icount=last_icount - 1)
+# continue - will return after pausing
+# This could stop at the end and get a T02 return, or by
+# re-executing one of the breakpoints and get a T05 return.
+g.cmd(b'c')
+if self.vm_get_icount(vm) == last_icount - 1:
+logger.info('reached the end (icount %s)' % (last_icount - 1))
+else:
+logger.info('hit a breakpoint again at %x (icount %s)' %
+(self.get_pc(g), self.vm_get_icount(vm)))
+
 logger.info('running reverse continue to reach %x' % steps[-1])
 # reverse continue - will return after stopping at the breakpoint
 g.cmd(b'bc', b'T05thread:01;')
-- 
2.40.1




[PATCH 1/2] virtio-gpu: free BHs, by implementing unrealize

2023-07-26 Thread marcandre . lureau
From: Marc-André Lureau 

Signed-off-by: Marc-André Lureau 
---
 include/hw/virtio/virtio-gpu.h |  1 +
 hw/display/virtio-gpu-base.c   |  2 +-
 hw/display/virtio-gpu.c| 10 ++
 3 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/include/hw/virtio/virtio-gpu.h b/include/hw/virtio/virtio-gpu.h
index 7ea8ae2bee..05bee09e1a 100644
--- a/include/hw/virtio/virtio-gpu.h
+++ b/include/hw/virtio/virtio-gpu.h
@@ -238,6 +238,7 @@ bool virtio_gpu_base_device_realize(DeviceState *qdev,
 VirtIOHandleOutput ctrl_cb,
 VirtIOHandleOutput cursor_cb,
 Error **errp);
+void virtio_gpu_base_device_unrealize(DeviceState *qdev);
 void virtio_gpu_base_reset(VirtIOGPUBase *g);
 void virtio_gpu_base_fill_display_info(VirtIOGPUBase *g,
 struct virtio_gpu_resp_display_info *dpy_info);
diff --git a/hw/display/virtio-gpu-base.c b/hw/display/virtio-gpu-base.c
index 7ab7d08d0a..ca1fb7b16f 100644
--- a/hw/display/virtio-gpu-base.c
+++ b/hw/display/virtio-gpu-base.c
@@ -244,7 +244,7 @@ virtio_gpu_base_set_features(VirtIODevice *vdev, uint64_t 
features)
 trace_virtio_gpu_features(((features & virgl) == virgl));
 }
 
-static void
+void
 virtio_gpu_base_device_unrealize(DeviceState *qdev)
 {
 VirtIOGPUBase *g = VIRTIO_GPU_BASE(qdev);
diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c
index e8603d78ca..b1f5d392bb 100644
--- a/hw/display/virtio-gpu.c
+++ b/hw/display/virtio-gpu.c
@@ -1392,6 +1392,15 @@ void virtio_gpu_device_realize(DeviceState *qdev, Error 
**errp)
 QTAILQ_INIT(&g->fenceq);
 }
 
+static void virtio_gpu_device_unrealize(DeviceState *qdev)
+{
+VirtIOGPU *g = VIRTIO_GPU(qdev);
+
+g_clear_pointer(&g->ctrl_bh, qemu_bh_delete);
+g_clear_pointer(&g->cursor_bh, qemu_bh_delete);
+virtio_gpu_base_device_unrealize(qdev);
+}
+
 void virtio_gpu_reset(VirtIODevice *vdev)
 {
 VirtIOGPU *g = VIRTIO_GPU(vdev);
@@ -1492,6 +1501,7 @@ static void virtio_gpu_class_init(ObjectClass *klass, 
void *data)
 vgbc->gl_flushed = virtio_gpu_handle_gl_flushed;
 
 vdc->realize = virtio_gpu_device_realize;
+vdc->unrealize = virtio_gpu_device_unrealize;
 vdc->reset = virtio_gpu_reset;
 vdc->get_config = virtio_gpu_get_config;
 vdc->set_config = virtio_gpu_set_config;
-- 
2.41.0




Re: [PATCH v10 09/10] migration: Implement MigrateChannelList to hmp migration flow.

2023-07-26 Thread Het Gala



On 26/07/23 10:11 pm, Daniel P. Berrangé wrote:

On Wed, Jul 26, 2023 at 10:08:05PM +0530, Het Gala wrote:

On 26/07/23 8:25 pm, Daniel P. Berrangé wrote:

On Wed, Jul 26, 2023 at 02:18:32PM +, Het Gala wrote:

Integrate MigrateChannelList with all transport backends
(socket, exec and rdma) for both src and dest migration
endpoints for hmp migration.

Suggested-by: Aravind Retnakaran 
Signed-off-by: Het Gala 
---
   migration/migration-hmp-cmds.c | 15 +--
   migration/migration.c  |  5 ++---
   migration/migration.h  |  3 ++-
   3 files changed, 17 insertions(+), 6 deletions(-)

Reviewed-by: Daniel P. Berrangé 


Daniel, I have got reviewed-by / Acked-by from Markus (patch 1 & 6) and you
(remaining patches). What should be the next steps here ? Do I need to send
a new patchset / wait for more maintainers for their reviews ? Please
advice.

At this point it is for Juan (as the migration primary maintainer) to
either queue it, or send you more review comments.


Ack. Thanks for your review comments on the patches.


With regards,
Daniel

Regards,
Het Gala



Re: [PATCH v2 0/7] migration: Better error handling in return path thread

2023-07-26 Thread Peter Xu
Hi, Fabiano,

Sorry to be late on responding.

On Tue, Jul 25, 2023 at 03:24:26PM -0300, Fabiano Rosas wrote:
> Fabiano Rosas  writes:
> 
> > Peter Xu  writes:
> >
> >> v2:
> >> - Patch "migration: Provide explicit error message for file shutdowns"
> >>   - Touched up qapi doc [Fabiano]
> >>   - Added Bugzilla link to commit which I didn't even notice that I was
> >> fixing a bug.. but rightfully pointed out by Laszlo.
> >>   - Moved it to the 1st patch because it fixes a bug, please consider
> >> review and merge it earlier.
> >>
> >> This is a small series that reworks error handling of postcopy return path
> >> threads.
> >>
> >> We used to contain a bunch of error_report(), converting them into
> >> error_setg() properly and deliver any of those errors to migration generic
> >> error reports (via migrate_set_error()).  Then these errors can also be
> >> observed in query-migrate after postcopy is paused.
> >>
> >> Dropped the return-path specific error reporting: mark_source_rp_bad(),
> >> because it's a duplication if we can always use migrate_set_error().
> >>
> >> Please have a look, thanks.
> >>
> >> Peter Xu (7):
> >>   migration: Display error in query-migrate irrelevant of status
> >>   migration: Let migrate_set_error() take ownership
> >>   migration: Introduce migrate_has_error()
> >>   migration: Refactor error handling in source return path
> >>   migration: Deliver return path file error to migrate state too
> >>   qemufile: Always return a verbose error
> >>   migration: Provide explicit error message for file shutdowns
> >>
> >>  qapi/migration.json  |   5 +-
> >>  migration/migration.h|   8 +-
> >>  migration/ram.h  |   5 +-
> >>  migration/channel.c  |   1 -
> >>  migration/migration.c| 168 +++
> >>  migration/multifd.c  |  10 +--
> >>  migration/postcopy-ram.c |   1 -
> >>  migration/qemu-file.c|  20 -
> >>  migration/ram.c  |  42 +-
> >>  migration/trace-events   |   2 +-
> >>  10 files changed, 149 insertions(+), 113 deletions(-)
> >
> > Hi Peter,
> >
> > Were you aiming at solving any specific bug with this series?

Nop.  I simply noticed that the error in return path cannot be proxied to
migration object and thought maybe we should do that.

Thanks for looking into this problem.

> > I'm seeing
> > a bug on master (361d5397355) with the
> > /x86_64/migration/postcopy/preempt/recovery/plain test around the areas
> > that this series touches.
> >
> > It happens very rarely and I'm still investigating, but in case you have
> > any thoughts:
> >
> > 
> > It seems there's a race condition between postcopy resume and the return
> > path cleanup.
> >
> > It is possible for open_return_path_on_source() to setup the new
> > QEMUFile *before* the cleanup path at source_return_path_thread() has
> > had a chance to run, so we end up calling migration_release_dst_files()
> > on the new file and ms->rp_state.from_dst_file gets set to NULL again,
> > leading to a SIGSEGV at qemu_file_get_error(rp) due to rp being NULL.
> 
> I did some more digging and this is indeed what happens. When we pause
> on the incoming side, the to_src_file is closed and the source return
> path sees an error (EBADFD) which leads to the cleanup (from_dst_file =
> NULL). This happens independently and without any synchronization with a
> potential concurrent resume operation.
> 
> Is there a reason for not closing the return path thread and starting a
> new one for resume?

Not anything special.  When I initially worked on that (quite a few years
ago) I thought it would be simplest to keep hold of as many things as
possible, including the threads.  But maybe it's not too hard either to
just reinitiate the thread when resume indeed.

> The from_dst_file is the only thing being changed
> anyway. It would allow us to remove the retry logic along with the
> problematic cleanup path and not need another synchronization point
> between qmp_migrate() and the return path.

I'm fine if you want to remove the return path thread when a pause happens,
as long as we can cleanly do that; if you already drafted something and
looks all good from your side, happy to read it.  We may somewhere need
another call to await_return_path_close_on_source() when pause from
migration thread.

The other way is the main thread can stupidly wait for the two files to be
released, namely, from_dst_file and postcopy_qemufile_src.

> 
> Here's the race (important bit is open_return_path happening before
> migration_release_dst_files):
> 
> migration | qmp | return path
> --+-+-
> qmp_migrate_pause()
>  shutdown(ms->to_dst_file)
>   f->last_error = -EIO
> migrate_detect_error()
>  postcopy_pause()
>   set_state(PAUSED)
>   wait(postcopy_pause_sem)
>

Re: [PATCH v2 4/6] python/machine: use socketpair() for console connections

2023-07-26 Thread John Snow
On Wed, Jul 26, 2023, 6:50 AM Ani Sinha  wrote:

>
>
> > On 25-Jul-2023, at 11:33 PM, John Snow  wrote:
> >
> > Create a socketpair for the console output. This should help eliminate
> > race conditions around console text early in the boot process that might
> > otherwise have been dropped on the floor before being able to connect to
> > QEMU under "server,nowait".
> >
> > Signed-off-by: John Snow 
>
> Thanks for doing this. I recall we spoke about this late last year in the
> context of fixing my bios-bits avocado test and adding a console output
> there.
>

Yep! I think you need a few more changes to do what you wanted. IIRC, you
also want to be able to drain the console log while waiting for the vm to
terminate of its own accord, which I don't support yet.

(If you use console socket's self draining mode, it should be possible to
forego the early termination of the console socket and allow this behavior.
Maybe I can work that in now...)

Anything else I'm forgetting ...?

Except the concern below,
>
> Reviewed-by: Ani Sinha 
>

Thanks 


>
> > ---
> > python/qemu/machine/machine.py | 30 +++---
> > 1 file changed, 27 insertions(+), 3 deletions(-)
> >
> > diff --git a/python/qemu/machine/machine.py
> b/python/qemu/machine/machine.py
> > index 26f0fb8a81..09f214c95c 100644
> > --- a/python/qemu/machine/machine.py
> > +++ b/python/qemu/machine/machine.py
> > @@ -159,6 +159,8 @@ def __init__(self,
> >
> > self._name = name or f"{id(self):x}"
> > self._sock_pair: Optional[Tuple[socket.socket, socket.socket]] =
> None
> > +self._cons_sock_pair: Optional[
> > +Tuple[socket.socket, socket.socket]] = None
> > self._temp_dir: Optional[str] = None
> > self._base_temp_dir = base_temp_dir
> > self._sock_dir = sock_dir
> > @@ -315,8 +317,9 @@ def _base_args(self) -> List[str]:
> > for _ in range(self._console_index):
> > args.extend(['-serial', 'null'])
> > if self._console_set:
> > -chardev = ('socket,id=console,path=%s,server=on,wait=off' %
> > -   self._console_address)
> > +assert self._cons_sock_pair is not None
> > +fd = self._cons_sock_pair[0].fileno()
> > +chardev = f"socket,id=console,fd={fd}"
> > args.extend(['-chardev', chardev])
> > if self._console_device_type is None:
> > args.extend(['-serial', 'chardev:console'])
> > @@ -351,6 +354,10 @@ def _pre_launch(self) -> None:
> > nickname=self._name
> > )
> >
> > +if self._console_set:
> > +self._cons_sock_pair = socket.socketpair()
> > +os.set_inheritable(self._cons_sock_pair[0].fileno(), True)
> > +
> > # NOTE: Make sure any opened resources are *definitely* freed in
> > # _post_shutdown()!
> > # pylint: disable=consider-using-with
> > @@ -368,6 +375,9 @@ def _pre_launch(self) -> None:
> > def _post_launch(self) -> None:
> > if self._sock_pair:
> > self._sock_pair[0].close()
> > +if self._cons_sock_pair:
> > +self._cons_sock_pair[0].close()
> > +
> > if self._qmp_connection:
> > if self._sock_pair:
> > self._qmp.connect()
> > @@ -518,6 +528,11 @@ def _early_cleanup(self) -> None:
> > self._console_socket.close()
> > self._console_socket = None
> >
> > +if self._cons_sock_pair:
> > +self._cons_sock_pair[0].close()
> > +self._cons_sock_pair[1].close()
> > +self._cons_sock_pair = None
> > +
> > def _hard_shutdown(self) -> None:
> > """
> > Perform early cleanup, kill the VM, and wait for it to terminate.
> > @@ -878,10 +893,19 @@ def console_socket(self) -> socket.socket:
> > Returns a socket connected to the console
> > """
> > if self._console_socket is None:
> > +if not self._console_set:
> > +raise QEMUMachineError(
> > +"Attempt to access console socket with no
> connection")
> > +assert self._cons_sock_pair is not None
> > +# os.dup() is used here for sock_fd because otherwise we'd
> > +# have two rich python socket objects that would each try to
> > +# close the same underlying fd when either one gets garbage
> > +# collected.
> > self._console_socket = console_socket.ConsoleSocket(
> > -self._console_address,
> > +sock_fd=os.dup(self._cons_sock_pair[1].fileno()),
> > file=self._console_log_path,
> > drain=self._drain_console)
> > +self._cons_sock_pair[1].close()
>
> I am not 100% sure but should we save the new sock_fd here? Like
> self._cons_sock_pair[1] = sock_fd ;
>
> Then next time console_socket() is invoked, the correct fd will be duped.
>

It should be cached to 

Re: [PATCH] migration/ram: Refactor precopy ram loading code

2023-07-26 Thread Peter Xu
On Tue, Jul 25, 2023 at 10:26:51AM -0300, Fabiano Rosas wrote:
> From: Nikolay Borisov 
> 
> Extract the ramblock parsing code into a routine that operates on the
> sequence of headers from the stream and another the parses the
> individual ramblock. This makes ram_load_precopy() easier to
> comprehend.
> 
> Signed-off-by: Nikolay Borisov 
> Signed-off-by: Fabiano Rosas 

Reviewed-by: Peter Xu 

Still a few comments, not directly relevant to this patch.

> ---
> I'm extracting the parts from the fixed-ram migration series [1] that
> could already go in. This patch is one of them.
> 
> 1- https://lore.kernel.org/r/20230330180336.2791-1-faro...@suse.de
> ---
>  migration/ram.c | 141 +++-
>  1 file changed, 79 insertions(+), 62 deletions(-)
> 
> diff --git a/migration/ram.c b/migration/ram.c
> index 0ada6477e8..685e129b70 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -3826,6 +3826,82 @@ void colo_flush_ram_cache(void)
>  trace_colo_flush_ram_cache_end();
>  }
>  
> +static int parse_ramblock(QEMUFile *f, RAMBlock *block, ram_addr_t length)
> +{
> +int ret = 0;
> +/* ADVISE is earlier, it shows the source has the postcopy capability on 
> */
> +bool postcopy_advised = migration_incoming_postcopy_advised();
> +
> +assert(block);
> +
> +if (!qemu_ram_is_migratable(block)) {
> +error_report("block %s should not be migrated !", block->idstr);
> +return -EINVAL;
> +}
> +
> +if (length != block->used_length) {
> +Error *local_err = NULL;
> +
> +ret = qemu_ram_resize(block, length, &local_err);
> +if (local_err) {
> +error_report_err(local_err);
> +}

Starting from here, IIUC we can already fail the whole thing if ret!=0.
IOW, ram_control_load_hook() may not be required for error cases.

Can leave that until later even if it applies.

> +}
> +/* For postcopy we need to check hugepage sizes match */
> +if (postcopy_advised && migrate_postcopy_ram() &&
> +block->page_size != qemu_host_page_size) {
> +uint64_t remote_page_size = qemu_get_be64(f);
> +if (remote_page_size != block->page_size) {
> +error_report("Mismatched RAM page size %s "
> + "(local) %zd != %" PRId64, block->idstr,
> + block->page_size, remote_page_size);
> +ret = -EINVAL;
> +}
> +}
> +if (migrate_ignore_shared()) {
> +hwaddr addr = qemu_get_be64(f);
> +if (migrate_ram_is_ignored(block) &&
> +block->mr->addr != addr) {
> +error_report("Mismatched GPAs for block %s "
> + "%" PRId64 "!= %" PRId64, block->idstr,
> + (uint64_t)addr, (uint64_t)block->mr->addr);
> +ret = -EINVAL;
> +}
> +}
> +ram_control_load_hook(f, RAM_CONTROL_BLOCK_REG, block->idstr);
> +
> +return ret;
> +}
> +
> +static int parse_ramblocks(QEMUFile *f, ram_addr_t total_ram_bytes)
> +{
> +int ret = 0;
> +
> +/* Synchronize RAM block list */
> +while (!ret && total_ram_bytes) {
> +RAMBlock *block;
> +char id[256];
> +ram_addr_t length;
> +int len = qemu_get_byte(f);
> +
> +qemu_get_buffer(f, (uint8_t *)id, len);
> +id[len] = 0;
> +length = qemu_get_be64(f);
> +
> +block = qemu_ram_block_by_name(id);
> +if (block) {
> +ret = parse_ramblock(f, block, length);
> +} else {
> +error_report("Unknown ramblock \"%s\", cannot accept "
> + "migration", id);
> +ret = -EINVAL;
> +}
> +total_ram_bytes -= length;
> +}
> +
> +return ret;
> +}
> +
>  /**
>   * ram_load_precopy: load pages in precopy case
>   *
> @@ -3840,14 +3916,13 @@ static int ram_load_precopy(QEMUFile *f)
>  {
>  MigrationIncomingState *mis = migration_incoming_get_current();
>  int flags = 0, ret = 0, invalid_flags = 0, len = 0, i = 0;
> -/* ADVISE is earlier, it shows the source has the postcopy capability on 
> */
> -bool postcopy_advised = migration_incoming_postcopy_advised();
> +
>  if (!migrate_compress()) {
>  invalid_flags |= RAM_SAVE_FLAG_COMPRESS_PAGE;
>  }
>  
>  while (!ret && !(flags & RAM_SAVE_FLAG_EOS)) {
> -ram_addr_t addr, total_ram_bytes;
> +ram_addr_t addr;
>  void *host = NULL, *host_bak = NULL;
>  uint8_t ch;
>  
> @@ -3918,65 +3993,7 @@ static int ram_load_precopy(QEMUFile *f)
>  
>  switch (flags & ~RAM_SAVE_FLAG_CONTINUE) {
>  case RAM_SAVE_FLAG_MEM_SIZE:
> -/* Synchronize RAM block list */
> -total_ram_bytes = addr;
> -while (!ret && total_ram_bytes) {
> -RAMBlock *block;
> -char id[256];
> -ram_addr_t length;
> -
> -len = qemu_get_byte(f);
> -

Re: i386/xen: prevent guest from binding loopback event channel to itself

2023-07-26 Thread Bernhard Beschow



On 26 July 2023 09:24:28 UTC, Paul Durrant wrote:
>On 26/07/2023 10:07, David Woodhouse wrote:
>> On Wed, 2023-07-26 at 09:44 +0100, Paul Durrant wrote:
>>> On 25/07/2023 11:05, David Woodhouse wrote:
 From: David Woodhouse 
 
 Fuzzing showed that a guest could bind an interdomain port to itself, by
 guessing the next port to be allocated and putting that as the 'remote'
 port number. By chance, that works because the newly-allocated port has
 type EVTCHNSTAT_unbound. It shouldn't.
 
 Signed-off-by: David Woodhouse 
 ---
    hw/i386/kvm/xen_evtchn.c | 11 +--
    1 file changed, 9 insertions(+), 2 deletions(-)
 
>>> 
>>> Reviewed-by: Paul Durrant 
>>> 
>> 
>> Thanks. I'll change the title prefix to 'hw/xen' since it's in hw/ not
>> target/i386.
>
>Yes, makes sense.
>
>> Please can I have also have a review for
>> https://lore.kernel.org/qemu-devel/20076888f6bdf06a65aafc5cf954260965d45b97.ca...@infradead.org/
>> 
>
>Sorry I missed that. Done.

And that one, too please? 
https://lore.kernel.org/qemu-devel/20230720072950.20198-1-o...@aepfle.de/

Sorry for cross posting, but the patch would be good to have in 8.1.

Best regards,
Bernhard

>
>Cheers,
>
>  Paul
>



Re: [RFC PATCH] docs/interop: define STANDALONE protocol feature for vhost-user

2023-07-26 Thread Stefan Hajnoczi
On Wed, Jul 26, 2023 at 12:02:33PM -0400, Michael S. Tsirkin wrote:
> On Mon, Jul 24, 2023 at 02:08:39PM -0400, Stefan Hajnoczi wrote:
> > On Thu, Jul 20, 2023 at 06:22:08PM -0400, Michael S. Tsirkin wrote:
> > > On Thu, Jul 20, 2023 at 05:31:03PM -0400, Stefan Hajnoczi wrote:
> > > > On Thu, 20 Jul 2023 at 17:15, Michael S. Tsirkin  
> > > > wrote:
> > > > >
> > > > > On Thu, Jul 20, 2023 at 03:58:37PM -0400, Stefan Hajnoczi wrote:
> > > > > > On Thu, Jul 06, 2023 at 12:48:20PM -0400, Michael S. Tsirkin wrote:
> > > > > > > On Tue, Jul 04, 2023 at 01:36:00PM +0100, Alex Bennée wrote:
> > > > > > > > Currently QEMU has to know some details about the back-end to 
> > > > > > > > be able
> > > > > > > > to setup the guest. While various parts of the setup can be 
> > > > > > > > delegated
> > > > > > > > to the backend (for example config handling) this is a very 
> > > > > > > > piecemeal
> > > > > > > > approach.
> > > > > > >
> > > > > > > > This patch suggests a new feature flag 
> > > > > > > > (VHOST_USER_PROTOCOL_F_STANDALONE)
> > > > > > > > which the back-end can advertise which allows a probe message 
> > > > > > > > to be
> > > > > > > > sent to get all the details QEMU needs to know in one message.
> > > > > > >
> > > > > > > The reason we do piecemeal is that these existing pieces can be 
> > > > > > > reused
> > > > > > > as others evolve or fall by wayside.
> > > > > > >
> > > > > > > For example, I can think of instances where you want to connect
> > > > > > > specifically to e.g. networking backend, and specify it
> > > > > > > on command line. Reasons could be many, e.g. for debugging,
> > > > > > > or to prevent connecting to wrong device on wrong channel
> > > > > > > (kind of like type safety).
> > > > > > >
> > > > > > > What is the reason to have 1 message? startup latency?
> > > > > > > How about we allow pipelining several messages then?
> > > > > > > Will be easier.
> > > > > >
> > > > > > This flag effectively says that the back-end is a full VIRTIO device
> > > > > > with a Device Status Register, Configuration Space, Virtqueues, the
> > > > > > device type, etc. This is different from previous vhost-user devices
> > > > > > which sometimes just offloaded certain virtqueues without providing 
> > > > > > the
> > > > > > full VIRTIO device (parts were emulated in the VMM).
> > > > > >
> > > > > > So for example, a vhost-user-net device does not support the 
> > > > > > controlq.
> > > > > > Alex's "standalone" device is a mode where the vhost-user protocol 
> > > > > > is
> > > > > > used but the back-end must implement a full virtio-net device.
> > > > > > Standalone devices are like vDPA device in this respect.
> > > > > >
> > > > > > I think it is important to have a protocol feature bit that 
> > > > > > advertises
> > > > > > that this is a standalone device, since the semantics are different 
> > > > > > for
> > > > > > traditional vhost-user-net devices.
> > > > >
> > > > > Not sure what that would gain as compared to a feature bit per
> > > > > message as we did previously.
> > > > 
> > > > Having a single feature bit makes it easier to distinguish between a
> > > > traditional vhost-user device and a standalone device.
> > > > 
> > > > For example, the presence of VHOST_USER_F_GET_DEVICE_ID doesn't tell
> > > > you whether this device is a standalone device that is appropriate for
> > > > a new generic QEMU --device vhost-user-device feature that Alex is
> > > > working on. It could be a traditional vhost-user device that is not
> > > > standalone but implements the VHOST_USER_GET_DEVICE_ID message.
> > > > 
> > > > How will we detect standalone devices? It will be messy if there is no
> > > > single feature bit that advertises that this back-end is a standalone
> > > > device.
> > > > 
> > > > Stefan
> > > 
> > > Looks like standalone implies some 5-6 messages to be supported.
> > > So just test the 6 bits are all ones.
> > 
> > It's not clear to me that the individual bits together mean this is
> > really a standalone device, but let's go with individual commands and
> > see if a front-end can distinguish standalone devices or not. If not,
> > then we can still add "standalone" feature bit before merging the code.
> > 
> > Stefan
> 
> 
> I think it just shows that what a "standalone" device is just isn't
> that well defined ;).

No, I think it's the opposite way around. The existing vhost model is
not well defined. From time to time someone has to extend it to add
things like Configuration Space support or VHOST_USER_GET_DEVICE_ID.

Standalone devices are well defined: they are what the VIRTIO
specification describes.

My concern is that new messages added for standalone devices might also
be useful for existing non-standalone devices. Or do you want to declare
these new messages off-limits to non-standalone devices?

Stefan


signature.asc
Description: PGP signature
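
As a side note for readers following the feature-bit debate above: a minimal
sketch, assuming QEMU's existing VHOST_USER_PROTOCOL_F_* bit positions, of
what "just test the bits are all ones" would look like on the front-end side.
The exact set of required bits is precisely what is still being discussed, so
the mask below is illustrative only.

#include <stdbool.h>
#include <stdint.h>

/* Bit positions as currently defined in hw/virtio/vhost-user.c. */
enum {
    VHOST_USER_PROTOCOL_F_MQ     = 0,
    VHOST_USER_PROTOCOL_F_CONFIG = 9,
    VHOST_USER_PROTOCOL_F_STATUS = 16,
};

/* Illustrative mask: the real list would also include the new messages
 * proposed in this thread. */
#define STANDALONE_REQUIRED_FEATURES              \
    ((1ULL << VHOST_USER_PROTOCOL_F_MQ)     |     \
     (1ULL << VHOST_USER_PROTOCOL_F_CONFIG) |     \
     (1ULL << VHOST_USER_PROTOCOL_F_STATUS))

static inline bool vhost_user_backend_is_standalone(uint64_t features)
{
    /* Treat the back-end as a full VIRTIO device only when every
     * required message family is advertised. */
    return (features & STANDALONE_REQUIRED_FEATURES) ==
           STANDALONE_REQUIRED_FEATURES;
}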


Re: intel-iommu: Report interrupt remapping faults, fix return value

2023-07-26 Thread Peter Xu
On Tue, Jul 25, 2023 at 11:01:16AM +0100, David Woodhouse wrote:
> From: David Woodhouse 
> 
> A generic X86IOMMUClass->int_remap function should not return VT-d
> specific values; fix it to return 0 if the interrupt was successfully
> translated or -EINVAL if not.
> 
> The VTD_FR_IR_xxx values are supposed to be used to actually raise
> faults through the fault reporting mechanism, so do that instead for
> the case where the IRQ is actually being injected.
> 
> There is more work to be done here, as pretranslations for the KVM IRQ
> routing table can't fault; an untranslatable IRQ should be handled in
> userspace and the fault raised only when the IRQ actually happens (if
> indeed the IRTE is still not valid at that time). But we can work on
> that later; we can at least raise faults for the direct case.
> 
> Signed-off-by: David Woodhouse 

Mostly good to me, only still a few trivial comments..

[...]

> @@ -3320,17 +3342,33 @@ static int vtd_irte_get(IntelIOMMUState *iommu, 
> uint16_t index,
>  entry, sizeof(*entry), MEMTXATTRS_UNSPECIFIED)) {
>  error_report_once("%s: read failed: ind=0x%x addr=0x%" PRIx64,
>__func__, index, addr);
> -return -VTD_FR_IR_ROOT_INVAL;
> +if (do_fault) {
> +vtd_report_ir_fault(iommu, sid, VTD_FR_IR_ROOT_INVAL, index);
> +}
> +return false;
>  }
>  
>  trace_vtd_ir_irte_get(index, le64_to_cpu(entry->data[1]),
>le64_to_cpu(entry->data[0]));
>  
> + /*
> +  * The remaining potential fault conditions are "qualified" by the
> +  * Fault Processing Disable bit in the IRTE. Even "not present".
> +  * So just clear the do_fault flag if PFD is set, which will
> +  * prevent faults being raised.
> +  */
> + if (entry->irte.fault_disable) {
> + do_fault = false;

Wrong indent on these lines..

> +}
> +
>  if (!entry->irte.present) {
>  error_report_once("%s: detected non-present IRTE "
>"(index=%u, high=0x%" PRIx64 ", low=0x%" PRIx64 
> ")",
>__func__, index, le64_to_cpu(entry->data[1]),
>le64_to_cpu(entry->data[0]));
> +if (do_fault) {
> +vtd_report_ir_fault(iommu, sid, VTD_FR_IR_ENTRY_P, index);
> +}
>  return -VTD_FR_IR_ENTRY_P;

Forgot to change retval?

>  }

Thanks,

-- 
Peter Xu
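
A minimal, self-contained sketch of the pattern the review comments point at
(stand-in types, not the actual QEMU code): the helper raises the fault itself
while do_fault is still set, and the error path then returns plain false
rather than a -VTD_FR_* value.

#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the QEMU types and helpers quoted in the patch above. */
typedef struct {
    bool fault_disable;   /* FPD bit in the IRTE */
    bool present;
} Irte;

static void report_ir_fault(uint16_t sid, int fault, uint16_t index)
{
    (void)sid; (void)fault; (void)index;   /* would log/raise the fault */
}

#define FR_IR_ENTRY_P 1

static bool irte_check_present(const Irte *irte, uint16_t sid,
                               uint16_t index, bool do_fault)
{
    /* The FPD bit qualifies every remaining fault condition. */
    if (irte->fault_disable) {
        do_fault = false;
    }
    if (!irte->present) {
        if (do_fault) {
            report_ir_fault(sid, FR_IR_ENTRY_P, index);
        }
        return false;   /* no longer "-VTD_FR_IR_ENTRY_P" once the retval is bool */
    }
    return true;
}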




Re: [PATCH v10 09/10] migration: Implement MigrateChannelList to hmp migration flow.

2023-07-26 Thread Het Gala



On 26/07/23 8:25 pm, Daniel P. Berrangé wrote:

On Wed, Jul 26, 2023 at 02:18:32PM +, Het Gala wrote:

Integrate MigrateChannelList with all transport backends
(socket, exec and rdma) for both src and dest migration
endpoints for hmp migration.

Suggested-by: Aravind Retnakaran 
Signed-off-by: Het Gala 
---
  migration/migration-hmp-cmds.c | 15 +--
  migration/migration.c  |  5 ++---
  migration/migration.h  |  3 ++-
  3 files changed, 17 insertions(+), 6 deletions(-)

Reviewed-by: Daniel P. Berrangé 

Daniel, I have got Reviewed-by / Acked-by from Markus (patches 1 & 6) and 
you (remaining patches). What should be the next steps here? Do I need 
to send a new patchset / wait for more maintainers for their reviews? 
Please advise.

With regards,
Daniel

Regards,
Het Gala



Re: [PULL 00/25] Migration 20230726 patches

2023-07-26 Thread Richard Henderson

On 7/26/23 05:14, Juan Quintela wrote:

The following changes since commit 6cb2011fedf8c4e7b66b4a3abd6b42c1bae99ce6:

   Update version for v8.1.0-rc1 release (2023-07-25 20:09:05 +0100)

are available in the Git repository at:

   https://gitlab.com/juan.quintela/qemu.git  
tags/migration-20230726-pull-request

for you to fetch changes up to 697c4c86ab515a728ffb2adc2c3c04b22fa9210f:

   migration/rdma: Split qemu_fopen_rdma() into input/output functions 
(2023-07-26 10:55:56 +0200)


Migration Pull request

Hi

This is the migration PULL request.  It is the same as yesterday's, with
proper PULL headers.
It passes CI. It contains:
- Fabiano Rosas' threadinfo cleanups
- Hyman Huang's dirtylimit changes
- Part of my changes
- Peter Xu's documentation
- Tejus' update to the migration descriptions
- Wei Wang's improvements for postcopy and multifd setup

Please apply.

Thanks, Juan.

---


Applied, thanks.  Please update https://wiki.qemu.org/ChangeLog/8.1 as 
appropriate.


r~




[PATCH] gitlab: remove duplication between msys jobs

2023-07-26 Thread Daniel P . Berrangé
Although they share a common parent, the two msys jobs still have
massive duplication in their script definitions that can easily be
collapsed.

Signed-off-by: Daniel P. Berrangé 
---
 .gitlab-ci.d/windows.yml | 132 +++
 1 file changed, 49 insertions(+), 83 deletions(-)

diff --git a/.gitlab-ci.d/windows.yml b/.gitlab-ci.d/windows.yml
index f889a468b5..f086540e40 100644
--- a/.gitlab-ci.d/windows.yml
+++ b/.gitlab-ci.d/windows.yml
@@ -35,97 +35,63 @@
   - .\msys64\usr\bin\bash -lc 'pacman --noconfirm -Syuu'  # Core update
   - .\msys64\usr\bin\bash -lc 'pacman --noconfirm -Syuu'  # Normal update
   - taskkill /F /FI "MODULES eq msys-2.0.dll"
-
-msys2-64bit:
-  extends: .shared_msys2_builder
   script:
   - .\msys64\usr\bin\bash -lc "pacman -Sy --noconfirm --needed
   bison diffutils flex
   git grep make sed
-  mingw-w64-x86_64-capstone
-  mingw-w64-x86_64-curl
-  mingw-w64-x86_64-cyrus-sasl
-  mingw-w64-x86_64-dtc
-  mingw-w64-x86_64-gcc
-  mingw-w64-x86_64-glib2
-  mingw-w64-x86_64-gnutls
-  mingw-w64-x86_64-gtk3
-  mingw-w64-x86_64-libgcrypt
-  mingw-w64-x86_64-libjpeg-turbo
-  mingw-w64-x86_64-libnfs
-  mingw-w64-x86_64-libpng
-  mingw-w64-x86_64-libssh
-  mingw-w64-x86_64-libtasn1
-  mingw-w64-x86_64-libusb
-  mingw-w64-x86_64-lzo2
-  mingw-w64-x86_64-nettle
-  mingw-w64-x86_64-ninja
-  mingw-w64-x86_64-pixman
-  mingw-w64-x86_64-pkgconf
-  mingw-w64-x86_64-python
-  mingw-w64-x86_64-SDL2
-  mingw-w64-x86_64-SDL2_image
-  mingw-w64-x86_64-snappy
-  mingw-w64-x86_64-spice
-  mingw-w64-x86_64-usbredir
-  mingw-w64-x86_64-zstd "
+  $MINGW_TARGET-capstone
+  $MINGW_TARGET-curl
+  $MINGW_TARGET-cyrus-sasl
+  $MINGW_TARGET-dtc
+  $MINGW_TARGET-gcc
+  $MINGW_TARGET-glib2
+  $MINGW_TARGET-gnutls
+  $MINGW_TARGET-gtk3
+  $MINGW_TARGET-libgcrypt
+  $MINGW_TARGET-libjpeg-turbo
+  $MINGW_TARGET-libnfs
+  $MINGW_TARGET-libpng
+  $MINGW_TARGET-libssh
+  $MINGW_TARGET-libtasn1
+  $MINGW_TARGET-libusb
+  $MINGW_TARGET-lzo2
+  $MINGW_TARGET-nettle
+  $MINGW_TARGET-ninja
+  $MINGW_TARGET-pixman
+  $MINGW_TARGET-pkgconf
+  $MINGW_TARGET-python
+  $MINGW_TARGET-SDL2
+  $MINGW_TARGET-SDL2_image
+  $MINGW_TARGET-snappy
+  $MINGW_TARGET-spice
+  $MINGW_TARGET-usbredir
+  $MINGW_TARGET-zstd "
   - $env:CHERE_INVOKING = 'yes'  # Preserve the current working directory
-  - $env:MSYSTEM = 'MINGW64' # Start a 64-bit MinGW environment
   - $env:MSYS = 'winsymlinks:native' # Enable native Windows symlink
   - mkdir build
   - cd build
-  # Note: do not remove "--without-default-devices"!
-  # commit 9f8e6cad65a6 ("gitlab-ci: Speed up the msys2-64bit job by using 
--without-default-devices"
-  # changed to compile QEMU with the --without-default-devices switch
-  # for the msys2 64-bit job, due to the build could not complete within
-  # the project timeout.
-  - ..\msys64\usr\bin\bash -lc '../configure --target-list=x86_64-softmmu
-  --without-default-devices --enable-fdt=system'
-  - ..\msys64\usr\bin\bash -lc 'make'
-  # qTests don't run successfully with "--without-default-devices",
-  # so let's exclude the qtests from CI for now.
-  - ..\msys64\usr\bin\bash -lc 'make check MTESTARGS=\"--no-suite qtest\" || { 
cat meson-logs/testlog.txt; exit 1; } ;'
+  - ..\msys64\usr\bin\bash -lc "../configure --enable-fdt=system 
$CONFIGURE_ARGS"
+  - ..\msys64\usr\bin\bash -lc "make"
+  - ..\msys64\usr\bin\bash -lc "make check MTESTARGS='$TEST_ARGS' || { cat 
meson-logs/testlog.txt; exit 1; } ;"
+
+msys2-64bit:
+  extends: .shared_msys2_builder
+  variables:
+MINGW_TARGET: mingw-w64-x86_64
+MSYSTEM: MINGW64
+# do not remove "--without-default-devices"!
+# commit 9f8e6cad65a6 ("gitlab-ci: Speed up the msys2-64bit job by using 
--without-default-devices"
+# changed to compile QEMU with the --without-default-devices switch
+# for the msys2 64-bit job, due to the build could not complete within
+CONFIGURE_ARGS:  --target-list=x86_64-softmmu --without-default-devices
+# qTests don't run successfully with "--without-default-devices",
+# so let's exclude the qtests from CI for now.
+TEST_ARGS: --no-suite qtest
 
 msys2-32bit:
   extends: .shared_msys2_builder
-  script:
-  - .\msys64\usr\bin\bash -lc "pacman -Sy --noconfirm --needed
-  bison diffutils flex
-  git grep make sed
-  mingw-w64-i686-capstone
-  mingw-w64-i686-curl
-  mingw-w64-i686-cyrus-sasl
-  mingw-w64-i686-dtc
-  mingw-w64-i686-gcc
-  mingw-w64-i686-glib2
-  mingw-w64-i686-gnutls
-  mingw-w64-i686-gtk3
-  mingw-w64-i686-libgcrypt
-  mingw-w64-i686-libjpeg-turbo
-  mingw-w64-i686-libnfs
-  mingw-w64-i686-libpng
-  mingw-w64-i686-libssh
-  mingw-w64-i686-libtasn1
-  mingw-w64-i686-libusb
-  

Re: [Qemu PATCH v2 5/9] hw/mem/cxl_type3: Add host backend and address space handling for DC regions

2023-07-26 Thread ni...@outlook.com
The 07/26/2023 07:53, Nathan Fontenot wrote:
> On 7/25/23 13:39, Fan Ni wrote:
> > From: Fan Ni 
> > 
> > Add (file/memory backed) host backend, all the dynamic capacity regions
> > will share a single, large enough host backend. Set up address space for
> > DC regions to support read/write operations to dynamic capacity for DCD.
> > 
> > With the change, following supports are added:
> > 1. add a new property to type3 device "nonvolatile-dc-memdev" to point to 
> > host
> >memory backend for dynamic capacity;
> > 2. add namespace for dynamic capacity for read/write support;
> > 3. create cdat entries for each dynamic capacity region;
> > 4. fix dvsec range registers to include DC regions.
> > 
> > Signed-off-by: Fan Ni 
> > ---
> >  hw/cxl/cxl-mailbox-utils.c  |  19 +++-
> >  hw/mem/cxl_type3.c  | 203 +---
> >  include/hw/cxl/cxl_device.h |   4 +
> >  3 files changed, 185 insertions(+), 41 deletions(-)
> > 
> > diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
> > index dd5ea95af8..0511b8e6f7 100644
> > --- a/hw/cxl/cxl-mailbox-utils.c
> > +++ b/hw/cxl/cxl-mailbox-utils.c
> > @@ -388,9 +388,11 @@ static CXLRetCode cmd_firmware_update_get_info(struct 
> > cxl_cmd *cmd,
> >  char fw_rev4[0x10];
> >  } QEMU_PACKED *fw_info;
> >  QEMU_BUILD_BUG_ON(sizeof(*fw_info) != 0x50);
> > +CXLType3Dev *ct3d = container_of(cxl_dstate, CXLType3Dev, cxl_dstate);
> >  
> >  if ((cxl_dstate->vmem_size < CXL_CAPACITY_MULTIPLIER) ||
> > -(cxl_dstate->pmem_size < CXL_CAPACITY_MULTIPLIER)) {
> > +(cxl_dstate->pmem_size < CXL_CAPACITY_MULTIPLIER) ||
> > +(ct3d->dc.total_capacity < CXL_CAPACITY_MULTIPLIER)) {
> >  return CXL_MBOX_INTERNAL_ERROR;
> >  }
> >  
> > @@ -531,7 +533,8 @@ static CXLRetCode cmd_identify_memory_device(struct 
> > cxl_cmd *cmd,
> >  CXLType3Class *cvc = CXL_TYPE3_GET_CLASS(ct3d);
> >  
> >  if ((!QEMU_IS_ALIGNED(cxl_dstate->vmem_size, CXL_CAPACITY_MULTIPLIER)) 
> > ||
> > -(!QEMU_IS_ALIGNED(cxl_dstate->pmem_size, 
> > CXL_CAPACITY_MULTIPLIER))) {
> > +(!QEMU_IS_ALIGNED(cxl_dstate->pmem_size, CXL_CAPACITY_MULTIPLIER)) 
> > ||
> > +(!QEMU_IS_ALIGNED(ct3d->dc.total_capacity, 
> > CXL_CAPACITY_MULTIPLIER))) {
> >  return CXL_MBOX_INTERNAL_ERROR;
> >  }
> >  
> > @@ -566,9 +569,11 @@ static CXLRetCode cmd_ccls_get_partition_info(struct 
> > cxl_cmd *cmd,
> >  uint64_t next_pmem;
> >  } QEMU_PACKED *part_info = (void *)cmd->payload;
> >  QEMU_BUILD_BUG_ON(sizeof(*part_info) != 0x20);
> > +CXLType3Dev *ct3d = container_of(cxl_dstate, CXLType3Dev, cxl_dstate);
> >  
> >  if ((!QEMU_IS_ALIGNED(cxl_dstate->vmem_size, CXL_CAPACITY_MULTIPLIER)) 
> > ||
> > -(!QEMU_IS_ALIGNED(cxl_dstate->pmem_size, 
> > CXL_CAPACITY_MULTIPLIER))) {
> > +(!QEMU_IS_ALIGNED(cxl_dstate->pmem_size, CXL_CAPACITY_MULTIPLIER)) 
> > ||
> > +(!QEMU_IS_ALIGNED(ct3d->dc.total_capacity, 
> > CXL_CAPACITY_MULTIPLIER))) {
> >  return CXL_MBOX_INTERNAL_ERROR;
> >  }
> >  
> > @@ -880,7 +885,13 @@ static CXLRetCode cmd_media_clear_poison(struct 
> > cxl_cmd *cmd,
> >  struct clear_poison_pl *in = (void *)cmd->payload;
> >  
> >  dpa = ldq_le_p(&in->dpa);
> > -if (dpa + CXL_CACHE_LINE_SIZE > cxl_dstate->static_mem_size) {
> > +if (dpa + CXL_CACHE_LINE_SIZE >= cxl_dstate->static_mem_size
> > +&& ct3d->dc.num_regions == 0) {
> > +return CXL_MBOX_INVALID_PA;
> > +}
> > +
> > +if (ct3d->dc.num_regions && dpa + CXL_CACHE_LINE_SIZE >=
> > +cxl_dstate->static_mem_size + ct3d->dc.total_capacity) {
> >  return CXL_MBOX_INVALID_PA;
> >  }
> >  
> > diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
> > index b29bb2309a..76bbd9f785 100644
> > --- a/hw/mem/cxl_type3.c
> > +++ b/hw/mem/cxl_type3.c
> > @@ -20,6 +20,7 @@
> >  #include "hw/pci/spdm.h"
> >  
> >  #define DWORD_BYTE 4
> > +#define CXL_CAPACITY_MULTIPLIER   (256 * MiB)
> >  
> >  /* Default CDAT entries for a memory region */
> >  enum {
> > @@ -33,8 +34,8 @@ enum {
> >  };
> >  
> >  static int ct3_build_cdat_entries_for_mr(CDATSubHeader **cdat_table,
> > - int dsmad_handle, MemoryRegion 
> > *mr,
> > - bool is_pmem, uint64_t dpa_base)
> > +int dsmad_handle, uint8_t flags,
> > +uint64_t dpa_base, uint64_t size)
> >  {
> >  g_autofree CDATDsmas *dsmas = NULL;
> >  g_autofree CDATDslbis *dslbis0 = NULL;
> > @@ -53,9 +54,9 @@ static int ct3_build_cdat_entries_for_mr(CDATSubHeader 
> > **cdat_table,
> >  .length = sizeof(*dsmas),
> >  },
> >  .DSMADhandle = dsmad_handle,
> > -.flags = is_pmem ? CDAT_DSMAS_FLAG_NV : 0,
> > +.flags = flags,
> >  .DPA_base = dpa_base,
> > -.DPA_length = memory_region_size(mr),
> > +.DPA_length = size,
> >  };
> >  
> 

Re: [PULL 0/2] Misc next patches

2023-07-26 Thread Richard Henderson

On 7/25/23 09:23, Daniel P. Berrangé wrote:

The following changes since commit a279ca4ea07383314b2d2b2f1d550be9482f148e:

   Merge tag 'pull-target-arm-20230725' 
ofhttps://git.linaro.org/people/pmaydell/qemu-arm  into staging (2023-07-25 
12:44:39 +0100)

are available in the Git repository at:

   https://gitlab.com/berrange/qemu  tags/misc-next-pull-request

for you to fetch changes up to 095be0910b89b5d156e20641bd65ac6cab3f8305:

   hw/usb/canokey: change license to GPLv2+ (2023-07-25 17:15:59 +0100)


Miscellaneous fixes

  * Switch canokey device license from Apache to GPLv2+
  * Fix uninitialized variable warning in LUKS code


Applied, thanks.  Please update https://wiki.qemu.org/ChangeLog/8.1 as 
appropriate.


r~




Re: [PULL v2 0/5] QAPI patches patches for 2023-07-26

2023-07-26 Thread Richard Henderson

On 7/26/23 05:57, Markus Armbruster wrote:

The following changes since commit 6cb2011fedf8c4e7b66b4a3abd6b42c1bae99ce6:

   Update version for v8.1.0-rc1 release (2023-07-25 20:09:05 +0100)

are available in the Git repository at:

   https://repo.or.cz/qemu/armbru.git  tags/pull-qapi-2023-07-26-v2

for you to fetch changes up to 9e272073e1c41acb3ba1e43b69c7a3f9c26089c2:

   qapi: Reformat recent doc comments to conform to current conventions 
(2023-07-26 14:51:36 +0200)


QAPI patches patches for 2023-07-26


The patches affect only documentation.  Generated code does not change.

Markus Armbruster (5):
   qapi/block-core: Tidy up BlockLatencyHistogramInfo documentation
   qapi/block: Tidy up block-latency-histogram-set documentation
   qapi/qdev: Tidy up device_add documentation
   qapi/trace: Tidy up trace-event-get-state, -set-state documentation
   qapi: Reformat recent doc comments to conform to current conventions


Applied, thanks.  Please update https://wiki.qemu.org/ChangeLog/8.1 as 
appropriate.


r~




Re: [virtio-dev] [RFC PATCH] docs/interop: define STANDALONE protocol feature for vhost-user

2023-07-26 Thread Michael S. Tsirkin
On Thu, Jul 20, 2023 at 03:36:01PM -0400, Stefan Hajnoczi wrote:
> On Fri, Jul 07, 2023 at 12:27:39PM +0200, Stefano Garzarella wrote:
> > On Tue, Jul 04, 2023 at 04:02:42PM +0100, Alex Bennée wrote:
> > > 
> > > Stefano Garzarella  writes:
> > > 
> > > > On Tue, Jul 04, 2023 at 01:36:00PM +0100, Alex Bennée wrote:
> > > > > diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
> > > > > index 5a070adbc1..85b1b1583a 100644
> > > > > --- a/docs/interop/vhost-user.rst
> > > > > +++ b/docs/interop/vhost-user.rst
> > > > > @@ -275,6 +275,21 @@ Inflight description
> > > > > 
> > > > > :queue size: a 16-bit size of virtqueues
> > > > > 
> > > > > +Backend specifications
> > > > > +^^
> > > > > +
> > > > > ++---+-+++
> > > > > +| device id | config size |   min_vqs  |   max_vqs  |
> > > > > ++---+-+++
> > > > > +
> > > > > +:device id: a 32-bit value holding the VirtIO device ID
> > > > > +
> > > > > +:config size: a 32-bit value holding the config size (see 
> > > > > ``VHOST_USER_GET_CONFIG``)
> > > > > +
> > > > > +:min_vqs: a 32-bit value holding the minimum number of vqs supported
> > > > 
> > > > Why do we need the minimum?
> > > 
> > > We need to know the minimum number because some devices have fixed VQs
> > > that must be present.
> > 
> > But does QEMU need to know this?
> > 
> > Or is it okay that the driver will then fail in the guest if there
> > are not the right number of queues?
> 
> I don't understand why min_vqs is needed either. It's not the
> front-end's job to ensure that the device will be used properly. A
> spec-compliant driver will work with a spec-compliant device, so it's
> not clear why the front-end needs this information.
> 
> Stefan



I think this really demonstrates why we should keep separate
messages and not the "standalone" thing which seems to
mean "bundle a bunch of mostly unrelated stuff in one message":
this way each field is carefully documented.


-- 
MST




Re: [virtio-dev] [RFC PATCH] docs/interop: define STANDALONE protocol feature for vhost-user

2023-07-26 Thread Stefan Hajnoczi
On Wed, 26 Jul 2023 at 11:42, Erik Schilling  wrote:
>
> On Tue Jul 4, 2023 at 4:54 PM CEST, Stefano Garzarella wrote:
> > On Tue, Jul 04, 2023 at 01:36:00PM +0100, Alex Bennée wrote:
> > >Currently QEMU has to know some details about the back-end to be able
> > >to setup the guest. While various parts of the setup can be delegated
> > >to the backend (for example config handling) this is a very piecemeal
> > >approach.
> > >
> > >This patch suggests a new feature flag (VHOST_USER_PROTOCOL_F_STANDALONE)
> > >which the back-end can advertise which allows a probe message to be
> > >sent to get all the details QEMU needs to know in one message.
> > >
> > >Signed-off-by: Alex Bennée 
> > >
> > >---
> > >Initial RFC for discussion. I intend to prototype this work with QEMU
> > >and one of the rust-vmm vhost-user daemons.
> >
> > Thanks for starting this discussion!
> >
> > I'm comparing with vhost-vdpa IOCTLs, so my questions may be
> > superficial, but they help me understand the differences.
> >
> > >---
> > > docs/interop/vhost-user.rst | 37 +
> > > hw/virtio/vhost-user.c  |  8 
> > > 2 files changed, 45 insertions(+)
> > >
> > >diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
> > >index 5a070adbc1..85b1b1583a 100644
> > >--- a/docs/interop/vhost-user.rst
> > >+++ b/docs/interop/vhost-user.rst
> > >@@ -275,6 +275,21 @@ Inflight description
> > >
> > > :queue size: a 16-bit size of virtqueues
> > >
> > >+Backend specifications
> > >+^^
> > >+
> > >++---+-+++
> > >+| device id | config size |   min_vqs  |   max_vqs  |
> > >++---+-+++
> > >+
> > >+:device id: a 32-bit value holding the VirtIO device ID
> > >+
> > >+:config size: a 32-bit value holding the config size (see 
> > >``VHOST_USER_GET_CONFIG``)
> > >+
> > >+:min_vqs: a 32-bit value holding the minimum number of vqs supported
> >
> > Why do we need the minimum?
> >
> > >+
> > >+:max_vqs: a 32-bit value holding the maximum number of vqs supported, 
> > >must be >= min_vqs
> >
> > Is this overlap with VHOST_USER_GET_QUEUE_NUM?
>
> While fiddling with a rust-vmm implementation of this I wondered:
>
> Would a standalone daemon even need VHOST_USER_PROTOCOL_F_MQ if
> VHOST_USER_PROTOCOL_F_CONFIG is required anyway? It looks like
> all full virtio devices provide config information that allows to
> derive the exact or maximum number of queues already. So wouldn't
> VHOST_USER_PROTOCOL_F_MQ just provide a second, redundant way to get
> the information? (And this would be a third?)
>
> Am I missing something here? Otherwise, I think we could drop dependency
> on VHOST_USER_PROTOCOL_F_MQ and get the info from the device config for
> standalone daemons?

vhost (in general and that includes vhost-user) was not designed to be
a full VIRTIO device, just an interface for offloading specific
virtqueues. Now that vhost-user is being extended to support full
VIRTIO devices ("standalone"), some of the vhost-user protocol
features are redundant.

However, actually dropping the redundant features must be done
carefully because they might be used in ways that you don't expect by
existing vhost-user implementations. For example, maybe existing
vhost-user-net code doesn't look at the VIRTIO Configuration Space
field because it queries the number of virtqueues via
VHOST_USER_PROTOCOL_F_MQ + VHOST_USER_GET_QUEUE_NUM. So you might be
forced to keep them for compatibility.

I think the answer to your questions is: yes, but carefully because
you might break existing implementations or make it hard for them to
remain compatible.

Stefan



Re: [RFC PATCH] docs/interop: define STANDALONE protocol feature for vhost-user

2023-07-26 Thread Michael S. Tsirkin
On Mon, Jul 24, 2023 at 02:08:39PM -0400, Stefan Hajnoczi wrote:
> On Thu, Jul 20, 2023 at 06:22:08PM -0400, Michael S. Tsirkin wrote:
> > On Thu, Jul 20, 2023 at 05:31:03PM -0400, Stefan Hajnoczi wrote:
> > > On Thu, 20 Jul 2023 at 17:15, Michael S. Tsirkin  wrote:
> > > >
> > > > On Thu, Jul 20, 2023 at 03:58:37PM -0400, Stefan Hajnoczi wrote:
> > > > > On Thu, Jul 06, 2023 at 12:48:20PM -0400, Michael S. Tsirkin wrote:
> > > > > > On Tue, Jul 04, 2023 at 01:36:00PM +0100, Alex Bennée wrote:
> > > > > > > Currently QEMU has to know some details about the back-end to be 
> > > > > > > able
> > > > > > > to setup the guest. While various parts of the setup can be 
> > > > > > > delegated
> > > > > > > to the backend (for example config handling) this is a very 
> > > > > > > piecemeal
> > > > > > > approach.
> > > > > >
> > > > > > > This patch suggests a new feature flag 
> > > > > > > (VHOST_USER_PROTOCOL_F_STANDALONE)
> > > > > > > which the back-end can advertise which allows a probe message to 
> > > > > > > be
> > > > > > > sent to get all the details QEMU needs to know in one message.
> > > > > >
> > > > > > The reason we do piecemeal is that these existing pieces can be 
> > > > > > reused
> > > > > > as others evolve or fall by wayside.
> > > > > >
> > > > > > For example, I can think of instances where you want to connect
> > > > > > specifically to e.g. networking backend, and specify it
> > > > > > on command line. Reasons could be many, e.g. for debugging,
> > > > > > or to prevent connecting to wrong device on wrong channel
> > > > > > (kind of like type safety).
> > > > > >
> > > > > > What is the reason to have 1 message? startup latency?
> > > > > > How about we allow pipelining several messages then?
> > > > > > Will be easier.
> > > > >
> > > > > This flag effectively says that the back-end is a full VIRTIO device
> > > > > with a Device Status Register, Configuration Space, Virtqueues, the
> > > > > device type, etc. This is different from previous vhost-user devices
> > > > > which sometimes just offloaded certain virtqueues without providing 
> > > > > the
> > > > > full VIRTIO device (parts were emulated in the VMM).
> > > > >
> > > > > So for example, a vhost-user-net device does not support the controlq.
> > > > > Alex's "standalone" device is a mode where the vhost-user protocol is
> > > > > used but the back-end must implement a full virtio-net device.
> > > > > Standalone devices are like vDPA device in this respect.
> > > > >
> > > > > I think it is important to have a protocol feature bit that advertises
> > > > > that this is a standalone device, since the semantics are different 
> > > > > for
> > > > > traditional vhost-user-net devices.
> > > >
> > > > Not sure what that would gain as compared to a feature bit per
> > > > message as we did previously.
> > > 
> > > Having a single feature bit makes it easier to distinguish between a
> > > traditional vhost-user device and a standalone device.
> > > 
> > > For example, the presence of VHOST_USER_F_GET_DEVICE_ID doesn't tell
> > > you whether this device is a standalone device that is appropriate for
> > > a new generic QEMU --device vhost-user-device feature that Alex is
> > > working on. It could be a traditional vhost-user device that is not
> > > standalone but implements the VHOST_USER_GET_DEVICE_ID message.
> > > 
> > > How will we detect standalone devices? It will be messy if there is no
> > > single feature bit that advertises that this back-end is a standalone
> > > device.
> > > 
> > > Stefan
> > 
> > Looks like standalone implies some 5-6 messages to be supported.
> > So just test the 6 bits are all ones.
> 
> It's not clear to me that the individual bits together mean this is
> really a standalone device, but let's go with individual commands and
> see if a front-end can distinguish standalone devices or not. If not,
> then we can still add "standalone" feature bit before merging the code.
> 
> Stefan


I think it just shows that what a "standalone" device is just isn't
that well defined ;).

-- 
MST




[PATCH 2/3] Rework i.MX7 device implementation/instantiation

2023-07-26 Thread Jean-Christophe Dubois
From: jcdubois 

* Add Addr and size definition for all i.MX7 devices in i.MX7 header file.
* Use those newly defined named constants whenever possible.
* Standardize the way we init a family of unimplemented devices
  - SAI
  - PWM
  - CAN
* Add TZASC as unimplemented device.
  - Allow bare metal application to access this (unimplemented) device
* Add CSU as unimplemented device.
  - Allow bare metal application to access this (unimplemented) device
* Add various memory segments
  - OCRAM
  - OCRAM EPDC
  - OCRAM PXP
  - OCRAM S
  - ROM
  - CAAM
* Add/rework few comments

Signed-off-by: jcdubois 
---
 hw/arm/fsl-imx7.c | 197 +-
 include/hw/arm/fsl-imx7.h | 346 +-
 2 files changed, 419 insertions(+), 124 deletions(-)

diff --git a/hw/arm/fsl-imx7.c b/hw/arm/fsl-imx7.c
index 9e41d4b677..05cdd4831e 100644
--- a/hw/arm/fsl-imx7.c
+++ b/hw/arm/fsl-imx7.c
@@ -36,6 +36,9 @@ static void fsl_imx7_init(Object *obj)
 char name[NAME_SIZE];
 int i;
 
+/*
+ * CPUs
+ */
 for (i = 0; i < MIN(ms->smp.cpus, FSL_IMX7_NUM_CPUS); i++) {
 snprintf(name, NAME_SIZE, "cpu%d", i);
object_initialize_child(obj, name, &s->cpu[i],
@@ -49,7 +52,7 @@ static void fsl_imx7_init(Object *obj)
 TYPE_A15MPCORE_PRIV);
 
 /*
- * GPIOs 1 to 7
+ * GPIOs
  */
 for (i = 0; i < FSL_IMX7_NUM_GPIOS; i++) {
 snprintf(name, NAME_SIZE, "gpio%d", i);
@@ -57,7 +60,7 @@ static void fsl_imx7_init(Object *obj)
 }
 
 /*
- * GPT1, 2, 3, 4
+ * GPTs
  */
 for (i = 0; i < FSL_IMX7_NUM_GPTS; i++) {
 snprintf(name, NAME_SIZE, "gpt%d", i);
@@ -79,19 +82,24 @@ static void fsl_imx7_init(Object *obj)
  */
 object_initialize_child(obj, "gpcv2", >gpcv2, TYPE_IMX_GPCV2);
 
+/*
+ * ECSPIs
+ */
 for (i = 0; i < FSL_IMX7_NUM_ECSPIS; i++) {
 snprintf(name, NAME_SIZE, "spi%d", i + 1);
object_initialize_child(obj, name, &s->spi[i], TYPE_IMX_SPI);
 }
 
-
+/*
+ * I2Cs
+ */
 for (i = 0; i < FSL_IMX7_NUM_I2CS; i++) {
 snprintf(name, NAME_SIZE, "i2c%d", i + 1);
object_initialize_child(obj, name, &s->i2c[i], TYPE_IMX_I2C);
 }
 
 /*
- * UART
+ * UARTs
  */
 for (i = 0; i < FSL_IMX7_NUM_UARTS; i++) {
 snprintf(name, NAME_SIZE, "uart%d", i);
@@ -99,7 +107,7 @@ static void fsl_imx7_init(Object *obj)
 }
 
 /*
- * Ethernet
+ * Ethernets
  */
 for (i = 0; i < FSL_IMX7_NUM_ETHS; i++) {
 snprintf(name, NAME_SIZE, "eth%d", i);
@@ -107,7 +115,7 @@ static void fsl_imx7_init(Object *obj)
 }
 
 /*
- * SDHCI
+ * SDHCIs
  */
 for (i = 0; i < FSL_IMX7_NUM_USDHCS; i++) {
 snprintf(name, NAME_SIZE, "usdhc%d", i);
@@ -120,7 +128,7 @@ static void fsl_imx7_init(Object *obj)
object_initialize_child(obj, "snvs", &s->snvs, TYPE_IMX7_SNVS);
 
 /*
- * Watchdog
+ * Watchdogs
  */
 for (i = 0; i < FSL_IMX7_NUM_WDTS; i++) {
 snprintf(name, NAME_SIZE, "wdt%d", i);
@@ -132,8 +140,14 @@ static void fsl_imx7_init(Object *obj)
  */
object_initialize_child(obj, "gpr", &s->gpr, TYPE_IMX7_GPR);
 
+/*
+ * PCIE
+ */
object_initialize_child(obj, "pcie", &s->pcie, TYPE_DESIGNWARE_PCIE_HOST);
 
+/*
+ * USBs
+ */
 for (i = 0; i < FSL_IMX7_NUM_USBS; i++) {
 snprintf(name, NAME_SIZE, "usb%d", i);
object_initialize_child(obj, name, &s->usb[i], TYPE_CHIPIDEA);
@@ -156,6 +170,9 @@ static void fsl_imx7_realize(DeviceState *dev, Error **errp)
 return;
 }
 
+/*
+ * CPUs
+ */
 for (i = 0; i < smp_cpus; i++) {
o = OBJECT(&s->cpu[i]);
 
@@ -206,10 +223,10 @@ static void fsl_imx7_realize(DeviceState *dev, Error 
**errp)
  * A7MPCORE DAP
  */
 create_unimplemented_device("a7mpcore-dap", FSL_IMX7_A7MPCORE_DAP_ADDR,
-0x10);
+FSL_IMX7_A7MPCORE_DAP_SIZE);
 
 /*
- * GPT1, 2, 3, 4
+ * GPTs
  */
 for (i = 0; i < FSL_IMX7_NUM_GPTS; i++) {
 static const hwaddr FSL_IMX7_GPTn_ADDR[FSL_IMX7_NUM_GPTS] = {
@@ -234,6 +251,9 @@ static void fsl_imx7_realize(DeviceState *dev, Error **errp)
 FSL_IMX7_GPTn_IRQ[i]));
 }
 
+/*
+ * GPIOs
+ */
 for (i = 0; i < FSL_IMX7_NUM_GPIOS; i++) {
 static const hwaddr FSL_IMX7_GPIOn_ADDR[FSL_IMX7_NUM_GPIOS] = {
 FSL_IMX7_GPIO1_ADDR,
@@ -279,18 +299,14 @@ static void fsl_imx7_realize(DeviceState *dev, Error 
**errp)
 }
 
 /*
- * IOMUXC and IOMUXC_LPSR
+ * IOMUXC, IOMUXC_GPR and IOMUXC_LPSR
  */
-for (i = 0; i < FSL_IMX7_NUM_IOMUXCS; i++) {
-static const hwaddr FSL_IMX7_IOMUXCn_ADDR[FSL_IMX7_NUM_IOMUXCS] = {
-FSL_IMX7_IOMUXC_ADDR,
-FSL_IMX7_IOMUXC_LPSR_ADDR,
-};
-
-snprintf(name, 

Re: [PATCH 1/2] block/blkio: fix opening virtio-blk drivers

2023-07-26 Thread Stefano Garzarella

On Wed, Jul 26, 2023 at 11:32:10AM -0400, Stefan Hajnoczi wrote:

On Wed, Jul 26, 2023 at 09:26:45AM +0200, Stefano Garzarella wrote:

On Tue, Jul 25, 2023 at 04:00:38PM -0400, Stefan Hajnoczi wrote:
> On Mon, Jul 24, 2023 at 05:46:10PM +0200, Stefano Garzarella wrote:
> > libblkio 1.3.0 added support of "fd" property for virtio-blk-vhost-vdpa
> > driver. In QEMU, starting from commit cad2ccc395 ("block/blkio: use
> > qemu_open() to support fd passing for virtio-blk") we are using
> > `blkio_get_int(..., "fd")` to check if the "fd" property is supported
> > for all the virtio-blk-* driver.
> >
> > Unfortunately that property is also available for those driver that do
> > not support it, such as virtio-blk-vhost-user. Indeed now QEMU is
> > failing if used with virtio-blk-vhost-user in this way:
> >
> >-blockdev 
node-name=drive0,driver=virtio-blk-vhost-user,path=vhost-user-blk.sock,cache.direct=on:
 Could not open 'vhost-user-blk.sock': No such device or address
> >
> > So, `blkio_get_int()` is not enough to check whether the driver supports
> > the `fd` property or not. This is because the virtio-blk common libblkio
> > driver only checks whether or not `fd` is set during `blkio_connect()`
> > and fails for those transports that do not support it (all except
> > vhost-vdpa for now).
> >
> > So for now let's also check that the driver is virtio-blk-vhost-vdpa,
> > since that's the only one that supports it.
>
> What happens when more virtio-blk-* libblkio drivers gain support for
> `fd`? I think we'll be back to the same problem because QEMU will be
> unable to distinguish between old and new libraries.

If we release a v1.3.1 version of libblkio with
https://gitlab.com/libblkio/libblkio/-/merge_requests/208
we can set a minimum requirement in QEMU and use blkio_set_fd() here.

>
> How about retrying with `path` if opening with `fd` fails?

IIUC the only way is to check if blkio_connect() will fail with -EINVAL,
that can also be generated by other issues, then retry forcing `path`.

Do you see other ways?


No. Checking for -EINVAL and then retrying with `path` is what I had in
mind.


The code wouldn't be great, but we could do it for now and then when
we release a new version of libblkio, do the revert and use
blkio_set_int(fd) to see if it's supported or not.

I don't know if it is worth it, or if it is better to merge this,
release libblkio v1.3.1 and force the minimum requirement.

WDYT?


I prefer retrying on -EINVAL because even if libblkio 1.3.1 is released
today, we don't have control over when it becomes available in distros.


Fair point!

I'll send v2 using that approach!

Thanks,
Stefano
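
For reference while v2 is being prepared, a rough sketch of the fallback
discussed above, using public libblkio calls; the structure of the actual
QEMU patch may differ, and recreating the instance after a failed connect is
an assumption on my part.

#include <blkio.h>
#include <errno.h>

/* Try fd passing first; if the transport rejects it at connect time with
 * -EINVAL, assume "fd" is unsupported, recreate the instance and retry
 * with "path" instead. */
static int connect_fd_then_path(struct blkio **b, const char *driver,
                                const char *path, int fd)
{
    int ret = blkio_set_int(*b, "fd", fd);

    if (ret == 0) {
        ret = blkio_connect(*b);
        if (ret == 0 || ret != -EINVAL) {
            return ret;               /* success, or a real failure */
        }
        blkio_destroy(b);             /* -EINVAL: start over without fd */
        ret = blkio_create(driver, b);
        if (ret < 0) {
            return ret;
        }
    }

    ret = blkio_set_str(*b, "path", path);
    if (ret < 0) {
        return ret;
    }
    return blkio_connect(*b);
}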




[PATCH 3/3] Add i.MX7 SRC device implementation

2023-07-26 Thread Jean-Christophe Dubois
From: jcdubois 

The SRC device is normally used to start the secondary CPU.

When running Linux directly, Qemu emulates a PSCI interface that U-Boot
installs at boot time, so the fact that the SRC device is unimplemented
is hidden: Qemu responds directly to PSCI requests without using the SRC
device.

But if you try to run a more bare-metal application (maybe U-Boot itself),
then it is not possible to start the secondary CPU, as the SRC is an
unimplemented device.

This patch adds the ability to start the secondary CPU through the SRC
device so that you can use this feature in bare-metal applications.

Signed-off-by: jcdubois 
---
 hw/arm/fsl-imx7.c  |   9 +-
 hw/misc/imx7_src.c | 289 +
 hw/misc/meson.build|   1 +
 include/hw/arm/fsl-imx7.h  |   2 +
 include/hw/misc/imx7_src.h |  68 +
 5 files changed, 368 insertions(+), 1 deletion(-)
 create mode 100644 hw/misc/imx7_src.c
 create mode 100644 include/hw/misc/imx7_src.h

diff --git a/hw/arm/fsl-imx7.c b/hw/arm/fsl-imx7.c
index 05cdd4831e..db103069e1 100644
--- a/hw/arm/fsl-imx7.c
+++ b/hw/arm/fsl-imx7.c
@@ -82,6 +82,11 @@ static void fsl_imx7_init(Object *obj)
  */
object_initialize_child(obj, "gpcv2", &s->gpcv2, TYPE_IMX_GPCV2);
 
+/*
+ * SRC
+ */
+object_initialize_child(obj, "src", &s->src, TYPE_IMX7_SRC);
+
 /*
  * ECSPIs
  */
@@ -90,6 +95,7 @@ static void fsl_imx7_init(Object *obj)
object_initialize_child(obj, name, &s->spi[i], TYPE_IMX_SPI);
 }
 
+
 /*
  * I2Cs
  */
@@ -490,7 +496,8 @@ static void fsl_imx7_realize(DeviceState *dev, Error **errp)
 /*
  * SRC
  */
-create_unimplemented_device("src", FSL_IMX7_SRC_ADDR, FSL_IMX7_SRC_SIZE);
+sysbus_realize(SYS_BUS_DEVICE(&s->src), &error_abort);
+sysbus_mmio_map(SYS_BUS_DEVICE(&s->src), 0, FSL_IMX7_SRC_ADDR);
 
 /*
  * Watchdogs
diff --git a/hw/misc/imx7_src.c b/hw/misc/imx7_src.c
new file mode 100644
index 00..b1b7d11e8f
--- /dev/null
+++ b/hw/misc/imx7_src.c
@@ -0,0 +1,289 @@
+/*
+ * IMX7 System Reset Controller
+ *
+ * Copyright (c) 2023 Jean-Christophe Dubois 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "hw/misc/imx7_src.h"
+#include "migration/vmstate.h"
+#include "qemu/bitops.h"
+#include "qemu/log.h"
+#include "qemu/main-loop.h"
+#include "qemu/module.h"
+#include "target/arm/arm-powerctl.h"
+#include "hw/core/cpu.h"
+
+#define DEBUG_IMX7_SRC 1
+#ifndef DEBUG_IMX7_SRC
+#define DEBUG_IMX7_SRC 0
+#endif
+
+#define DPRINTF(fmt, args...) \
+do { \
+if (DEBUG_IMX7_SRC) { \
+fprintf(stderr, "[%s]%s: " fmt , TYPE_IMX7_SRC, \
+ __func__, ##args); \
+} \
+} while (0)
+
+static const char *imx7_src_reg_name(uint32_t reg)
+{
+static char unknown[20];
+
+switch (reg) {
+case SRC_SCR:
+return "SRC_SCR";
+case SRC_A7RCR0:
+return "SRC_A7RCR0";
+case SRC_A7RCR1:
+return "SRC_A7RCR1";
+case SRC_M4RCR:
+return "SRC_M4RCR";
+case SRC_ERCR:
+return "SRC_ERCR";
+case SRC_HSICPHY_RCR:
+return "SRC_HSICPHY_RCR";
+case SRC_USBOPHY1_RCR:
+return "SRC_USBOPHY1_RCR";
+case SRC_USBOPHY2_RCR:
+return "SRC_USBOPHY2_RCR";
+case SRC_PCIEPHY_RCR:
+return "SRC_PCIEPHY_RCR";
+case SRC_SBMR1:
+return "SRC_SBMR1";
+case SRC_SRSR:
+return "SRC_SRSR";
+case SRC_SISR:
+return "SRC_SISR";
+case SRC_SIMR:
+return "SRC_SIMR";
+case SRC_SBMR2:
+return "SRC_SBMR2";
+case SRC_GPR1:
+return "SRC_GPR1";
+case SRC_GPR2:
+return "SRC_GPR2";
+case SRC_GPR3:
+return "SRC_GPR3";
+case SRC_GPR4:
+return "SRC_GPR4";
+case SRC_GPR5:
+return "SRC_GPR5";
+case SRC_GPR6:
+return "SRC_GPR6";
+case SRC_GPR7:
+return "SRC_GPR7";
+case SRC_GPR8:
+return "SRC_GPR8";
+case SRC_GPR9:
+return "SRC_GPR9";
+case SRC_GPR10:
+return "SRC_GPR10";
+default:
+sprintf(unknown, "%u ?", reg);
+return unknown;
+}
+}
+
+static const VMStateDescription vmstate_imx7_src = {
+.name = TYPE_IMX7_SRC,
+.version_id = 1,
+.minimum_version_id = 1,
+.fields = (VMStateField[]) {
+VMSTATE_UINT32_ARRAY(regs, IMX7SRCState, SRC_MAX),
+VMSTATE_END_OF_LIST()
+},
+};
+
+static void imx7_src_reset(DeviceState *dev)
+{
+IMX7SRCState *s = IMX7_SRC(dev);
+
+DPRINTF("\n");
+
+memset(s->regs, 0, sizeof(s->regs));
+
+/* Set reset values */
+s->regs[SRC_SCR] = 0xA0;
+s->regs[SRC_SRSR] = 0x1;
+s->regs[SRC_SIMR] = 0x1F;
+}
+
+static uint64_t imx7_src_read(void *opaque, hwaddr offset, unsigned size)
+{
+uint32_t value = 0;
+IMX7SRCState *s = 

[PATCH 1/3] Rework i.MX6UL device implementation/instantiation

2023-07-26 Thread Jean-Christophe Dubois
From: jcdubois 

* Add Addr and size definition for all i.MX6UL devices in i.MX6UL header file.
* Use those newly defined named constants whenever possible.
* Standardize the way we init a family of unimplemented devices
  - SAI
  - PWM (add missing PWM instances)
  - CAN
* Add TZASC as unimplemented device.
  - Allow bare metal application to access this (unimplemented) device
* Add CSU as unimplemented device.
  - Allow bare metal application to access this (unimplemented) device
* Change CAAM specific memory from ROM to RAM.
* Add/rework few comments

Signed-off-by: jcdubois 
---
 hw/arm/fsl-imx6ul.c | 163 +---
 include/hw/arm/fsl-imx6ul.h | 149 +---
 2 files changed, 252 insertions(+), 60 deletions(-)

diff --git a/hw/arm/fsl-imx6ul.c b/hw/arm/fsl-imx6ul.c
index 2189dcbb72..75aaf2adb4 100644
--- a/hw/arm/fsl-imx6ul.c
+++ b/hw/arm/fsl-imx6ul.c
@@ -69,7 +69,7 @@ static void fsl_imx6ul_init(Object *obj)
object_initialize_child(obj, "gpr", &s->gpr, TYPE_IMX7_GPR);
 
 /*
- * GPIOs 1 to 5
+ * GPIOs
  */
 for (i = 0; i < FSL_IMX6UL_NUM_GPIOS; i++) {
 snprintf(name, NAME_SIZE, "gpio%d", i);
@@ -77,7 +77,7 @@ static void fsl_imx6ul_init(Object *obj)
 }
 
 /*
- * GPT 1, 2
+ * GPTs
  */
 for (i = 0; i < FSL_IMX6UL_NUM_GPTS; i++) {
 snprintf(name, NAME_SIZE, "gpt%d", i);
@@ -85,7 +85,7 @@ static void fsl_imx6ul_init(Object *obj)
 }
 
 /*
- * EPIT 1, 2
+ * EPITs
  */
 for (i = 0; i < FSL_IMX6UL_NUM_EPITS; i++) {
 snprintf(name, NAME_SIZE, "epit%d", i + 1);
@@ -93,7 +93,7 @@ static void fsl_imx6ul_init(Object *obj)
 }
 
 /*
- * eCSPI
+ * eCSPIs
  */
 for (i = 0; i < FSL_IMX6UL_NUM_ECSPIS; i++) {
 snprintf(name, NAME_SIZE, "spi%d", i + 1);
@@ -101,7 +101,7 @@ static void fsl_imx6ul_init(Object *obj)
 }
 
 /*
- * I2C
+ * I2Cs
  */
 for (i = 0; i < FSL_IMX6UL_NUM_I2CS; i++) {
 snprintf(name, NAME_SIZE, "i2c%d", i + 1);
@@ -109,7 +109,7 @@ static void fsl_imx6ul_init(Object *obj)
 }
 
 /*
- * UART
+ * UARTs
  */
 for (i = 0; i < FSL_IMX6UL_NUM_UARTS; i++) {
 snprintf(name, NAME_SIZE, "uart%d", i);
@@ -117,25 +117,31 @@ static void fsl_imx6ul_init(Object *obj)
 }
 
 /*
- * Ethernet
+ * Ethernets
  */
 for (i = 0; i < FSL_IMX6UL_NUM_ETHS; i++) {
 snprintf(name, NAME_SIZE, "eth%d", i);
object_initialize_child(obj, name, &s->eth[i], TYPE_IMX_ENET);
 }
 
-/* USB */
+/*
+ * USB PHYs
+ */
 for (i = 0; i < FSL_IMX6UL_NUM_USB_PHYS; i++) {
 snprintf(name, NAME_SIZE, "usbphy%d", i);
object_initialize_child(obj, name, &s->usbphy[i], TYPE_IMX_USBPHY);
 }
+
+/*
+ * USBs
+ */
 for (i = 0; i < FSL_IMX6UL_NUM_USBS; i++) {
 snprintf(name, NAME_SIZE, "usb%d", i);
object_initialize_child(obj, name, &s->usb[i], TYPE_CHIPIDEA);
 }
 
 /*
- * SDHCI
+ * SDHCIs
  */
 for (i = 0; i < FSL_IMX6UL_NUM_USDHCS; i++) {
 snprintf(name, NAME_SIZE, "usdhc%d", i);
@@ -143,7 +149,7 @@ static void fsl_imx6ul_init(Object *obj)
 }
 
 /*
- * Watchdog
+ * Watchdogs
  */
 for (i = 0; i < FSL_IMX6UL_NUM_WDTS; i++) {
 snprintf(name, NAME_SIZE, "wdt%d", i);
@@ -189,10 +195,10 @@ static void fsl_imx6ul_realize(DeviceState *dev, Error 
**errp)
  * A7MPCORE DAP
  */
 create_unimplemented_device("a7mpcore-dap", FSL_IMX6UL_A7MPCORE_DAP_ADDR,
-0x10);
+FSL_IMX6UL_A7MPCORE_DAP_SIZE);
 
 /*
- * GPT 1, 2
+ * GPTs
  */
 for (i = 0; i < FSL_IMX6UL_NUM_GPTS; i++) {
 static const hwaddr FSL_IMX6UL_GPTn_ADDR[FSL_IMX6UL_NUM_GPTS] = {
@@ -217,7 +223,7 @@ static void fsl_imx6ul_realize(DeviceState *dev, Error 
**errp)
 }
 
 /*
- * EPIT 1, 2
+ * EPITs
  */
 for (i = 0; i < FSL_IMX6UL_NUM_EPITS; i++) {
 static const hwaddr FSL_IMX6UL_EPITn_ADDR[FSL_IMX6UL_NUM_EPITS] = {
@@ -242,7 +248,7 @@ static void fsl_imx6ul_realize(DeviceState *dev, Error 
**errp)
 }
 
 /*
- * GPIO
+ * GPIOs
  */
 for (i = 0; i < FSL_IMX6UL_NUM_GPIOS; i++) {
 static const hwaddr FSL_IMX6UL_GPIOn_ADDR[FSL_IMX6UL_NUM_GPIOS] = {
@@ -286,15 +292,10 @@ static void fsl_imx6ul_realize(DeviceState *dev, Error 
**errp)
 /*
  * IOMUXC and IOMUXC_GPR
  */
-for (i = 0; i < 1; i++) {
-static const hwaddr FSL_IMX6UL_IOMUXCn_ADDR[FSL_IMX6UL_NUM_IOMUXCS] = {
-FSL_IMX6UL_IOMUXC_ADDR,
-FSL_IMX6UL_IOMUXC_GPR_ADDR,
-};
-
-snprintf(name, NAME_SIZE, "iomuxc%d", i);
-create_unimplemented_device(name, FSL_IMX6UL_IOMUXCn_ADDR[i], 0x4000);
-}
+create_unimplemented_device("iomuxc", FSL_IMX6UL_IOMUXC_ADDR,
+FSL_IMX6UL_IOMUXC_SIZE);
+

[PATCH 0/3] Complete i.MX6UL and i.MX7 processor for bare metal application.

2023-07-26 Thread Jean-Christophe Dubois
This patch series adds a few unimplemented TZ devices (TZASC and CSU) to
the i.MX6UL and i.MX7 processors, so that bare-metal applications no
longer hit a "bus error" when accessing these devices.

It also adds some internal memory segments (OCRAM) to the i.MX7 to
allow bare-metal applications to use them.

Last, it adds the SRC device to the i.MX7 processor to allow bare-metal
applications to start the secondary Cortex-A7 core.

Note: When running Linux inside Qemu, the secondary core is started by
calling the PSCI API, and Qemu emulates PSCI without needing access to
the SRC device. This is why Linux uses both cores in Qemu even though
the SRC is not implemented. This is not the case for bare-metal
applications (like U-Boot itself) that do not rely on the PSCI service
being available.

Jean-Christophe Dubois (3):
  Rework i.MX6UL device implementation/instantiation
  Rework i.MX7 device implementation/instantiation
  Add i.MX7 SRC device implementation

 hw/arm/fsl-imx6ul.c | 163 -
 hw/arm/fsl-imx7.c   | 204 -
 hw/misc/imx7_src.c  | 289 ++
 hw/misc/meson.build |   1 +
 include/hw/arm/fsl-imx6ul.h | 149 +--
 include/hw/arm/fsl-imx7.h   | 348 +++-
 include/hw/misc/imx7_src.h  |  68 +++
 7 files changed, 1038 insertions(+), 184 deletions(-)
 create mode 100644 hw/misc/imx7_src.c
 create mode 100644 include/hw/misc/imx7_src.h

-- 
2.34.1
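
To illustrate the last point of the cover letter, here is a minimal sketch of
how a SRC register write could bring the secondary Cortex-A7 online through
QEMU's arm-powerctl helper. The choice of GPR registers for the entry point
and the exact bit decoding are assumptions for illustration only; the actual
handler in patch 3/3 may well use a different mechanism (e.g.
arm_set_cpu_on_and_reset()).

#include "qemu/osdep.h"
#include "hw/misc/imx7_src.h"
#include "target/arm/arm-powerctl.h"

/* Sketch: when the guest releases core 1 from reset, start it at the
 * entry point it previously stashed in the SRC GPR registers. */
static void imx7_src_start_core1(IMX7SRCState *s)
{
    uint64_t entry = s->regs[SRC_GPR3];      /* assumed: boot address  */
    uint64_t ctx   = s->regs[SRC_GPR4];      /* assumed: boot argument */

    /* cpuid 1, enter at EL1, AArch32 secondary (last argument false) */
    arm_set_cpu_on(1, entry, ctx, 1, false);
}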




Re: [PATCH v2] block/blkio: do not use open flags in qemu_open()

2023-07-26 Thread Stefan Hajnoczi
On Wed, Jul 26, 2023 at 09:48:07AM +0200, Stefano Garzarella wrote:
> qemu_open() in blkio_virtio_blk_common_open() is used to open the
> character device (e.g. /dev/vhost-vdpa-0 or /dev/vfio/vfio) or in
> the future eventually the unix socket.
> 
> In all these cases we cannot open the path in read-only mode,
> when the `read-only` option of blockdev is on, because the exchange
> of IOCTL commands for example will fail.
> 
> In order to open the device read-only, we have to use the `read-only`
> property of the libblkio driver as we already do in blkio_file_open().
> 
> Fixes: cad2ccc395 ("block/blkio: use qemu_open() to support fd passing for 
> virtio-blk")
> Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2225439
> Reported-by: Qing Wang 
> Signed-off-by: Stefano Garzarella 
> ---
> 
> Notes:
> v2:
> - added comment on top of qemu_open() [Daniel]
> 
> v1: 
> https://lore.kernel.org/qemu-devel/2023072555.85426-1-sgarz...@redhat.com/
> 
>  block/blkio.c | 21 -
>  1 file changed, 12 insertions(+), 9 deletions(-)

Thanks, applied to my block tree:
https://gitlab.com/stefanha/qemu/commits/block

Stefan


signature.asc
Description: PGP signature
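
A short sketch of the idea in the commit message, for readers who do not have
block/blkio.c at hand: the character device itself is opened read-write (the
ioctls need that), and read-only is enforced through the libblkio "read-only"
property instead of the open(2) flags. Simplified here with a plain open()
call; the real code goes through qemu_open() for fd passing.

#include <blkio.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>

static int connect_readonly_via_property(struct blkio *b, const char *path,
                                         bool read_only)
{
    int fd = open(path, O_RDWR);        /* not O_RDONLY, see above */

    if (fd < 0) {
        return -errno;
    }
    if (blkio_set_int(b, "fd", fd) < 0 ||
        blkio_set_bool(b, "read-only", read_only) < 0) {
        return -EINVAL;
    }
    return blkio_connect(b);
}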


Re: [PATCH] ui/dbus: fix clang compilation issue

2023-07-26 Thread Thomas Huth

On 26/07/2023 17.12, marcandre.lur...@redhat.com wrote:

From: Marc-André Lureau 

../ui/dbus-listener.c:236:9: error: expected expression
 Error *err = NULL;

See:
https://gitlab.com/qemu-project/qemu/-/issues/1782#note_1488517427

Signed-off-by: Marc-André Lureau 
---
  ui/dbus-listener.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/ui/dbus-listener.c b/ui/dbus-listener.c
index 02fc6ae239..30917271ab 100644
--- a/ui/dbus-listener.c
+++ b/ui/dbus-listener.c
@@ -232,7 +232,7 @@ static void dbus_call_update_gl(DisplayChangeListener *dcl,
  egl_fb_read_rect(ddl->ds, &ddl->fb, x, y, w, h);
  dbus_gfx_update(dcl, x, y, w, h);
  break;
-case SHARE_KIND_D3DTEX:
+case SHARE_KIND_D3DTEX: {
  Error *err = NULL;
  assert(ddl->d3d_texture);
  
@@ -249,6 +249,7 @@ static void dbus_call_update_gl(DisplayChangeListener *dcl,

  dbus_update_gl_cb,
  g_object_ref(ddl));
  break;
+}
  default:
  g_warn_if_reached();
  }


Reviewed-by: Thomas Huth 
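
For anyone puzzled by the error message: in C (before C23), a label such as
`case` must be followed by a statement, and a declaration is not a statement,
so clang rejects a declaration placed directly after the case label; the added
braces introduce a compound statement in which the declaration is legal. A
stripped-down illustration, unrelated to the dbus code apart from the shape of
the switch:

void example(int kind)
{
    switch (kind) {
    case 0: {               /* block added so the declaration is legal */
        int x = 1;
        (void)x;
        break;
    }
    default:
        break;
    }
    /* Without the braces, "case 0: int x = 1;" is what clang reports as
     * "expected expression" in C11 mode. */
}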




Re: [PATCH 1/2] block/blkio: fix opening virtio-blk drivers

2023-07-26 Thread Stefan Hajnoczi
On Wed, Jul 26, 2023 at 09:26:45AM +0200, Stefano Garzarella wrote:
> On Tue, Jul 25, 2023 at 04:00:38PM -0400, Stefan Hajnoczi wrote:
> > On Mon, Jul 24, 2023 at 05:46:10PM +0200, Stefano Garzarella wrote:
> > > libblkio 1.3.0 added support of "fd" property for virtio-blk-vhost-vdpa
> > > driver. In QEMU, starting from commit cad2ccc395 ("block/blkio: use
> > > qemu_open() to support fd passing for virtio-blk") we are using
> > > `blkio_get_int(..., "fd")` to check if the "fd" property is supported
> > > for all the virtio-blk-* driver.
> > > 
> > > Unfortunately that property is also available for those driver that do
> > > not support it, such as virtio-blk-vhost-user. Indeed now QEMU is
> > > failing if used with virtio-blk-vhost-user in this way:
> > > 
> > >-blockdev 
> > > node-name=drive0,driver=virtio-blk-vhost-user,path=vhost-user-blk.sock,cache.direct=on:
> > >  Could not open 'vhost-user-blk.sock': No such device or address
> > > 
> > > So, `blkio_get_int()` is not enough to check whether the driver supports
> > > the `fd` property or not. This is because the virtio-blk common libblkio
> > > driver only checks whether or not `fd` is set during `blkio_connect()`
> > > and fails for those transports that do not support it (all except
> > > vhost-vdpa for now).
> > > 
> > > So for now let's also check that the driver is virtio-blk-vhost-vdpa,
> > > since that's the only one that supports it.
> > 
> > What happens when more virtio-blk-* libblkio drivers gain support for
> > `fd`? I think we'll be back to the same problem because QEMU will be
> > unable to distinguish between old and new libraries.
> 
> If we release a v1.3.1 version of libblkio with
> https://gitlab.com/libblkio/libblkio/-/merge_requests/208
> we can set a minimum requirement in QEMU and use blkio_set_fd() here.
> 
> > 
> > How about retrying with `path` if opening with `fd` fails?
> 
> IIUC the only way is to check if blkio_connect() will fail with -EINVAL,
> that can also be generated by other issues, then retry forcing `path`.
> 
> Do you see other ways?

No. Checking for -EINVAL and then retrying with `path` is what I had in
mind.

> The code wouldn't be great, but we could do it for now and then when
> we release a new version of libblkio, do the revert and use
> blkio_set_int(fd) to see if it's supported or not.
> 
> I don't know if it is worth it, or if it is better to merge this,
> release libblkio v1.3.1 and force the minimum requirement.
> 
> WDYT?

I prefer retrying on -EINVAL because even if libblkio 1.3.1 is released
today, we don't have control over when it becomes available in distros.

Stefan


signature.asc
Description: PGP signature


Re: [PATCH] semihosting/uaccess.c: Replaced a malloc call with g_malloc.

2023-07-26 Thread Richard Henderson

On 7/26/23 02:43, Peter Maydell wrote:

(Something went wrong with the quoting in your email. I've
fixed it up.)

On Wed, 26 Jul 2023 at 05:38,  wrote:

Peter Maydell wrote:

The third part here, is that g_malloc() does not ever
fail -- it will abort() on out of memory. However
the code here is still handling g_malloc() returning NULL.
The equivalent for "we expect this might fail" (which we want
here, because the guest is passing us the length of memory
to try to allocate) is g_try_malloc().



g_malloc() is preferred over the g_try_* functions, which return NULL on
error, when the size of the requested allocation is small.
This is because allocating a few bytes should not be a problem in a healthy
system.


This is true. But in this particular case we cannot be sure
that the size of the allocation is small, because the size
is controlled by the guest. So we want g_try_malloc().


And why do we want to use g_try_malloc instead of just sticking with malloc?

I see no reason to change anything at all here.


r~
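
For reference, the distinction being debated, as a small self-contained sketch
(not the actual semihosting code): g_try_malloc() is for sizes the guest
controls, where failure must be reported back instead of aborting, while
g_malloc() aborts on out-of-memory and is fine for small internal allocations.

#include <glib.h>

static void *alloc_guest_sized_buffer(gsize guest_len)
{
    void *buf = g_try_malloc(guest_len);   /* may return NULL */

    if (!buf) {
        /* report an error to the guest rather than aborting QEMU */
        return NULL;
    }
    return buf;
}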




[PATCH v2] softmmu/physmem: try opening file readonly before failure in file_ram_open

2023-07-26 Thread Thiner Logoer
Users may give "-mem-path" a read-only file and expect the file to be
mapped read-write privately. Allow this, but emit a warning, since other
users may be surprised when the RAM file is read-only and qemu suddenly
aborts elsewhere.

Suggested-by: David Hildenbrand 
Signed-off-by: Thiner Logoer 
---

See the previous version at:
https://lore.kernel.org/qemu-devel/96a462ec-6f9d-fd83-f697-73e132432...@redhat.com/T/

Verified: this patch works for my setup; both the functionality and the
warning are the expected behavior.

Another problem I noticed while looking at file_ram_open:

When readonly is true and the path is a directory, the open will succeed but
any later operations will fail since it is a directory fd. This may require
additional commits, which is out of my scope; I merely record the question
here.

 softmmu/physmem.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/softmmu/physmem.c b/softmmu/physmem.c
index 3df73542e1..e8279d69d4 100644
--- a/softmmu/physmem.c
+++ b/softmmu/physmem.c
@@ -1296,6 +1296,7 @@ static int file_ram_open(const char *path,
 char *sanitized_name;
 char *c;
 int fd = -1;
+bool first_trial = true;
 
 *created = false;
 for (;;) {
@@ -1332,6 +1333,18 @@ static int file_ram_open(const char *path,
 break;
 }
 g_free(filename);
+} else if (first_trial && !readonly && errno == EACCES) {
+/* @path may be a read only file */
+fd = open(path, O_RDONLY);
+if (fd >= 0) {
+/*
+ * Sometimes this behavior is not desired. Fire a warning but
+ * continue.
+ */
+warn_report("backing store %s is opened readonly because the"
+            " file is not writable", path);
+break;
+}
 }
 if (errno != EEXIST && errno != EINTR) {
 error_setg_errno(errp, errno,
@@ -1343,6 +1356,7 @@ static int file_ram_open(const char *path,
  * Try again on EINTR and EEXIST.  The latter happens when
  * something else creates the file between our two open().
  */
+first_trial = false;
 }
 
 return fd;
-- 
2.40.1




[PATCH] ui/dbus: fix clang compilation issue

2023-07-26 Thread marcandre . lureau
From: Marc-André Lureau 

../ui/dbus-listener.c:236:9: error: expected expression
Error *err = NULL;

See:
https://gitlab.com/qemu-project/qemu/-/issues/1782#note_1488517427

Signed-off-by: Marc-André Lureau 
---
 ui/dbus-listener.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/ui/dbus-listener.c b/ui/dbus-listener.c
index 02fc6ae239..30917271ab 100644
--- a/ui/dbus-listener.c
+++ b/ui/dbus-listener.c
@@ -232,7 +232,7 @@ static void dbus_call_update_gl(DisplayChangeListener *dcl,
 egl_fb_read_rect(ddl->ds, &ddl->fb, x, y, w, h);
 dbus_gfx_update(dcl, x, y, w, h);
 break;
-case SHARE_KIND_D3DTEX:
+case SHARE_KIND_D3DTEX: {
 Error *err = NULL;
 assert(ddl->d3d_texture);
 
@@ -249,6 +249,7 @@ static void dbus_call_update_gl(DisplayChangeListener *dcl,
 dbus_update_gl_cb,
 g_object_ref(ddl));
 break;
+}
 default:
 g_warn_if_reached();
 }
-- 
2.41.0




[PATCH 1/1] qemu-nbd: regression with arguments passing into nbd_client_thread()

2023-07-26 Thread Denis V. Lunev
Unfortunately
commit 03b67621445d601c9cdc7dfe25812e9f19b81488
Author: Denis V. Lunev 
Date:   Mon Jul 17 16:55:40 2023 +0200
qemu-nbd: pass structure into nbd_client_thread instead of plain char*
has introduced a regression. struct NbdClientOpts resides on the stack
inside an 'if' block. This specifically means that this stack space could
be reused once execution leaves that block of code.

This means that the parameters passed into nbd_client_thread could be
overwritten at any moment.

The patch moves the data to the scope of the main() function, effectively
preserving it for the whole process lifetime.

Signed-off-by: Denis V. Lunev 
CC: Eric Blake 
CC: Vladimir Sementsov-Ogievskiy 
CC: 
---
 qemu-nbd.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/qemu-nbd.c b/qemu-nbd.c
index 5b2757920c..7a15085ade 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -589,6 +589,7 @@ int main(int argc, char **argv)
 const char *pid_file_name = NULL;
 const char *selinux_label = NULL;
 BlockExportOptions *export_opts;
+struct NbdClientOpts opts;
 
 #ifdef CONFIG_POSIX
 os_setup_early_signal_handling();
@@ -1145,7 +1146,7 @@ int main(int argc, char **argv)
 if (device) {
 #if HAVE_NBD_DEVICE
 int ret;
-struct NbdClientOpts opts = {
+opts = (struct NbdClientOpts) {
 .device = device,
 .fork_process = fork_process,
 .verbose = verbose,
-- 
2.34.1
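
A stripped-down, self-contained illustration of the lifetime bug described
above, with hypothetical names: the address of a block-scoped object must not
be handed to a thread that can outlive the block, which is exactly what moving
opts up to main()'s scope fixes.

#include <pthread.h>

struct opts { int verbose; };

static void *worker(void *arg)
{
    struct opts *o = arg;            /* dangles if *arg has gone out of scope */
    return (void *)(long)o->verbose;
}

int main(void)
{
    pthread_t tid;
    struct opts opts;                /* lives as long as the process, like the fix */

    {
        /* Buggy variant: declaring "struct opts opts = { .verbose = 1 };"
         * inside this inner block would leave the thread reading dead
         * stack space once the block is left. */
        opts = (struct opts) { .verbose = 1 };
        pthread_create(&tid, NULL, worker, &opts);
    }

    pthread_join(tid, NULL);
    return 0;
}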




Re: [PATCH] migration: Allow user to specify migration available bandwidth

2023-07-26 Thread Peter Xu
On Tue, Jul 25, 2023 at 06:10:18PM +0100, Daniel P. Berrangé wrote:
> On Tue, Jul 25, 2023 at 12:38:23PM -0400, Peter Xu wrote:
> > I see you used "convergance" explicitly even after PeterM's reply, is that
> > what you prefer over "convergence"?  I do see more occurances of
> > "convergence" as a word in migration context, though.
> 
> Ignore my speling erors :-)

Ohh, so that's not intended. :) Actually there's indeed the word
"convergance" which can be applied here, but since I'm not native I really
can't figure out the fact even with a dictionary.

> 
> >   Any better name you
> > can come up with, before I just go with "max-convergence-bandwidth" (I
> > really cannot come up with anything better than this or available-bandwidth
> > for now)?
> 
> Anothre idea could be 'max-switchover-bandwidth'  ?

Yeah, switchover can also be called a phase of migration, so in parallel with
the existing precopy/postcopy it sounds like a good one.  I'll respin with
that if there are no further suggestions.  Thanks.

-- 
Peter Xu




Re: [PATCH v10 08/10] migration: Implement MigrateChannelList to qmp migration flow.

2023-07-26 Thread Daniel P . Berrangé
On Wed, Jul 26, 2023 at 02:18:31PM +, Het Gala wrote:
> Integrate MigrateChannelList with all transport backends
> (socket, exec and rdma) for both src and dest migration
> endpoints for qmp migration.
> 
> For current series, limit the size of MigrateChannelList
> to single element (single interface) as runtime check.
> 
> Suggested-by: Aravind Retnakaran 
> Signed-off-by: Het Gala 
> ---
>  migration/migration.c | 95 +++
>  1 file changed, 52 insertions(+), 43 deletions(-)

Reviewed-by: Daniel P. Berrangé 


With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




[PULL v2 5/5] qapi: Reformat recent doc comments to conform to current conventions

2023-07-26 Thread Markus Armbruster
Since commit a937b6aa739 (qapi: Reformat doc comments to conform to
current conventions), a number of comments not conforming to the
current formatting conventions were added.  No problem, just sweep
the entire documentation once more.

To check the generated documentation does not change, I compared the
generated HTML before and after this commit with "wdiff -3".  Finds no
differences.  Comparing with diff is not useful, as the reflowed
paragraphs are visible there.

Signed-off-by: Markus Armbruster 
Message-ID: <20230720071610.1096458-7-arm...@redhat.com>
---
 qapi/block-core.json | 78 +++-
 qapi/block.json  |  4 +--
 qapi/cxl.json|  4 +--
 qapi/machine-target.json |  2 +-
 qapi/migration.json  | 10 +++---
 qapi/net.json|  1 -
 qapi/qom.json|  9 ++---
 qapi/trace.json  |  2 ++
 qapi/ui.json |  2 +-
 9 files changed, 55 insertions(+), 57 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index dcfd54d15c..2b1d493d6e 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -136,7 +136,7 @@
 #
 # @filename: Name of the extent file
 #
-# @format: Extent type (e.g.  FLAT or SPARSE)
+# @format: Extent type (e.g. FLAT or SPARSE)
 #
 # @virtual-size: Number of bytes covered by this extent
 #
@@ -853,9 +853,8 @@
 # @min_wr_latency_ns: Minimum latency of write operations in the
 # defined interval, in nanoseconds.
 #
-# @min_zone_append_latency_ns: Minimum latency of zone append operations
-#  in the defined interval, in nanoseconds
-#  (since 8.1)
+# @min_zone_append_latency_ns: Minimum latency of zone append
+# operations in the defined interval, in nanoseconds (since 8.1)
 #
 # @min_flush_latency_ns: Minimum latency of flush operations in the
 # defined interval, in nanoseconds.
@@ -866,9 +865,8 @@
 # @max_wr_latency_ns: Maximum latency of write operations in the
 # defined interval, in nanoseconds.
 #
-# @max_zone_append_latency_ns: Maximum latency of zone append operations
-#  in the defined interval, in nanoseconds
-#  (since 8.1)
+# @max_zone_append_latency_ns: Maximum latency of zone append
+# operations in the defined interval, in nanoseconds (since 8.1)
 #
 # @max_flush_latency_ns: Maximum latency of flush operations in the
 # defined interval, in nanoseconds.
@@ -879,9 +877,8 @@
 # @avg_wr_latency_ns: Average latency of write operations in the
 # defined interval, in nanoseconds.
 #
-# @avg_zone_append_latency_ns: Average latency of zone append operations
-#  in the defined interval, in nanoseconds
-#  (since 8.1)
+# @avg_zone_append_latency_ns: Average latency of zone append
+# operations in the defined interval, in nanoseconds (since 8.1)
 #
 # @avg_flush_latency_ns: Average latency of flush operations in the
 # defined interval, in nanoseconds.
@@ -893,8 +890,7 @@
 # the defined interval.
 #
 # @avg_zone_append_queue_depth: Average number of pending zone append
-#   operations in the defined interval
-#   (since 8.1).
+# operations in the defined interval (since 8.1).
 #
 # Since: 2.5
 ##
@@ -919,8 +915,8 @@
 #
 # @wr_bytes: The number of bytes written by the device.
 #
-# @zone_append_bytes: The number of bytes appended by the zoned devices
-# (since 8.1)
+# @zone_append_bytes: The number of bytes appended by the zoned
+# devices (since 8.1)
 #
 # @unmap_bytes: The number of bytes unmapped by the device (Since 4.2)
 #
@@ -930,8 +926,8 @@
 # @wr_operations: The number of write operations performed by the
 # device.
 #
-# @zone_append_operations: The number of zone append operations performed
-#  by the zoned devices (since 8.1)
+# @zone_append_operations: The number of zone append operations
+# performed by the zoned devices (since 8.1)
 #
 # @flush_operations: The number of cache flush operations performed by
 # the device (since 0.15)
@@ -946,7 +942,7 @@
 # 0.15).
 #
 # @zone_append_total_time_ns: Total time spent on zone append writes
-# in nanoseconds (since 8.1)
+# in nanoseconds (since 8.1)
 #
 # @flush_total_time_ns: Total time spent on cache flushes in
 # nanoseconds (since 0.15).
@@ -965,8 +961,8 @@
 # @wr_merged: Number of write requests that have been merged into
 # another request (Since 2.3).
 #
-# @zone_append_merged: Number of zone append requests that have been merged
-#  into another request (since 8.1)
+# @zone_append_merged: Number of zone append requests that have been
+# merged into another request (since 8.1)
 #
 # @unmap_merged: Number of unmap requests that have been merged into
 # another request (Since 4.2)
@@ -981,9 +977,8 @@
 # 

[PATCH 11/44] Introduce Raspberry PI 4 machine

2023-07-26 Thread Sergey Kambalin
Signed-off-by: Sergey Kambalin 
---
 hw/arm/bcm2835_peripherals.c|  20 +++-
 hw/arm/bcm2836.c|   2 +
 hw/arm/bcm2838.c|   2 +
 hw/arm/meson.build  |   2 +-
 hw/arm/raspi.c  |  28 +++--
 hw/arm/raspi4b.c| 182 
 include/hw/arm/raspi_platform.h |  11 ++
 include/hw/display/bcm2835_fb.h |   2 +
 8 files changed, 235 insertions(+), 14 deletions(-)
 create mode 100644 hw/arm/raspi4b.c

diff --git a/hw/arm/bcm2835_peripherals.c b/hw/arm/bcm2835_peripherals.c
index 4c0c0b1e7d..9e4153936f 100644
--- a/hw/arm/bcm2835_peripherals.c
+++ b/hw/arm/bcm2835_peripherals.c
@@ -108,6 +108,7 @@ static void raspi_peripherals_base_init(Object *obj)
 /* Framebuffer */
 object_initialize_child(obj, "fb", >fb, TYPE_BCM2835_FB);
 object_property_add_alias(obj, "vcram-size", OBJECT(>fb), "vcram-size");
+object_property_add_alias(obj, "vcram-base", OBJECT(>fb), "vcram-base");
 
 object_property_add_const_link(OBJECT(>fb), "dma-mr",
OBJECT(>gpu_bus_mr));
@@ -225,7 +226,7 @@ void raspi_peripherals_common_realize(DeviceState *dev, 
Error **errp)
 Object *obj;
 MemoryRegion *ram;
 Error *err = NULL;
-uint64_t ram_size, vcram_size;
+uint64_t ram_size, vcram_size, vcram_base;
 int n;
 
 obj = object_property_get_link(OBJECT(dev), "ram", &error_abort);
@@ -329,11 +330,24 @@ void raspi_peripherals_common_realize(DeviceState *dev, 
Error **errp)
 return;
 }
 
-if (!object_property_set_uint(OBJECT(&s->fb), "vcram-base",
-  ram_size - vcram_size, errp)) {
+vcram_base = object_property_get_uint(OBJECT(s), "vcram-base", &err);
+if (err) {
+error_propagate(errp, err);
 return;
 }
 
+if (vcram_base == 0) {
+vcram_base = (ram_size > UPPER_RAM_BASE ? UPPER_RAM_BASE : ram_size)
+- vcram_size;
+} else {
+if (vcram_base + vcram_size > UPPER_RAM_BASE) {
+vcram_base = UPPER_RAM_BASE - vcram_size;
+}
+}
+if (!object_property_set_uint(OBJECT(&s->fb), "vcram-base", vcram_base,
+  errp)) {
+return;
+}
 if (!sysbus_realize(SYS_BUS_DEVICE(&s->fb), errp)) {
 return;
 }
diff --git a/hw/arm/bcm2836.c b/hw/arm/bcm2836.c
index 8beafb97f0..34883d29ff 100644
--- a/hw/arm/bcm2836.c
+++ b/hw/arm/bcm2836.c
@@ -64,6 +64,8 @@ static void bcm283x_init(Object *obj)
   "command-line");
 object_property_add_alias(obj, "vcram-size", OBJECT(>peripherals),
   "vcram-size");
+object_property_add_alias(obj, "vcram-base", OBJECT(>peripherals),
+  "vcram-base");
 }
 
 bool bcm283x_common_realize(DeviceState *dev, RaspiPeripheralBaseState *ps,
diff --git a/hw/arm/bcm2838.c b/hw/arm/bcm2838.c
index c687f38a39..a1980cc181 100644
--- a/hw/arm/bcm2838.c
+++ b/hw/arm/bcm2838.c
@@ -62,6 +62,8 @@ static void bcm2838_init(Object *obj)
   "board-rev");
 object_property_add_alias(obj, "vcram-size", OBJECT(>peripherals),
   "vcram-size");
+object_property_add_alias(obj, "vcram-base", OBJECT(>peripherals),
+  "vcram-base");
 object_property_add_alias(obj, "command-line", OBJECT(>peripherals),
   "command-line");
 
diff --git a/hw/arm/meson.build b/hw/arm/meson.build
index 071819b527..768b2608c1 100644
--- a/hw/arm/meson.build
+++ b/hw/arm/meson.build
@@ -39,7 +39,7 @@ arm_ss.add(when: 'CONFIG_ALLWINNER_A10', if_true: 
files('allwinner-a10.c', 'cubi
 arm_ss.add(when: 'CONFIG_ALLWINNER_H3', if_true: files('allwinner-h3.c', 
'orangepi.c'))
 arm_ss.add(when: 'CONFIG_ALLWINNER_R40', if_true: files('allwinner-r40.c', 
'bananapi_m2u.c'))
 arm_ss.add(when: 'CONFIG_RASPI', if_true: files('bcm2836.c', 'raspi.c'))
-arm_ss.add(when: ['CONFIG_RASPI', 'TARGET_AARCH64'], if_true: 
files('bcm2838.c'))
+arm_ss.add(when: ['CONFIG_RASPI', 'TARGET_AARCH64'], if_true: 
files('bcm2838.c', 'raspi4b.c'))
 arm_ss.add(when: 'CONFIG_STM32F100_SOC', if_true: files('stm32f100_soc.c'))
 arm_ss.add(when: 'CONFIG_STM32F205_SOC', if_true: files('stm32f205_soc.c'))
 arm_ss.add(when: 'CONFIG_STM32F405_SOC', if_true: files('stm32f405_soc.c'))
diff --git a/hw/arm/raspi.c b/hw/arm/raspi.c
index 7d04734cd2..da1e9e7c13 100644
--- a/hw/arm/raspi.c
+++ b/hw/arm/raspi.c
@@ -18,6 +18,7 @@
 #include "qapi/error.h"
 #include "hw/arm/boot.h"
 #include "hw/arm/bcm2836.h"
+#include "hw/arm/bcm2838.h"
 #include "hw/arm/raspi_platform.h"
 #include "hw/registerfields.h"
 #include "qemu/error-report.h"
@@ -61,6 +62,7 @@ typedef enum RaspiProcessorId {
 PROCESSOR_ID_BCM2835 = 0,
 PROCESSOR_ID_BCM2836 = 1,
 PROCESSOR_ID_BCM2837 = 2,
+PROCESSOR_ID_BCM2838 = 3,
 } RaspiProcessorId;
 
 static const struct {
@@ -70,13 +72,9 @@ static const struct {
 

Re: [virtio-dev] [RFC PATCH] docs/interop: define STANDALONE protocol feature for vhost-user

2023-07-26 Thread Erik Schilling
On Tue Jul 4, 2023 at 4:54 PM CEST, Stefano Garzarella wrote:
> On Tue, Jul 04, 2023 at 01:36:00PM +0100, Alex Bennée wrote:
> >Currently QEMU has to know some details about the back-end to be able
> >to setup the guest. While various parts of the setup can be delegated
> >to the backend (for example config handling) this is a very piecemeal
> >approach.
> >
> >This patch suggests a new feature flag (VHOST_USER_PROTOCOL_F_STANDALONE)
> >which the back-end can advertise which allows a probe message to be
> >sent to get all the details QEMU needs to know in one message.
> >
> >Signed-off-by: Alex Bennée 
> >
> >---
> >Initial RFC for discussion. I intend to prototype this work with QEMU
> >and one of the rust-vmm vhost-user daemons.
>
> Thanks for starting this discussion!
>
> I'm comparing with vhost-vdpa IOCTLs, so my questions may be
> superficial, but they help me understand the differences.
>
> >---
> > docs/interop/vhost-user.rst | 37 +
> > hw/virtio/vhost-user.c  |  8 
> > 2 files changed, 45 insertions(+)
> >
> >diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
> >index 5a070adbc1..85b1b1583a 100644
> >--- a/docs/interop/vhost-user.rst
> >+++ b/docs/interop/vhost-user.rst
> >@@ -275,6 +275,21 @@ Inflight description
> >
> > :queue size: a 16-bit size of virtqueues
> >
> >+Backend specifications
> >+^^
> >+
> >++---+-+++
> >+| device id | config size |   min_vqs  |   max_vqs  |
> >++---+-+++
> >+
> >+:device id: a 32-bit value holding the VirtIO device ID
> >+
> >+:config size: a 32-bit value holding the config size (see 
> >``VHOST_USER_GET_CONFIG``)
> >+
> >+:min_vqs: a 32-bit value holding the minimum number of vqs supported
>
> Why do we need the minimum?
>
> >+
> >+:max_vqs: a 32-bit value holding the maximum number of vqs supported, must 
> >be >= min_vqs
>
> Is this overlap with VHOST_USER_GET_QUEUE_NUM?

While fiddling with a rust-vmm implementation of this I wondered:

Would a standalone daemon even need VHOST_USER_PROTOCOL_F_MQ if
VHOST_USER_PROTOCOL_F_CONFIG is required anyway? It looks like
all full virtio devices provide config information that already allows
deriving the exact or maximum number of queues. So wouldn't
VHOST_USER_PROTOCOL_F_MQ just provide a second, redundant way to get
the information? (And this would be a third?)

Am I missing something here? Otherwise, I think we could drop the dependency
on VHOST_USER_PROTOCOL_F_MQ and get the info from the device config for
standalone daemons?

- Erik
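
To make the point above concrete, here is a hedged sketch using virtio-net as
the example: its config space carries max_virtqueue_pairs, from which a
front-end could derive the maximum queue count without a separate
VHOST_USER_GET_QUEUE_NUM round trip. The struct below is a deliberately
abbreviated stand-in (only the fields needed here), not the full
virtio_net_config layout, and derive_max_vqs() is a made-up helper name.

#include <stdint.h>
#include <stdio.h>

struct virtio_net_config_subset {
    uint8_t  mac[6];
    uint16_t status;
    uint16_t max_virtqueue_pairs;  /* valid when VIRTIO_NET_F_MQ is offered */
};

static unsigned derive_max_vqs(const struct virtio_net_config_subset *cfg,
                               int has_ctrl_vq)
{
    /* one RX and one TX queue per pair, plus an optional control queue */
    return 2u * cfg->max_virtqueue_pairs + (has_ctrl_vq ? 1u : 0u);
}

int main(void)
{
    struct virtio_net_config_subset cfg = { .max_virtqueue_pairs = 8 };

    printf("max vqs: %u\n", derive_max_vqs(&cfg, 1));   /* prints 17 */
    return 0;
}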



Re: How to tame CI?

2023-07-26 Thread Juan Quintela
Daniel P. Berrangé  wrote:
> On Wed, Jul 26, 2023 at 02:00:03PM +0100, Peter Maydell wrote:
>> On Wed, 26 Jul 2023 at 13:06, Juan Quintela  wrote:
>> > To make things easier, this is the part that show how it breaks (this is
>> > the gcov test):
>> >
>> > 357/423 qemu:block / io-qcow2-copy-before-write
>> > ERROR   6.38s   exit status 1
>> > >>> PYTHON=/builds/juan.quintela/qemu/build/pyvenv/bin/python3
>> > MALLOC_PERTURB_=44
>> > /builds/juan.quintela/qemu/build/pyvenv/bin/python3
>> > /builds/juan.quintela/qemu/build/../tests/qemu-iotests/check -tap
>> > -qcow2 copy-before-write --source-dir
>> > /builds/juan.quintela/qemu/tests/qemu-iotests --build-dir
>> > /builds/juan.quintela/qemu/build/tests/qemu-iotests
>> > ― ✀  
>> > ―
>> > stderr:
>> > --- 
>> > /builds/juan.quintela/qemu/tests/qemu-iotests/tests/copy-before-write.out
>> > +++ 
>> > /builds/juan.quintela/qemu/build/scratch/qcow2-file-copy-before-write/copy-before-write.out.bad
>> > @@ -1,5 +1,21 @@
>> > -
>> > +...F
>> > +==
>> > +FAIL: test_timeout_break_snapshot (__main__.TestCbwError)
>> > +--
>> > +Traceback (most recent call last):
>> > +  File 
>> > "/builds/juan.quintela/qemu/tests/qemu-iotests/tests/copy-before-write", 
>> > line 210, in test_timeout_break_snapshot
>> > +self.assertEqual(log, """\
>> > +AssertionError: 'wrot[195 chars]read 1048576/1048576 bytes at
>> > offset 0\n1 MiB,[46 chars]c)\n' != 'wrot[195 chars]read failed:
>> > Permission denied\n'
>> > +  wrote 524288/524288 bytes at offset 0
>> > +  512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>> > +  wrote 524288/524288 bytes at offset 524288
>> > +  512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>> > ++ read failed: Permission denied
>> > +- read 1048576/1048576 bytes at offset 0
>> > +- 1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>> > +
>> 
>> This iotest failing is an intermittent that I've seen running
>> pullreqs on master. I tend to see it on the s390 host. I
>> suspect a race condition somewhere where it fails if the host
>> is heavily loaded.

What is weird to me is that I was unable to reproduce it on the previous
commit.  But with this one it happened every time.  No, I have no clue why,
and as said, it makes zero sense, since it is in a binary that is not even
used by the block test.

Later, Juan.

>
> Since it is known flakey, we should just commit the change
>
> --- a/tests/qemu-iotests/tests/copy-before-write
> +++ b/tests/qemu-iotests/tests/copy-before-write
> @@ -1,5 +1,5 @@
>  #!/usr/bin/env python3
> -# group: auto backup
> +# group: backup
>  #
>  # Copyright (c) 2022 Virtuozzo International GmbH
>  #
>
>
> and if someone wants to re-enable it, they get the job of fixing its
> reliability first.
>
> With regards,
> Daniel




Re: [RFC PATCH] target/i386: Truncate ESP when exiting from long mode

2023-07-26 Thread Richard Henderson

On 7/26/23 01:17, Ard Biesheuvel wrote:

While working on some EFI boot changes for Linux/x86, I noticed that TCG
deviates from bare metal when it comes to how it handles the value of the
stack pointer register RSP when dropping out of long mode.

On bare metal, RSP is truncated to 32 bits, even if the code that runs in
32-bit protected mode never uses the stack at all (and uses a long jump
rather than a long return to switch back to long mode). This means 64-bit
code cannot rely on RSP surviving any excursions into 32-bit protected mode
(with paging disabled).

Let's align TCG with this behavior, so that code that relies on RSP
retaining its value does not inadvertently work under TCG while it does not
on bare metal.

Observed on Intel Ice Lake cores.

Cc: Paolo Bonzini 
Cc: Richard Henderson 
Cc: Eduardo Habkost 
Link: https://lore.kernel.org/all/20230711091453.2543622-11-a...@kernel.org/
Signed-off-by: Ard Biesheuvel 
---
I used this patch locally to reproduce an issue that was reported on Ice Lake
but didn't trigger in my QEMU testing.

Hints welcome on where the architectural behavior is specified, and in
particular, whether or not other 64-bit GPRs can be relied upon to preserve
their full 64-bit length values.


No idea about chapter and verse, but it has the feel of being part and parcel
with the truncation of eip.  While esp is always special, I suspect that none
of the GPRs can be relied on to carry all bits.


I'm happy with the change though, since similar behaviour can be observed on hw.

Acked-by: Richard Henderson 


r~
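
To see the described truncation in isolation, a tiny self-contained
illustration of the semantics discussed above (an assumption drawn from the
description, not the actual target/i386 patch, which is not quoted in this
thread; rsp_after_long_mode_exit() is a made-up name):

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Model of the observed behaviour: the upper half of RSP does not survive
 * a round trip through 32-bit protected mode, so 64-bit code has to
 * re-establish its stack pointer after returning to long mode. */
static uint64_t rsp_after_long_mode_exit(uint64_t rsp)
{
    return (uint32_t)rsp;              /* keep only the low 32 bits */
}

int main(void)
{
    uint64_t rsp = 0x00007fffffffe000ULL;

    printf("before leaving long mode: 0x%016" PRIx64 "\n", rsp);
    printf("after leaving long mode:  0x%016" PRIx64 "\n",
           rsp_after_long_mode_exit(rsp));
    return 0;
}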



Re: How to tame CI?

2023-07-26 Thread Thomas Huth

On 26/07/2023 15.00, Peter Maydell wrote:

On Wed, 26 Jul 2023 at 13:06, Juan Quintela  wrote:

To make things easier, this is the part that show how it breaks (this is
the gcov test):

357/423 qemu:block / io-qcow2-copy-before-write
ERROR   6.38s   exit status 1

PYTHON=/builds/juan.quintela/qemu/build/pyvenv/bin/python3 MALLOC_PERTURB_=44 
/builds/juan.quintela/qemu/build/pyvenv/bin/python3 
/builds/juan.quintela/qemu/build/../tests/qemu-iotests/check -tap -qcow2 
copy-before-write --source-dir /builds/juan.quintela/qemu/tests/qemu-iotests 
--build-dir /builds/juan.quintela/qemu/build/tests/qemu-iotests

― ✀  ―
stderr:
--- /builds/juan.quintela/qemu/tests/qemu-iotests/tests/copy-before-write.out
+++ 
/builds/juan.quintela/qemu/build/scratch/qcow2-file-copy-before-write/copy-before-write.out.bad
@@ -1,5 +1,21 @@
-
+...F
+==
+FAIL: test_timeout_break_snapshot (__main__.TestCbwError)
+--
+Traceback (most recent call last):
+  File 
"/builds/juan.quintela/qemu/tests/qemu-iotests/tests/copy-before-write", line 
210, in test_timeout_break_snapshot
+self.assertEqual(log, """\
+AssertionError: 'wrot[195 chars]read 1048576/1048576 bytes at offset 0\n1 
MiB,[46 chars]c)\n' != 'wrot[195 chars]read failed: Permission denied\n'
+  wrote 524288/524288 bytes at offset 0
+  512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+  wrote 524288/524288 bytes at offset 524288
+  512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
++ read failed: Permission denied
+- read 1048576/1048576 bytes at offset 0
+- 1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+


This iotest failing is an intermittent that I've seen running
pullreqs on master. I tend to see it on the s390 host. I
suspect a race condition somewhere where it fails if the host
is heavily loaded.


It's obviously a failure in an iotest, so let's CC: the corresponding people 
(done now).


 Thomas




[PATCH 44/44] Append added properties to mailbox test

2023-07-26 Thread Sergey Kambalin
Signed-off-by: Sergey Kambalin 
---
 tests/qtest/bcm2838-mbox-property-test.c | 82 
 1 file changed, 82 insertions(+)

diff --git a/tests/qtest/bcm2838-mbox-property-test.c 
b/tests/qtest/bcm2838-mbox-property-test.c
index e833529a00..68e2b11db6 100644
--- a/tests/qtest/bcm2838-mbox-property-test.c
+++ b/tests/qtest/bcm2838-mbox-property-test.c
@@ -242,6 +242,12 @@ DECLARE_TEST_CASE_SETUP(GET_MIN_CLOCK_RATE, ANY) {
 tag->request.value.clock_id = CLOCK_ID_UNDEFINED;
 }
 
+/**/
+DECLARE_TEST_CASE(GET_CLOCKS) {
+g_assert_cmphex(tag->response.value.root_clock, ==, CLOCK_ID_ROOT);
+g_assert_cmphex(tag->response.value.arm_clock, ==, CLOCK_ID_ARM);
+}
+
 
/**/
 DECLARE_TEST_CASE(GET_TEMPERATURE) {
 g_assert_cmphex(tag->response.value.temperature_id, ==, 
TEMPERATURE_ID_SOC);
@@ -508,11 +514,31 @@ DECLARE_TEST_CASE(GET_COMMANDLINE) {
 /* No special checks are needed for this test case */
 }
 
+/**/
+DECLARE_TEST_CASE(GET_THROTTLED) {
+g_assert_cmpint(tag->response.value.throttled, ==, 0);
+}
+
 
/**/
 DECLARE_TEST_CASE(GET_NUM_DISPLAYS) {
 g_assert_cmpint(tag->response.value.num_displays, ==, 1);
 }
 
+/**/
+DECLARE_TEST_CASE(GET_DISPLAY_SETTINGS) {
+g_assert_cmpint(tag->response.value.display_num, ==, 0);
+g_assert_cmpint(tag->response.value.phys_width, ==, 800);
+g_assert_cmpint(tag->response.value.phys_height, ==, 600);
+g_assert_cmpint(tag->response.value.bpp, ==, 32);
+g_assert_cmpint(tag->response.value.pitch, ==, 32);
+g_assert_cmpint(tag->response.value.virt_width, ==, 0);
+g_assert_cmpint(tag->response.value.virt_height, ==, 0);
+g_assert_cmpint(tag->response.value.virt_width_offset, ==, 0);
+g_assert_cmpint(tag->response.value.virt_height_offset, ==, 0);
+g_assert_cmphex(tag->response.value.fb_bus_address_lo, ==, 0x);
+g_assert_cmphex(tag->response.value.fb_bus_address_hi, ==, 0x);
+}
+
 
/**/
 DECLARE_TEST_CASE(SET_PITCH) {
 /* No special checks are needed for this test case */
@@ -521,6 +547,54 @@ DECLARE_TEST_CASE_SETUP(SET_PITCH) {
 tag->request.value.pitch = DUMMY_VALUE;
 }
 
+/**/
+DECLARE_TEST_CASE(GET_GPIO_CONFIG) {
+g_assert_cmpint(tag->response.value.zero, ==, 0);
+g_assert_cmphex(tag->response.value.direction, ==, GPIO_DIRECTION_IN);
+g_assert_cmphex(tag->response.value.polarity, ==, GPIO_POLARITY_LOW);
+g_assert_cmphex(tag->response.value.term_en, ==, 
GPIO_TERMINATION_DISABLED);
+g_assert_cmphex(tag->response.value.term_pull_up, ==, 
GPIO_TERMINATION_PULLUP_DISABLED);
+}
+DECLARE_TEST_CASE_SETUP(GET_GPIO_CONFIG) {
+tag->request.value.gpio_num = GPIO_0;
+}
+
+/**/
+DECLARE_TEST_CASE(SET_GPIO_CONFIG) {
+g_assert_cmpint(tag->response.value.zero, ==, 0);
+}
+DECLARE_TEST_CASE_SETUP(SET_GPIO_CONFIG) {
+tag->request.value.gpio_num = GPIO_0;
+tag->request.value.direction = DUMMY_VALUE;
+tag->request.value.polarity = DUMMY_VALUE;
+tag->request.value.term_en = DUMMY_VALUE;
+tag->request.value.term_pull_up = DUMMY_VALUE;
+tag->request.value.state = DUMMY_VALUE;
+}
+
+/**/
+DECLARE_TEST_CASE(GET_GPIO_STATE) {
+g_assert_cmpint(tag->response.value.zero, ==, 0);
+g_assert_cmphex(tag->response.value.state, ==, GPIO_STATE_DOWN);
+}
+DECLARE_TEST_CASE_SETUP(GET_GPIO_STATE) {
+tag->request.value.gpio_num = GPIO_0;
+}
+
+/**/
+DECLARE_TEST_CASE(SET_GPIO_STATE) {
+g_assert_cmpint(tag->response.value.zero, ==, 0);
+}
+DECLARE_TEST_CASE_SETUP(SET_GPIO_STATE) {
+tag->request.value.gpio_num = GPIO_0;
+tag->request.value.state = GPIO_STATE_DOWN;
+}
+
+/**/
+DECLARE_TEST_CASE(INITIALIZE_VCHIQ) {
+g_assert_cmpint(tag->response.value.zero, ==, 0);
+}
+
 
/**/
 int main(int argc, char **argv)
 {
@@ -544,6 +618,7 @@ int main(int argc, char **argv)
 QTEST_ADD_TEST_CASE(GET_CLOCK_RATE, ANY);
 QTEST_ADD_TEST_CASE(GET_MAX_CLOCK_RATE, ANY);
 QTEST_ADD_TEST_CASE(GET_MIN_CLOCK_RATE, ANY);
+QTEST_ADD_TEST_CASE(GET_CLOCKS);
 QTEST_ADD_TEST_CASE(GET_TEMPERATURE);
 QTEST_ADD_TEST_CASE(GET_MAX_TEMPERATURE);
 

[PATCH v10 10/10] migration: modify test_multifd_tcp_none() to use new QAPI syntax.

2023-07-26 Thread Het Gala
Modify the multifd tcp common test to incorporate the newly
defined QAPI syntax.

Suggested-by: Aravind Retnakaran 
Signed-off-by: Het Gala 
Reviewed-by: Daniel P. Berrangé 
---
 tests/qtest/migration-test.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index e256da1216..376fad8311 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -2185,7 +2185,12 @@ test_migrate_precopy_tcp_multifd_start_common(QTestState 
*from,
 
 /* Start incoming migration from the 1st socket */
 qtest_qmp_assert_success(to, "{ 'execute': 'migrate-incoming',"
- "  'arguments': { 'uri': 'tcp:127.0.0.1:0' }}");
+ "  'arguments': { "
+ "  'channels': [ { 'channeltype': 'main',"
+ "  'addr': { 'transport': 'socket',"
+ "'type': 'inet',"
+ "'host': '127.0.0.1',"
+ "'port': '0' } } ] } }");
 
 return NULL;
 }
-- 
2.22.3




[PATCH v10 03/10] migration: convert socket backend to accept MigrateAddress

2023-07-26 Thread Het Gala
The socket transport backend for the 'migrate'/'migrate-incoming' QAPIs
now accepts the new wire protocol of the MigrateAddress struct.

This is achieved by parsing the 'uri' string and storing the migration
parameters required for the socket connection into a well-defined
SocketAddress struct.

Suggested-by: Aravind Retnakaran 
Signed-off-by: Het Gala 
Reviewed-by: Daniel P. Berrangé 
---
 migration/migration.c | 30 ++
 migration/socket.c| 39 +--
 migration/socket.h|  7 ---
 3 files changed, 31 insertions(+), 45 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 5b3be767f0..c4bcf8bbd7 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -481,18 +481,21 @@ static void qemu_start_incoming_migration(const char 
*uri, Error **errp)
 }
 
 qapi_event_send_migration(MIGRATION_STATUS_SETUP);
-if (strstart(uri, "tcp:", ) ||
-strstart(uri, "unix:", NULL) ||
-strstart(uri, "vsock:", NULL)) {
-socket_start_incoming_migration(p ? p : uri, errp);
+if (channel->transport == MIGRATION_ADDRESS_TYPE_SOCKET) {
+SocketAddress *saddr = &channel->u.socket;
+if (saddr->type == SOCKET_ADDRESS_TYPE_INET ||
+saddr->type == SOCKET_ADDRESS_TYPE_UNIX ||
+saddr->type == SOCKET_ADDRESS_TYPE_VSOCK) {
+socket_start_incoming_migration(saddr, errp);
+} else if (saddr->type == SOCKET_ADDRESS_TYPE_FD) {
+fd_start_incoming_migration(saddr->u.fd.str, errp);
+}
 #ifdef CONFIG_RDMA
 } else if (strstart(uri, "rdma:", )) {
 rdma_start_incoming_migration(p, errp);
 #endif
 } else if (strstart(uri, "exec:", )) {
 exec_start_incoming_migration(p, errp);
-} else if (strstart(uri, "fd:", )) {
-fd_start_incoming_migration(p, errp);
 } else {
 error_setg(errp, "unknown migration protocol: %s", uri);
 }
@@ -1715,18 +1718,21 @@ void qmp_migrate(const char *uri, bool has_blk, bool 
blk,
 }
 }
 
-if (strstart(uri, "tcp:", ) ||
-strstart(uri, "unix:", NULL) ||
-strstart(uri, "vsock:", NULL)) {
-socket_start_outgoing_migration(s, p ? p : uri, &local_err);
+if (channel->transport == MIGRATION_ADDRESS_TYPE_SOCKET) {
+SocketAddress *saddr = &channel->u.socket;
+if (saddr->type == SOCKET_ADDRESS_TYPE_INET ||
+saddr->type == SOCKET_ADDRESS_TYPE_UNIX ||
+saddr->type == SOCKET_ADDRESS_TYPE_VSOCK) {
+socket_start_outgoing_migration(s, saddr, &local_err);
+} else if (saddr->type == SOCKET_ADDRESS_TYPE_FD) {
+fd_start_outgoing_migration(s, saddr->u.fd.str, &local_err);
+}
 #ifdef CONFIG_RDMA
 } else if (strstart(uri, "rdma:", )) {
 rdma_start_outgoing_migration(s, p, _err);
 #endif
 } else if (strstart(uri, "exec:", )) {
 exec_start_outgoing_migration(s, p, _err);
-} else if (strstart(uri, "fd:", )) {
-fd_start_outgoing_migration(s, p, _err);
 } else {
 if (!resume_requested) {
 yank_unregister_instance(MIGRATION_YANK_INSTANCE);
diff --git a/migration/socket.c b/migration/socket.c
index 1b6f5baefb..98e3ea1514 100644
--- a/migration/socket.c
+++ b/migration/socket.c
@@ -28,6 +28,8 @@
 #include "trace.h"
 #include "postcopy-ram.h"
 #include "options.h"
+#include "qapi/clone-visitor.h"
+#include "qapi/qapi-visit-sockets.h"
 
 struct SocketOutgoingArgs {
 SocketAddress *saddr;
@@ -108,19 +110,19 @@ out:
 object_unref(OBJECT(sioc));
 }
 
-static void
-socket_start_outgoing_migration_internal(MigrationState *s,
- SocketAddress *saddr,
- Error **errp)
+void socket_start_outgoing_migration(MigrationState *s,
+ SocketAddress *saddr,
+ Error **errp)
 {
 QIOChannelSocket *sioc = qio_channel_socket_new();
 struct SocketConnectData *data = g_new0(struct SocketConnectData, 1);
+SocketAddress *addr = QAPI_CLONE(SocketAddress, saddr);
 
 data->s = s;
 
 /* in case previous migration leaked it */
 qapi_free_SocketAddress(outgoing_args.saddr);
-outgoing_args.saddr = saddr;
+outgoing_args.saddr = addr;
 
 if (saddr->type == SOCKET_ADDRESS_TYPE_INET) {
 data->hostname = g_strdup(saddr->u.inet.host);
@@ -135,18 +137,6 @@ socket_start_outgoing_migration_internal(MigrationState *s,
  NULL);
 }
 
-void socket_start_outgoing_migration(MigrationState *s,
- const char *str,
- Error **errp)
-{
-Error *err = NULL;
-SocketAddress *saddr = socket_parse(str, &err);
-if (!err) {
-socket_start_outgoing_migration_internal(s, saddr, &err);
-}
-error_propagate(errp, err);
-}
-
 static void socket_accept_incoming_migration(QIONetListener *listener,
  

[PATCH 27/44] Add GENET register access macros

2023-07-26 Thread Sergey Kambalin
Signed-off-by: Sergey Kambalin 
---
 include/hw/net/bcm2838_genet.h | 76 ++
 1 file changed, 76 insertions(+)

diff --git a/include/hw/net/bcm2838_genet.h b/include/hw/net/bcm2838_genet.h
index 4b549ed431..bfe5e3ab31 100644
--- a/include/hw/net/bcm2838_genet.h
+++ b/include/hw/net/bcm2838_genet.h
@@ -22,9 +22,85 @@ OBJECT_DECLARE_SIMPLE_TYPE(BCM2838GenetState, BCM2838_GENET)
 #define BCM2838_GENET_DMA_RING_CNT  17
 #define BCM2838_GENET_DMA_RING_DEFAULT  (BCM2838_GENET_DMA_RING_CNT - 1)
 
+#define BCM2838_GENET_HFB_FILTER_REGS offsetof(BCM2838GenetRegs, hfb)
+#define BCM2838_GENET_HFB_FILTER_REG(reg) (BCM2838_GENET_HFB_FILTER_REGS \
+   + offsetof(BCM2838GenetRegsHfb, 
reg))
 #define BCM2838_GENET_HFB_FILTER_CNT  48
 #define BCM2838_GENET_HFB_FILTER_SIZE 128
 
+#define BCM2838_GENET_INTRL0_REG(reg)   (offsetof(BCM2838GenetRegs, intrl0) \
++ offsetof(BCM2838GenetRegsIntrl0, 
reg))
+#define BCM2838_GENET_INTRL0_SETBCM2838_GENET_INTRL0_REG(set)
+#define BCM2838_GENET_INTRL0_CLEAR  BCM2838_GENET_INTRL0_REG(clear)
+#define BCM2838_GENET_INTRL0_MASK_SET   BCM2838_GENET_INTRL0_REG(mask_set)
+#define BCM2838_GENET_INTRL0_MASK_CLEAR BCM2838_GENET_INTRL0_REG(mask_clear)
+
+#define BCM2838_GENET_INTRL1_REG(reg)   (offsetof(BCM2838GenetRegs, intrl1) \
++ offsetof(BCM2838GenetRegsIntrl1, 
reg))
+#define BCM2838_GENET_INTRL1_SETBCM2838_GENET_INTRL1_REG(set)
+#define BCM2838_GENET_INTRL1_CLEAR  BCM2838_GENET_INTRL1_REG(clear)
+#define BCM2838_GENET_INTRL1_MASK_SET   BCM2838_GENET_INTRL1_REG(mask_set)
+#define BCM2838_GENET_INTRL1_MASK_CLEAR BCM2838_GENET_INTRL1_REG(mask_clear)
+
+#define BCM2838_GENET_UMAC_REG(reg) (offsetof(BCM2838GenetRegs, umac) \
+ + offsetof(BCM2838GenetRegsUmac, reg))
+#define BCM2838_GENET_UMAC_CMD  BCM2838_GENET_UMAC_REG(cmd)
+#define BCM2838_GENET_UMAC_MAC0 BCM2838_GENET_UMAC_REG(mac0)
+#define BCM2838_GENET_UMAC_MAC1 BCM2838_GENET_UMAC_REG(mac1)
+#define BCM2838_GENET_UMAC_MDIO_CMD BCM2838_GENET_UMAC_REG(mdio_cmd)
+
+#define BCM2838_GENET_TDMA_REGS offsetof(BCM2838GenetRegs, tdma)
+#define BCM2838_GENET_TDMA_REG(reg) (BCM2838_GENET_TDMA_REGS \
+ + offsetof(BCM2838GenetRegsTdma, reg))
+#define BCM2838_GENET_TDMA_RINGSBCM2838_GENET_TDMA_REG(rings)
+#define BCM2838_GENET_TDMA_RING_CFG BCM2838_GENET_TDMA_REG(ring_cfg)
+#define BCM2838_GENET_TDMA_CTRL BCM2838_GENET_TDMA_REG(ctrl)
+
+#define BCM2838_GENET_RDMA_REGS offsetof(BCM2838GenetRegs, rdma)
+#define BCM2838_GENET_RDMA_REG(reg) (BCM2838_GENET_RDMA_REGS \
+ + offsetof(BCM2838GenetRegsRdma, reg))
+#define BCM2838_GENET_RDMA_RINGSBCM2838_GENET_RDMA_REG(rings)
+#define BCM2838_GENET_RDMA_RING_CFG BCM2838_GENET_RDMA_REG(ring_cfg)
+#define BCM2838_GENET_RDMA_CTRL BCM2838_GENET_RDMA_REG(ctrl)
+
+#define BCM2838_GENET_TRING_REG(reg)offsetof(BCM2838GenetTdmaRing, reg)
+#define BCM2838_GENET_TRING_WRITE_PTR BCM2838_GENET_TRING_REG(write_ptr)
+#define BCM2838_GENET_TRING_WRITE_PTR_HI BCM2838_GENET_TRING_REG(write_ptr_hi)
+#define BCM2838_GENET_TRING_PROD_INDEX BCM2838_GENET_TRING_REG(prod_index)
+#define BCM2838_GENET_TRING_CONS_INDEX BCM2838_GENET_TRING_REG(cons_index)
+#define BCM2838_GENET_TRING_RING_BUF_SIZE 
BCM2838_GENET_TRING_REG(ring_buf_size)
+#define BCM2838_GENET_TRING_RING_START_ADDR BCM2838_GENET_TRING_REG(start_addr)
+#define BCM2838_GENET_TRING_RING_START_ADDR_HI 
BCM2838_GENET_TRING_REG(start_addr_hi)
+#define BCM2838_GENET_TRING_RING_END_ADDR BCM2838_GENET_TRING_REG(end_addr)
+#define BCM2838_GENET_TRING_RING_END_ADDR_HI 
BCM2838_GENET_TRING_REG(end_addr_hi)
+#define BCM2838_GENET_TRING_RING_MBUF_DONE_TRESH 
BCM2838_GENET_TRING_REG(mbuf_done_tresh)
+#define BCM2838_GENET_TRING_RING_FLOW_PERIOD 
BCM2838_GENET_TRING_REG(flow_period)
+#define BCM2838_GENET_TRING_RING_READ_PTR BCM2838_GENET_TRING_REG(read_ptr)
+#define BCM2838_GENET_TRING_RING_READ_PTR_HI 
BCM2838_GENET_TRING_REG(read_ptr_hi)
+
+#define BCM2838_GENET_RRING_REG(reg)offsetof(BCM2838GenetRdmaRing, reg)
+#define BCM2838_GENET_RRING_WRITE_PTR BCM2838_GENET_RRING_REG(write_ptr)
+#define BCM2838_GENET_RRING_WRITE_PTR_HI BCM2838_GENET_RRING_REG(write_ptr_hi)
+#define BCM2838_GENET_RRING_PROD_INDEX BCM2838_GENET_RRING_REG(prod_index)
+#define BCM2838_GENET_RRING_CONS_INDEX BCM2838_GENET_RRING_REG(cons_index)
+#define BCM2838_GENET_RRING_RING_BUF_SIZE 
BCM2838_GENET_RRING_REG(ring_buf_size)
+#define BCM2838_GENET_RRING_RING_START_ADDR BCM2838_GENET_RRING_REG(start_addr)
+#define BCM2838_GENET_RRING_RING_START_ADDR_HI 
BCM2838_GENET_RRING_REG(start_addr_hi)
+#define BCM2838_GENET_RRING_RING_END_ADDR BCM2838_GENET_RRING_REG(end_addr)
+#define BCM2838_GENET_RRING_RING_END_ADDR_HI 

Re: [PATCH] migration: Allow user to specify migration available bandwidth

2023-07-26 Thread Peter Xu
On Wed, Jul 26, 2023 at 08:21:35AM +0200, Markus Armbruster wrote:
> Peter Xu  writes:
> 
> > Hi, Markus,
> >
> > On Tue, Jul 25, 2023 at 01:10:01PM +0200, Markus Armbruster wrote:
> 
> [...]
> 
> >> For better or worse, we duplicate full documentation between
> >> MigrationParameter, MigrateSetParameters, and MigrationParameters.  This
> >> would be the first instance where we reference instead.  I'm not opposed
> >> to use references, but if we do, I want them used consistently.
> >
> > We discussed this over the other "switchover" parameter, but that patchset
> > just stranded..
> >
> > Perhaps I just provide a pre-requisite patch to remove all the comments in
> > MigrateSetParameters and MigrationParameters, letting them all point to
> > MigrationParameter?
> 
> Simplifies maintaining the doc commments.  But how does it affect the
> documentation generated from it?  Better, neutral, or worse?

Probably somewhere neutral.  There are definitely benefits: a shorter
man/html page in total, and avoiding accidentally divergent docs for the same
fields.  E.g., we sometimes have different wordings for different objects:

   max-cpu-throttle
  maximum cpu throttle percentage.  Defaults to 99.  (Since 3.1)

   max-cpu-throttle: int (optional)
  maximum cpu throttle percentage.  The default value is 99. (Since 
3.1)

This one is fine, but it's just very easy to let in something that reads
differently.  It's good to provide coherent documentation for the same
fields across all three objects.

When looking at qemu-qmp-ref.7, it could look like this, where we invite the
reader to read the other section (assuming we only keep the documentation in
MigrationParameter):

   MigrationParameters (Object)

   The object structure to represent a list of migration parameters.
   The optional members aren't actually optional.  For detailed
   explanation for each of the field, please refer to the documentation
   of MigrationParameter.

But the problem is we will always generate the Members entry, which is now
all filled up with "Not documented"..

   Members
   announce-initial: int (optional)
  Not documented

   announce-max: int (optional)
  Not documented

   announce-rounds: int (optional)
  Not documented

   [...]

I think maybe it's better if we just list the members without showing "Not
documented" every time for the other two objects.  I'm not sure whether there
is an easy way to fix that, or whether it's a major issue.

For developers, deduplicating the comments should always be a win, afaict.

Thanks,

-- 
Peter Xu




Re: [PATCH v10 09/10] migration: Implement MigrateChannelList to hmp migration flow.

2023-07-26 Thread Daniel P . Berrangé
On Wed, Jul 26, 2023 at 02:18:32PM +, Het Gala wrote:
> Integrate MigrateChannelList with all transport backends
> (socket, exec and rdma) for both src and dest migration
> endpoints for hmp migration.
> 
> Suggested-by: Aravind Retnakaran 
> Signed-off-by: Het Gala 
> ---
>  migration/migration-hmp-cmds.c | 15 +--
>  migration/migration.c  |  5 ++---
>  migration/migration.h  |  3 ++-
>  3 files changed, 17 insertions(+), 6 deletions(-)

Reviewed-by: Daniel P. Berrangé 


With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




[PATCH 26/44] Add GENET register structs. Part 4

2023-07-26 Thread Sergey Kambalin
Signed-off-by: Sergey Kambalin 
---
 include/hw/net/bcm2838_genet.h | 40 ++
 1 file changed, 40 insertions(+)

diff --git a/include/hw/net/bcm2838_genet.h b/include/hw/net/bcm2838_genet.h
index 4cf70a17d3..4b549ed431 100644
--- a/include/hw/net/bcm2838_genet.h
+++ b/include/hw/net/bcm2838_genet.h
@@ -25,6 +25,15 @@ OBJECT_DECLARE_SIMPLE_TYPE(BCM2838GenetState, BCM2838_GENET)
 #define BCM2838_GENET_HFB_FILTER_CNT  48
 #define BCM2838_GENET_HFB_FILTER_SIZE 128
 
+#define BCM2838_GENET_PHY_AUX_CTL_MISC  0x7
+#define BCM2838_GENET_PHY_AUX_CTL_REGS_SIZE 8
+
+#define SIZEOF_FIELD(type, field)  sizeof(((type*) 0)->field)
+#define BCM2838_GENET_PHY_EXP_SHD_BLOCKS_CNT \
+(1u << (8 * SIZEOF_FIELD(BCM2838GenetPhyExpSel, block_id)))
+#define BCM2838_GENET_PHY_EXP_SHD_REGS_CNT \
+(1u << (8 * SIZEOF_FIELD(BCM2838GenetPhyExpSel, reg_id)))
+
 typedef union {
 uint32_t value;
 struct {
@@ -568,6 +577,34 @@ typedef struct {
 uint16_t rdb_data;
 } __attribute__((__packed__)) BCM2838GenetPhyRegs;
 
+typedef struct {
+uint16_t reserved_0_2[3];
+uint16_t clk_ctl;
+uint16_t scr2;
+uint16_t scr3;
+uint16_t reserved_6_9[4];
+uint16_t apd;
+uint16_t rgmii_mode;
+uint16_t reserved_12;
+uint16_t leds1;
+uint16_t reserved_14_18[5];
+uint16_t _100fx_ctrl;
+uint16_t ssd;
+uint16_t reserved_21_30[10];
+uint16_t mode;
+} __attribute__((__packed__)) BCM2838GenetPhyShdRegs;
+
+typedef struct {
+uint16_t auxctl;
+uint16_t reserved_1_6[BCM2838_GENET_PHY_AUX_CTL_REGS_SIZE - 2];
+uint16_t misc;
+} __attribute__((__packed__)) BCM2838GenetPhyAuxShdRegs;
+
+typedef struct {
+uint16_t regs[BCM2838_GENET_PHY_EXP_SHD_BLOCKS_CNT]
+ [BCM2838_GENET_PHY_EXP_SHD_REGS_CNT];
+} __attribute__((__packed__)) BCM2838GenetPhyExpShdRegs;
+
 struct BCM2838GenetState {
 /*< private >*/
 SysBusDevice parent_obj;
@@ -579,6 +616,9 @@ struct BCM2838GenetState {
 
 BCM2838GenetRegs regs;
 BCM2838GenetPhyRegs phy_regs;
+BCM2838GenetPhyShdRegs phy_shd_regs;
+BCM2838GenetPhyAuxShdRegs phy_aux_ctl_shd_regs;
+BCM2838GenetPhyExpShdRegs phy_exp_shd_regs;
 
 qemu_irq irq_default;
 qemu_irq irq_prio;
-- 
2.34.1




[PATCH 18/44] Add RNG200 RNG and RBG

2023-07-26 Thread Sergey Kambalin
Signed-off-by: Sergey Kambalin 
---
 hw/misc/bcm2838_rng200.c | 218 ++-
 1 file changed, 217 insertions(+), 1 deletion(-)

diff --git a/hw/misc/bcm2838_rng200.c b/hw/misc/bcm2838_rng200.c
index a17e8f2cda..bfc40658e2 100644
--- a/hw/misc/bcm2838_rng200.c
+++ b/hw/misc/bcm2838_rng200.c
@@ -8,23 +8,194 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/log.h"
 #include "qapi/error.h"
 #include "hw/qdev-properties.h"
 #include "hw/misc/bcm2838_rng200.h"
 #include "trace.h"
 
+#define RNG_CTRL_OFFSET  0x00
+#define RNG_SOFT_RESET   0x01
+#define RNG_SOFT_RESET_OFFSET0x04
+#define RBG_SOFT_RESET_OFFSET0x08
+#define RNG_TOTAL_BIT_COUNT_OFFSET   0x0C
+#define RNG_TOTAL_BIT_COUNT_THRESHOLD_OFFSET 0x10
+#define RNG_INT_STATUS_OFFSET0x18
+#define RNG_INT_ENABLE_OFFSET0x1C
+#define RNG_FIFO_DATA_OFFSET 0x20
+#define RNG_FIFO_COUNT_OFFSET0x24
+
+#define RNG_WARM_UP_PERIOD_ELAPSED   17
+
+#define BCM2838_RNG200_PTIMER_POLICY (PTIMER_POLICY_CONTINUOUS_TRIGGER)
+
+static void bcm2838_rng200_update_irq(BCM2838Rng200State *state)
+{
+qemu_set_irq(state->irq, !!(state->rng_int_enable.value
+  & state->rng_int_status.value));
+}
+
+static void bcm2838_rng200_update_fifo(void *opaque, const void *buf,
+   size_t size)
+{
+BCM2838Rng200State *state = (BCM2838Rng200State *)opaque;
+Fifo8 *fifo = &state->fifo;
+size_t num = MIN(size, fifo8_num_free(fifo));
+uint32_t num_bits = num * 8;
+uint32_t bit_threshold_left = 0;
+
+state->rng_total_bit_count += num_bits;
+if (state->rng_bit_count_threshold > state->rng_total_bit_count) {
+bit_threshold_left =
+state->rng_bit_count_threshold - state->rng_total_bit_count;
+} else {
+bit_threshold_left = 0;
+}
+
+if (bit_threshold_left < num_bits) {
+num_bits -= bit_threshold_left;
+} else {
+num_bits = 0;
+}
+
+num = num_bits / 8;
+if ((num == 0) && (num_bits > 0)) {
+num = 1;
+}
+if (num > 0) {
+fifo8_push_all(fifo, buf, num);
+
+if (fifo8_num_used(fifo) > state->rng_fifo_count.thld) {
+state->rng_int_status.total_bits_count_irq = 1;
+}
+}
+
+state->rng_fifo_count.count = fifo8_num_used(fifo) >> 2;
+bcm2838_rng200_update_irq(state);
+trace_bcm2838_rng200_update_fifo(num, fifo8_num_used(fifo));
+}
+
+static void bcm2838_rng200_fill_fifo(BCM2838Rng200State *state)
+{
+rng_backend_request_entropy(state->rng,
+fifo8_num_free(&state->fifo),
+bcm2838_rng200_update_fifo, state);
+}
+
+/* state is temporary unused */
+static void bcm2838_rng200_disable_rbg(BCM2838Rng200State *state
+   __attribute__((unused)))
+{
+trace_bcm2838_rng200_disable_rbg();
+}
+
+static void bcm2838_rng200_enable_rbg(BCM2838Rng200State *state)
+{
+state->rng_total_bit_count = RNG_WARM_UP_PERIOD_ELAPSED;
+
+bcm2838_rng200_fill_fifo(state);
+
+trace_bcm2838_rng200_enable_rbg();
+}
+
 static void bcm2838_rng200_rng_reset(BCM2838Rng200State *state)
 {
 state->rng_ctrl.value = 0;
+state->rng_total_bit_count = 0;
+state->rng_bit_count_threshold = 0;
+state->rng_fifo_count.value = 0;
+state->rng_int_status.value = 0;
+state->rng_int_status.startup_transition_met_irq = 1;
+state->rng_int_enable.value = 0;
+fifo8_reset(&state->fifo);
 
 trace_bcm2838_rng200_rng_soft_reset();
 }
 
+static void bcm2838_rng200_rbg_reset(BCM2838Rng200State *state)
+{
+trace_bcm2838_rng200_rbg_soft_reset();
+}
+
+static uint32_t bcm2838_rng200_read_fifo_data(BCM2838Rng200State *state)
+{
+Fifo8 *fifo = &state->fifo;
+const uint8_t *buf;
+uint32_t ret = 0;
+uint32_t num = 0;
+uint32_t max = MIN(fifo8_num_used(fifo), sizeof(ret));
+
+if (max > 0) {
+buf = fifo8_pop_buf(fifo, max, &num);
+if ((buf != NULL) && (num > 0)) {
+memcpy(&ret, buf, num);
+}
+} else {
+qemu_log_mask(
+LOG_GUEST_ERROR,
+"bcm2838_rng200_read_fifo_data: FIFO is empty\n"
+);
+}
+
+state->rng_fifo_count.count = fifo8_num_used(fifo) >> 2;
+bcm2838_rng200_fill_fifo(state);
+
+return ret;
+}
+
+static void bcm2838_rng200_ctrl_write(BCM2838Rng200State *s, uint64_t value)
+{
+bool rng_enable = s->rng_ctrl.rbg_enable;
+
+s->rng_ctrl.value = value;
+if (!s->rng_ctrl.rbg_enable && rng_enable) {
+bcm2838_rng200_disable_rbg(s);
+} else if (s->rng_ctrl.rbg_enable && !rng_enable) {
+bcm2838_rng200_enable_rbg(s);
+}
+}
+
 static uint64_t bcm2838_rng200_read(void *opaque, hwaddr offset,
 unsigned size)
 {
+BCM2838Rng200State *s = (BCM2838Rng200State *)opaque;
  

Re: [PATCH v10 06/10] migration: New migrate and migrate-incoming argument 'channels'

2023-07-26 Thread Daniel P . Berrangé
On Wed, Jul 26, 2023 at 02:18:29PM +, Het Gala wrote:
> MigrateChannelList allows connecting across multiple interfaces.
> Add MigrateChannelList struct as argument to migration QAPIs.
> 
> We plan to include multiple channels in future, to connect
> multiple interfaces. Hence, we choose 'MigrateChannelList'
> as the new argument over 'MigrateChannel' to make migration
> QAPIs future proof.
> 
> Suggested-by: Aravind Retnakaran 
> Signed-off-by: Het Gala 
> Acked-by: Markus Armbruster 
> ---
>  migration/migration-hmp-cmds.c |   6 +-
>  migration/migration.c  |  56 +++--
>  qapi/migration.json| 109 -
>  softmmu/vl.c   |   2 +-
>  4 files changed, 161 insertions(+), 12 deletions(-)

Reviewed-by: Daniel P. Berrangé 


With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




[PATCH v10 07/10] migration: modify migration_channels_and_uri_compatible() for new QAPI syntax

2023-07-26 Thread Het Gala
The migration_channels_and_uri_compatible() check for a transport mechanism
suitable for multifd migration is only executed when the caller uses the old
uri syntax. It also needs to run when the modern MigrateChannel QAPI syntax
is used.

After the URI -> 'MigrateChannel' conversion,
migration_channels_and_uri_compatible() becomes
migration_channels_and_transport_compatible(), which takes the address
object as an argument and checks for a valid transport mechanism.

Suggested-by: Aravind Retnakaran 
Signed-off-by: Het Gala 
Reviewed-by: Daniel P. Berrangé 
---
 migration/migration.c | 25 ++---
 1 file changed, 14 insertions(+), 11 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index be8cdf0fa5..5720c8ed4c 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -104,17 +104,20 @@ static bool migration_needs_multiple_sockets(void)
 return migrate_multifd() || migrate_postcopy_preempt();
 }
 
-static bool uri_supports_multi_channels(const char *uri)
+static bool transport_supports_multi_channels(SocketAddress *saddr)
 {
-return strstart(uri, "tcp:", NULL) || strstart(uri, "unix:", NULL) ||
-   strstart(uri, "vsock:", NULL);
+return saddr->type == SOCKET_ADDRESS_TYPE_INET ||
+   saddr->type == SOCKET_ADDRESS_TYPE_UNIX ||
+   saddr->type == SOCKET_ADDRESS_TYPE_VSOCK;
 }
 
 static bool
-migration_channels_and_uri_compatible(const char *uri, Error **errp)
+migration_channels_and_transport_compatible(MigrationAddress *addr,
+Error **errp)
 {
 if (migration_needs_multiple_sockets() &&
-!uri_supports_multi_channels(uri)) {
+(addr->transport == MIGRATION_ADDRESS_TYPE_SOCKET) &&
+!transport_supports_multi_channels(&addr->u.socket)) {
 error_setg(errp, "Migration requires multi-channel URIs (e.g. tcp)");
 return false;
 }
@@ -493,12 +496,12 @@ static void qemu_start_incoming_migration(const char 
*uri, bool has_channels,
 return;
 }
 
-/* URI is not suitable for migration? */
-if (!migration_channels_and_uri_compatible(uri, errp)) {
+if (uri && !migrate_uri_parse(uri, &channel, errp)) {
 return;
 }
 
-if (uri && !migrate_uri_parse(uri, &channel, errp)) {
+/* transport mechanism not suitable for migration? */
+if (!migration_channels_and_transport_compatible(channel, errp)) {
 return;
 }
 
@@ -1740,12 +1743,12 @@ void qmp_migrate(const char *uri, bool has_channels,
 return;
 }
 
-/* URI is not suitable for migration? */
-if (!migration_channels_and_uri_compatible(uri, errp)) {
+if (!migrate_uri_parse(uri, &channel, errp)) {
 return;
 }
 
-if (!migrate_uri_parse(uri, &channel, errp)) {
+/* transport mechanism not suitable for migration? */
+if (!migration_channels_and_transport_compatible(channel, errp)) {
 return;
 }
 
-- 
2.22.3




[PATCH v10 00/10] migration: Modify 'migrate' and 'migrate-incoming' QAPI commands for migration

2023-07-26 Thread Het Gala
This is the v10 patchset of the modified 'migrate' and 'migrate-incoming'
QAPI design, for upstream review.

I would like to thank all the maintainers who actively participated in the v9
patchset discussion and gave insightful suggestions to improve the patches.


Link to previous upstream community patchset links:
v1: https://lists.gnu.org/archive/html/qemu-devel/2022-12/msg04339.html
v2: https://lists.gnu.org/archive/html/qemu-devel/2023-02/msg02106.html
v3: https://lists.gnu.org/archive/html/qemu-devel/2023-02/msg02473.html
v4: https://lists.gnu.org/archive/html/qemu-devel/2023-05/msg03064.html
v5: https://lists.gnu.org/archive/html/qemu-devel/2023-05/msg04845.html
v6: https://lists.gnu.org/archive/html/qemu-devel/2023-06/msg01251.html
v7: https://lists.gnu.org/archive/html/qemu-devel/2023-07/msg02027.html
v8: https://lists.gnu.org/archive/html/qemu-devel/2023-07/msg02770.html
v9: https://lists.gnu.org/archive/html/qemu-devel/2023-07/msg04216.html

v9 -> v10 changelog:
---
- Patch6 : Added extra checks for migration arguments.
- Patch8 : Added checks for 'uri' and 'channels' both not present.
- Patch9 : Missed adding hmp_handle_error call to print error messages.
Abstract:
-

The current QAPI 'migrate' command design (for initiating a migration
stream) encodes the migration transport mechanism (tcp / unix / exec), the
destination host IP address, and the binding port number in the form of
a single string. This design has some issues, stated below:

1. Use of string URIs is a data encoding scheme within a data encoding scheme.
   QEMU code should be able to work directly with the results from QAPI,
   without resorting to a second level of parsing (e.g. socket_parse()).
2. For features / parameters related to migration, the migration tunables need
   to be defined and updated upfront. For example, 'migrate-set-capability'
   and 'migrate-set-parameter' are required to enable the multifd capability
   and the number of multifd channels respectively. Instead, 'multifd-channels'
   could directly be represented as a single additional parameter to the
   'migrate' QAPI. The 'migrate-set-capability' and 'migrate-set-parameter'
   commands could then be used for runtime tunables that need setting after
   migration has already started.

The current patchset focuses on solving the first problem, the multi-level
encoding of URIs. It defines the 'migrate' command as a QAPI discriminated
union over the various transport backends (socket, exec and rdma), and, on
the basis of the transport backend, different migration parameters are
defined.

(uri) string -->  (channel) Channel-type
Transport-type
Migration parameters based on transport type
--
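
As a reading aid, a minimal sketch of what the first fix buys at the C level,
condensed from the converted qmp_migrate() hunk in patch 3/10
(dispatch_outgoing() is a made-up wrapper name and this is not a complete or
compilable excerpt of the series): the backend dispatch consumes the QAPI
union directly, with no second level of string parsing.

static void dispatch_outgoing(MigrationState *s, MigrationAddress *addr,
                              Error **errp)
{
    if (addr->transport == MIGRATION_ADDRESS_TYPE_SOCKET) {
        SocketAddress *saddr = &addr->u.socket;

        if (saddr->type == SOCKET_ADDRESS_TYPE_FD) {
            /* the fd case reuses the string member of SocketAddress */
            fd_start_outgoing_migration(s, saddr->u.fd.str, errp);
        } else {
            /* inet / unix / vsock are handled by the socket backend */
            socket_start_outgoing_migration(s, saddr, errp);
        }
    } else {
        /* exec and rdma follow the same pattern in their own patches */
        error_setg(errp, "transport not covered by this sketch");
    }
}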

Het Gala (10):
  migration: New QAPI type 'MigrateAddress'
  migration: convert migration 'uri' into 'MigrateAddress'
  migration: convert socket backend to accept MigrateAddress
  migration: convert rdma backend to accept MigrateAddress
  migration: convert exec backend to accept MigrateAddress.
  migration: New migrate and migrate-incoming argument 'channels'
  migration: modify migration_channels_and_uri_compatible() for new QAPI
syntax
  migration: Implement MigrateChannelList to qmp migration flow.
  migration: Implement MigrateChannelList to hmp migration flow.
  migration: modify test_multifd_tcp_none() to use new QAPI syntax.

 migration/exec.c   |  72 +
 migration/exec.h   |   8 +-
 migration/migration-hmp-cmds.c |  17 ++-
 migration/migration.c  | 190 ++---
 migration/migration.h  |   3 +-
 migration/rdma.c   |  34 +++---
 migration/rdma.h   |   6 +-
 migration/socket.c |  39 ++-
 migration/socket.h |   7 +-
 qapi/migration.json| 150 +-
 softmmu/vl.c   |   2 +-
 tests/qtest/migration-test.c   |   7 +-
 12 files changed, 409 insertions(+), 126 deletions(-)

-- 
2.22.3




Re: How to tame CI?

2023-07-26 Thread Daniel P . Berrangé
On Wed, Jul 26, 2023 at 02:00:03PM +0100, Peter Maydell wrote:
> On Wed, 26 Jul 2023 at 13:06, Juan Quintela  wrote:
> > To make things easier, this is the part that show how it breaks (this is
> > the gcov test):
> >
> > 357/423 qemu:block / io-qcow2-copy-before-write
> > ERROR   6.38s   exit status 1
> > >>> PYTHON=/builds/juan.quintela/qemu/build/pyvenv/bin/python3 
> > >>> MALLOC_PERTURB_=44 /builds/juan.quintela/qemu/build/pyvenv/bin/python3 
> > >>> /builds/juan.quintela/qemu/build/../tests/qemu-iotests/check -tap 
> > >>> -qcow2 copy-before-write --source-dir 
> > >>> /builds/juan.quintela/qemu/tests/qemu-iotests --build-dir 
> > >>> /builds/juan.quintela/qemu/build/tests/qemu-iotests
> > ― ✀  
> > ―
> > stderr:
> > --- 
> > /builds/juan.quintela/qemu/tests/qemu-iotests/tests/copy-before-write.out
> > +++ 
> > /builds/juan.quintela/qemu/build/scratch/qcow2-file-copy-before-write/copy-before-write.out.bad
> > @@ -1,5 +1,21 @@
> > -
> > +...F
> > +==
> > +FAIL: test_timeout_break_snapshot (__main__.TestCbwError)
> > +--
> > +Traceback (most recent call last):
> > +  File 
> > "/builds/juan.quintela/qemu/tests/qemu-iotests/tests/copy-before-write", 
> > line 210, in test_timeout_break_snapshot
> > +self.assertEqual(log, """\
> > +AssertionError: 'wrot[195 chars]read 1048576/1048576 bytes at offset 0\n1 
> > MiB,[46 chars]c)\n' != 'wrot[195 chars]read failed: Permission denied\n'
> > +  wrote 524288/524288 bytes at offset 0
> > +  512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> > +  wrote 524288/524288 bytes at offset 524288
> > +  512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> > ++ read failed: Permission denied
> > +- read 1048576/1048576 bytes at offset 0
> > +- 1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> > +
> 
> This iotest failing is an intermittent that I've seen running
> pullreqs on master. I tend to see it on the s390 host. I
> suspect a race condition somewhere where it fails if the host
> is heavily loaded.

Since it is known flakey, we should just commit the change

--- a/tests/qemu-iotests/tests/copy-before-write
+++ b/tests/qemu-iotests/tests/copy-before-write
@@ -1,5 +1,5 @@
 #!/usr/bin/env python3
-# group: auto backup
+# group: backup
 #
 # Copyright (c) 2022 Virtuozzo International GmbH
 #


and if someone wants to re-enable it, they get the job of fixing its
reliability first.

With regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




[PATCH v10 01/10] migration: New QAPI type 'MigrateAddress'

2023-07-26 Thread Het Gala
This patch introduces a well-defined MigrateAddress struct
and its related child objects.

The existing argument of the 'migrate' and 'migrate-incoming' QAPIs,
'uri', is of type string. The current implementation follows a
double encoding scheme for fetching migration parameters like
'uri', and this is not an ideal design.

The motive for introducing a struct-level design is to prevent double
encoding of QAPI arguments, as QEMU should be able to directly
use the QAPI arguments without any extra level of encoding.

Note: this commit only adds the type; actual uses come
in later commits.

Suggested-by: Aravind Retnakaran 
Signed-off-by: Het Gala 
Reviewed-by: Juan Quintela 
Reviewed-by: Daniel P. Berrangé 
Acked-by: Markus Armbruster 
---
 qapi/migration.json | 41 +
 1 file changed, 41 insertions(+)

diff --git a/qapi/migration.json b/qapi/migration.json
index 2a6565a0f8..6b97ce9633 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -1417,6 +1417,47 @@
 ##
 { 'command': 'migrate-continue', 'data': {'state': 'MigrationStatus'} }
 
+##
+# @MigrationAddressType:
+#
+# The migration stream transport mechanisms.
+#
+# @socket: Migrate via socket.
+#
+# @exec: Direct the migration stream to another process.
+#
+# @rdma: Migrate via RDMA.
+#
+# Since 8.2
+##
+{ 'enum': 'MigrationAddressType',
+  'data': ['socket', 'exec', 'rdma'] }
+
+##
+# @MigrationExecCommand:
+#
+# @args: command (list head) and arguments to execute.
+#
+# Since 8.2
+##
+{ 'struct': 'MigrationExecCommand',
+  'data': {'args': [ 'str' ] } }
+
+##
+# @MigrationAddress:
+#
+# Migration endpoint configuration.
+#
+# Since 8.2
+##
+{ 'union': 'MigrationAddress',
+  'base': { 'transport' : 'MigrationAddressType'},
+  'discriminator': 'transport',
+  'data': {
+'socket': 'SocketAddress',
+'exec': 'MigrationExecCommand',
+'rdma': 'InetSocketAddress' } }
+
 ##
 # @migrate:
 #
-- 
2.22.3




Re: How to tame CI?

2023-07-26 Thread Daniel P . Berrangé
On Wed, Jul 26, 2023 at 04:36:32PM +0200, Juan Quintela wrote:
> Daniel P. Berrangé  wrote:
> > On Wed, Jul 26, 2023 at 02:00:03PM +0100, Peter Maydell wrote:
> >> On Wed, 26 Jul 2023 at 13:06, Juan Quintela  wrote:
> >> > To make things easier, this is the part that show how it breaks (this is
> >> > the gcov test):
> >> >
> >> > 357/423 qemu:block / io-qcow2-copy-before-write  
> >> >   ERROR   6.38s   exit status 1
> >> > >>> PYTHON=/builds/juan.quintela/qemu/build/pyvenv/bin/python3
> >> > MALLOC_PERTURB_=44
> >> > /builds/juan.quintela/qemu/build/pyvenv/bin/python3
> >> > /builds/juan.quintela/qemu/build/../tests/qemu-iotests/check -tap
> >> > -qcow2 copy-before-write --source-dir
> >> > /builds/juan.quintela/qemu/tests/qemu-iotests --build-dir
> >> > /builds/juan.quintela/qemu/build/tests/qemu-iotests
> >> > ― ✀  
> >> > ―
> >> > stderr:
> >> > --- 
> >> > /builds/juan.quintela/qemu/tests/qemu-iotests/tests/copy-before-write.out
> >> > +++ 
> >> > /builds/juan.quintela/qemu/build/scratch/qcow2-file-copy-before-write/copy-before-write.out.bad
> >> > @@ -1,5 +1,21 @@
> >> > -
> >> > +...F
> >> > +==
> >> > +FAIL: test_timeout_break_snapshot (__main__.TestCbwError)
> >> > +--
> >> > +Traceback (most recent call last):
> >> > +  File 
> >> > "/builds/juan.quintela/qemu/tests/qemu-iotests/tests/copy-before-write", 
> >> > line 210, in test_timeout_break_snapshot
> >> > +self.assertEqual(log, """\
> >> > +AssertionError: 'wrot[195 chars]read 1048576/1048576 bytes at
> >> > offset 0\n1 MiB,[46 chars]c)\n' != 'wrot[195 chars]read failed:
> >> > Permission denied\n'
> >> > +  wrote 524288/524288 bytes at offset 0
> >> > +  512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> >> > +  wrote 524288/524288 bytes at offset 524288
> >> > +  512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> >> > ++ read failed: Permission denied
> >> > +- read 1048576/1048576 bytes at offset 0
> >> > +- 1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> >> > +
> >> 
> >> This iotest failing is an intermittent that I've seen running
> >> pullreqs on master. I tend to see it on the s390 host. I
> >> suspect a race condition somewhere where it fails if the host
> >> is heavily loaded.
> 
> What is weird to me is that I was unable to reproduce it on the previous
> commit.  But with this one happened always.  No, I have no clue why, and
> as said, it makes zero sense, it is for a binary that it is not used on
> the block test.

Your commit changes the migration test, which could change the overall
test running time, and thus impact which tests run in parallel.
This could be enough to trigger the race more reliably.

With regards,
Daniel
-- 
|: https://berrange.com  -o-  https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org  -o-  https://fstop138.berrange.com :|
|: https://entangle-photo.org  -o-  https://www.instagram.com/dberrange :|




[PATCH v10 08/10] migration: Implement MigrateChannelList to qmp migration flow.

2023-07-26 Thread Het Gala
Integrate MigrateChannelList with all transport backends
(socket, exec and rdma) for both the source and destination
migration endpoints of QMP migration.

For the current series, limit the size of MigrateChannelList
to a single element (single interface) via a runtime check.

Suggested-by: Aravind Retnakaran 
Signed-off-by: Het Gala 
---
 migration/migration.c | 95 +++
 1 file changed, 52 insertions(+), 43 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 5720c8ed4c..d54ecbb00d 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -426,9 +426,10 @@ void migrate_add_address(SocketAddress *address)
 }
 
 static bool migrate_uri_parse(const char *uri,
-  MigrationAddress **channel,
+  MigrationChannel **channel,
   Error **errp)
 {
+g_autoptr(MigrationChannel) val = g_new0(MigrationChannel, 1);
 g_autoptr(MigrationAddress) addr = g_new0(MigrationAddress, 1);
 SocketAddress *saddr = &addr->u.socket;
 InetSocketAddress *isock = &addr->u.rdma;
@@ -465,7 +466,9 @@ static bool migrate_uri_parse(const char *uri,
 return false;
 }
 
-*channel = addr;
+val->channel_type = MIGRATION_CHANNEL_TYPE_MAIN;
+val->addr = addr;
+*channel = val;
 return true;
 }
 
@@ -473,41 +476,44 @@ static void qemu_start_incoming_migration(const char *uri, bool has_channels,
   MigrationChannelList *channels,
   Error **errp)
 {
-g_autoptr(MigrationAddress) channel = g_new0(MigrationAddress, 1);
+g_autoptr(MigrationChannel) channel = g_new0(MigrationChannel, 1);
+g_autoptr(MigrationAddress) addr = g_new0(MigrationAddress, 1);
 
 /*
  * Having preliminary checks for uri and channel
  */
-if (has_channels) {
-error_setg(errp, "'channels' argument should not be set yet.");
-return;
-}
-
 if (uri && has_channels) {
 error_setg(errp, "'uri' and 'channels' arguments are mutually "
"exclusive; exactly one of the two should be present in "
"'migrate-incoming' qmp command ");
 return;
-}
-
-if (!uri && !has_channels) {
+} else if (channels) {
+/* To verify that Migrate channel list has only item */
+if (channels->next) {
+error_setg(errp, "Channel list has more than one entries");
+return;
+}
+channel = channels->value;
+} else if (uri) {
+/* caller uses the old URI syntax */
+if (!migrate_uri_parse(uri, &channel, errp)) {
+return;
+}
+} else {
 error_setg(errp, "neither 'uri' or 'channels' argument are "
"specified in 'migrate-incoming' qmp command ");
 return;
 }
-
-if (uri && !migrate_uri_parse(uri, &channel, errp)) {
-return;
-}
+addr = channel->addr;
 
 /* transport mechanism not suitable for migration? */
-if (!migration_channels_and_transport_compatible(channel, errp)) {
+if (!migration_channels_and_transport_compatible(addr, errp)) {
 return;
 }
 
 qapi_event_send_migration(MIGRATION_STATUS_SETUP);
-if (channel->transport == MIGRATION_ADDRESS_TYPE_SOCKET) {
-SocketAddress *saddr = &channel->u.socket;
+if (addr->transport == MIGRATION_ADDRESS_TYPE_SOCKET) {
+SocketAddress *saddr = &addr->u.socket;
 if (saddr->type == SOCKET_ADDRESS_TYPE_INET ||
 saddr->type == SOCKET_ADDRESS_TYPE_UNIX ||
 saddr->type == SOCKET_ADDRESS_TYPE_VSOCK) {
@@ -516,11 +522,11 @@ static void qemu_start_incoming_migration(const char *uri, bool has_channels,
 fd_start_incoming_migration(saddr->u.fd.str, errp);
 }
 #ifdef CONFIG_RDMA
-} else if (channel->transport == MIGRATION_ADDRESS_TYPE_RDMA) {
-rdma_start_incoming_migration(&channel->u.rdma, errp);
-#endif
-} else if (channel->transport == MIGRATION_ADDRESS_TYPE_EXEC) {
-exec_start_incoming_migration(channel->u.exec.args, errp);
+} else if (addr->transport == MIGRATION_ADDRESS_TYPE_RDMA) {
+rdma_start_incoming_migration(&addr->u.rdma, errp);
+#endif
+} else if (addr->transport == MIGRATION_ADDRESS_TYPE_EXEC) {
+exec_start_incoming_migration(addr->u.exec.args, errp);
 } else {
 error_setg(errp, "unknown migration protocol: %s", uri);
 }
@@ -1720,35 +1726,38 @@ void qmp_migrate(const char *uri, bool has_channels,
 bool resume_requested;
 Error *local_err = NULL;
 MigrationState *s = migrate_get_current();
-g_autoptr(MigrationAddress) channel = g_new0(MigrationAddress, 1);
+g_autoptr(MigrationChannel) channel = g_new0(MigrationChannel, 1);
+g_autoptr(MigrationAddress) addr = g_new0(MigrationAddress, 1);
 
 /*
  * Having preliminary checks for uri and channel
  */
-if (has_channels) {
-error_setg(errp, "'channels' 
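
For illustration, here is a minimal sketch of how a caller could build the
single-entry channel list that the runtime check above accepts, assuming
the QAPI-generated C types used throughout this patch (MigrationChannel,
MigrationChannelList, MigrationAddress). The helper name is made up and
error handling is omitted.

/*
 * Illustrative helper, not part of the patch: one main channel backed
 * by a TCP socket, wrapped in a one-element MigrationChannelList.
 */
static MigrationChannelList *make_main_tcp_channel(const char *host,
                                                   const char *port)
{
    MigrationChannel *channel = g_new0(MigrationChannel, 1);
    MigrationAddress *addr = g_new0(MigrationAddress, 1);
    MigrationChannelList *list = g_new0(MigrationChannelList, 1);

    addr->transport = MIGRATION_ADDRESS_TYPE_SOCKET;
    addr->u.socket.type = SOCKET_ADDRESS_TYPE_INET;
    addr->u.socket.u.inet.host = g_strdup(host);
    addr->u.socket.u.inet.port = g_strdup(port);

    channel->channel_type = MIGRATION_CHANNEL_TYPE_MAIN;
    channel->addr = addr;

    list->value = channel;
    list->next = NULL;   /* exactly one entry, as the check above requires */
    return list;
}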

[PATCH 43/44] Add missed BCM2835 properties

2023-07-26 Thread Sergey Kambalin
Signed-off-by: Sergey Kambalin 
---
 hw/misc/bcm2835_property.c| 170 ++
 include/hw/misc/raspberrypi-fw-defs.h |  11 ++
 2 files changed, 181 insertions(+)

diff --git a/hw/misc/bcm2835_property.c b/hw/misc/bcm2835_property.c
index 4ed9faa54a..7d2d6e518d 100644
--- a/hw/misc/bcm2835_property.c
+++ b/hw/misc/bcm2835_property.c
@@ -19,6 +19,31 @@
 #include "trace.h"
 #include "hw/arm/raspi_platform.h"
 
+#define RPI_EXP_GPIO_BASE   128
+#define VC4_GPIO_EXPANDER_COUNT 8
+#define VCHI_BUSADDR_SIZE   sizeof(uint32_t)
+
+struct vc4_display_settings_t {
+uint32_t display_num;
+uint32_t width;
+uint32_t height;
+uint32_t depth;
+uint16_t pitch;
+uint32_t virtual_width;
+uint32_t virtual_height;
+uint16_t virtual_width_offset;
+uint32_t virtual_height_offset;
+unsigned long fb_bus_address;
+} QEMU_PACKED;
+
+struct vc4_gpio_expander_t {
+uint32_t direction;
+uint32_t polarity;
+uint32_t term_en;
+uint32_t term_pull_up;
+uint32_t state;
+} vc4_gpio_expander[VC4_GPIO_EXPANDER_COUNT];
+
 /* https://github.com/raspberrypi/firmware/wiki/Mailbox-property-interface */
 
 static void bcm2835_property_mbox_push(BCM2835PropertyState *s, uint32_t value)
@@ -30,6 +55,7 @@ static void bcm2835_property_mbox_push(BCM2835PropertyState *s, uint32_t value)
 uint32_t tmp;
 int n;
 uint32_t offset, length, color;
+uint32_t gpio_num;
 
 /*
  * Copy the current state of the framebuffer config; we will update
@@ -138,6 +164,13 @@ static void bcm2835_property_mbox_push(BCM2835PropertyState *s, uint32_t value)
 resplen = 8;
 break;
 
+case RPI_FWREQ_GET_CLOCKS:
+/* TODO: add more clock IDs if needed */
+stl_le_phys(&s->dma_as, value + 12, 0);
+stl_le_phys(&s->dma_as, value + 16, RPI_FIRMWARE_ARM_CLK_ID);
+resplen = 8;
+break;
+
 case RPI_FWREQ_SET_CLOCK_RATE:
 case RPI_FWREQ_SET_MAX_CLOCK_RATE:
 case RPI_FWREQ_SET_MIN_CLOCK_RATE:
@@ -276,6 +309,7 @@ static void bcm2835_property_mbox_push(BCM2835PropertyState *s, uint32_t value)
 stl_le_phys(&s->dma_as, value + 12, 0);
 resplen = 4;
 break;
+
 case RPI_FWREQ_FRAMEBUFFER_GET_NUM_DISPLAYS:
 stl_le_phys(&s->dma_as, value + 12, 1);
 resplen = 4;
@@ -301,6 +335,142 @@ static void bcm2835_property_mbox_push(BCM2835PropertyState *s, uint32_t value)
 resplen);
 break;
 
+case RPI_FWREQ_GET_THROTTLED:
+stl_le_phys(&s->dma_as, value + 12, 0);
+resplen = 4;
+break;
+
+case RPI_FWREQ_FRAMEBUFFER_GET_DISPLAY_SETTINGS:
+stl_le_phys(&s->dma_as, value + 12, 0); /* display_num */
+stl_le_phys(&s->dma_as, value + 16, 800); /* width */
+stl_le_phys(&s->dma_as, value + 20, 600); /* height */
+stl_le_phys(&s->dma_as, value + 24, 32); /* depth */
+stl_le_phys(&s->dma_as, value + 28, 32); /* pitch */
+stl_le_phys(&s->dma_as, value + 30, 0); /* virtual_width */
+stl_le_phys(&s->dma_as, value + 34, 0); /* virtual_height */
+stl_le_phys(&s->dma_as, value + 38, 0); /* virtual_width_offset */
+stl_le_phys(&s->dma_as, value + 40, 0); /* virtual_height_offset */
+stl_le_phys(&s->dma_as, value + 44, 0); /* fb_bus_address low */
+stl_le_phys(&s->dma_as, value + 48, 0); /* fb_bus_address hi */
+resplen = sizeof(struct vc4_display_settings_t);
+break;
+
+case RPI_FWREQ_FRAMEBUFFER_SET_PITCH:
+resplen = 0;
+break;
+
+case RPI_FWREQ_GET_GPIO_CONFIG:
+if (ldl_le_phys(&s->dma_as, value + 12) < RPI_EXP_GPIO_BASE) {
+qemu_log_mask(LOG_UNIMP, "RPI_FWREQ_GET_GPIO_CONFIG "
+  "not implemented for gpiochip0\n");
+} else {
+gpio_num = ldl_le_phys(&s->dma_as, value + 12)
+   - RPI_EXP_GPIO_BASE;
+
+if (gpio_num < VC4_GPIO_EXPANDER_COUNT) {
+stl_le_phys(&s->dma_as, value + 16,
+vc4_gpio_expander[gpio_num].direction);
+stl_le_phys(&s->dma_as, value + 20,
+vc4_gpio_expander[gpio_num].polarity);
+stl_le_phys(&s->dma_as, value + 24,
+vc4_gpio_expander[gpio_num].term_en);
+stl_le_phys(&s->dma_as, value + 28,
+vc4_gpio_expander[gpio_num].term_pull_up);
+/* must be equal 0 */
+stl_le_phys(&s->dma_as, value + 12, 0);
+resplen = 4 * 5;
+} else {
+qemu_log_mask(LOG_GUEST_ERROR,
+  "RPI_FWREQ_GET_GPIO_CONFIG "
+  "gpio num must be 
