Re: [PATCH v1] KVM: s390x: selftests: Add shared zeropage test
Am 15.04.24 um 16:19 schrieb Alexander Gordeev:
On Fri, Apr 12, 2024 at 10:43:29AM +0200, David Hildenbrand wrote:
Hi David,

 tools/testing/selftests/kvm/Makefile          |   1 +
 .../kvm/s390x/shared_zeropage_test.c          | 110 ++
 2 files changed, 111 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/s390x/shared_zeropage_test.c

Tested-by: Alexander Gordeev

@Janosch, Christian, I think this patch should go via s390 tree together with
https://lore.kernel.org/r/20240411161441.910170-3-da...@redhat.com
Do you agree?

Yes, thank you

Acked-by: Christian Borntraeger
Re: [PATCH] target/s390x: improve cpu compatibility check error message
Am 14.03.24 um 20:00 schrieb Claudio Fontana:
some users were confused by this message showing under TCG:

Selected CPU generation is too new. Maximum supported model in the configuration: 'xyz'

Try to clarify that the maximum can depend on the accel by adding also the current accelerator to the message as such:

Selected CPU generation is too new. Maximum supported model in the accelerator 'tcg' configuration: 'xyz'

Signed-off-by: Claudio Fontana

Independent of which message we end up with (see comments in this mail thread), I agree that improving the error message is helpful.

Acked-by: Christian Borntraeger

---
 target/s390x/cpu_models.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/target/s390x/cpu_models.c b/target/s390x/cpu_models.c
index 1a1c096122..0d6d8fc727 100644
--- a/target/s390x/cpu_models.c
+++ b/target/s390x/cpu_models.c
@@ -508,14 +508,14 @@ static void check_compatibility(const S390CPUModel *max_model,
     if (model->def->gen > max_model->def->gen) {
         error_setg(errp, "Selected CPU generation is too new. Maximum "
-                   "supported model in the configuration: \'%s\'",
-                   max_model->def->name);
+                   "supported model in the accelerator \'%s\' configuration: \'%s\'",
+                   current_accel_name(), max_model->def->name);
         return;
     } else if (model->def->gen == max_model->def->gen &&
                model->def->ec_ga > max_model->def->ec_ga) {
         error_setg(errp, "Selected CPU GA level is too new. Maximum "
-                   "supported model in the configuration: \'%s\'",
-                   max_model->def->name);
+                   "supported model in the accelerator \'%s\' configuration: \'%s\'",
+                   current_accel_name(), max_model->def->name);
         return;
     }
@@ -537,7 +537,8 @@ static void check_compatibility(const S390CPUModel *max_model,
     error_setg(errp, " ");
     s390_feat_bitmap_to_ascii(missing, errp, error_prepend_missing_feat);
     error_prepend(errp, "Some features requested in the CPU model are not "
-                  "available in the configuration: ");
+                  "available in the accelerator \'%s\' configuration: ",
+                  current_accel_name());
 }

 S390CPUModel *get_max_cpu_model(Error **errp)
Re: [PATCH v3 2/2] KVM: s390: selftests: memop: add a simple AR test
Am 16.02.24 um 22:36 schrieb Eric Farman:
There is a selftest that checks for an (expected) error when an invalid AR is specified, but not one that exercises the AR path. Add a simple test that mirrors the vanilla write/read test while providing an AR. An AR that contains zero will direct the CPU to use the primary address space normally used anyway. AR[1] is selected for this test because the host AR[1] is usually non-zero, and KVM needs to correctly swap those values.

Signed-off-by: Eric Farman

Acked-by: Christian Borntraeger
Re: [PATCH v3 1/2] KVM: s390: fix access register usage in ioctls
Am 16.02.24 um 22:36 schrieb Eric Farman:
The routine ar_translation() can be reached by both the instruction intercept path (where the access registers had been loaded with the guest register contents), and the MEM_OP ioctls (which hadn't). Since this routine saves the current registers to vcpu->run, this routine erroneously saves host registers into the guest space.

Introduce a boolean in the kvm_vcpu_arch struct to indicate whether the registers contain guest contents. If they do (the instruction intercept path), the save can be performed and the AR translation is done just as it is today. If they don't (the MEM_OP path), the AR can be read from vcpu->run without stashing the current contents.

Signed-off-by: Eric Farman

Acked-by: Christian Borntraeger
Reviewed-by: Christian Borntraeger

---
 arch/s390/include/asm/kvm_host.h | 1 +
 arch/s390/kvm/gaccess.c          | 3 ++-
 arch/s390/kvm/kvm-s390.c         | 3 +++
 3 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index 52664105a473..c86215eb4ca7 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -765,6 +765,7 @@ struct kvm_vcpu_arch {
 	__u64 cputm_start;
 	bool gs_enabled;
 	bool skey_enabled;
+	bool acrs_loaded;
 	struct kvm_s390_pv_vcpu pv;
 	union diag318_info diag318_info;
 };
diff --git a/arch/s390/kvm/gaccess.c b/arch/s390/kvm/gaccess.c
index 5bfcc50c1a68..b4c805092021 100644
--- a/arch/s390/kvm/gaccess.c
+++ b/arch/s390/kvm/gaccess.c
@@ -391,7 +391,8 @@ static int ar_translation(struct kvm_vcpu *vcpu, union asce *asce, u8 ar,
 	if (ar >= NUM_ACRS)
 		return -EINVAL;

-	save_access_regs(vcpu->run->s.regs.acrs);
+	if (vcpu->arch.acrs_loaded)
+		save_access_regs(vcpu->run->s.regs.acrs);
 	alet.val = vcpu->run->s.regs.acrs[ar];

 	if (ar == 0 || alet.val == 0) {
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index ea63ac769889..61092f0e0a66 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -3951,6 +3951,7 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
 				    KVM_SYNC_ARCH0 |
 				    KVM_SYNC_PFAULT |
 				    KVM_SYNC_DIAG318;
+	vcpu->arch.acrs_loaded = false;
 	kvm_s390_set_prefix(vcpu, 0);
 	if (test_kvm_facility(vcpu->kvm, 64))
 		vcpu->run->kvm_valid_regs |= KVM_SYNC_RICCB;
@@ -4951,6 +4952,7 @@ static void sync_regs(struct kvm_vcpu *vcpu)
 	}
 	save_access_regs(vcpu->arch.host_acrs);
 	restore_access_regs(vcpu->run->s.regs.acrs);
+	vcpu->arch.acrs_loaded = true;
 	/* save host (userspace) fprs/vrs */
 	save_fpu_regs();
 	vcpu->arch.host_fpregs.fpc = current->thread.fpu.fpc;
@@ -5021,6 +5023,7 @@ static void store_regs(struct kvm_vcpu *vcpu)
 	kvm_run->s.regs.pfc = vcpu->arch.pfault_compare;
 	save_access_regs(vcpu->run->s.regs.acrs);
 	restore_access_regs(vcpu->arch.host_acrs);
+	vcpu->arch.acrs_loaded = false;
 	/* Save guest register state */
 	save_fpu_regs();
 	vcpu->run->s.regs.fpc = current->thread.fpu.fpc;
Re: [PULL v2 00/39] tcg patch queue
Am 06.02.24 um 22:24 schrieb Peter Maydell:
[..]
Christian: this is on the s390x machine we have. Does the VM setup for that share IO or CPU with other VMs somehow?

Yes it does, this is a shared system. I will talk to the team if there is anything that we can do about this.
Re: [PULL 2/7] s390x: do a subsystem reset before the unprotect on reboot
Am 11.01.24 um 10:43 schrieb Cédric Le Goater:
[...]
On a side note, I am also seeing :

Michael?

[   73.989688] [ cut here ]
[   73.989696] unexpected non zero alert.mask 0x20
[   73.989748] WARNING: CPU: 9 PID: 4503 at arch/s390/kvm/interrupt.c:3214 kvm_s390_gisa_destroy+0xd4/0xe8 [kvm]
[   73.989791] Modules linked in: vfio_pci vfio_pci_core irqbypass vhost_net vhost vhost_iotlb tap tun xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink 8021q garp mrp rfkill sunrpc ext4 mbcache jbd2 vfio_ap zcrypt_cex4 vfio_ccw mdev vfio_iommu_type1 vfio drm fuse i2c_core drm_panel_orientation_quirks xfs libcrc32c dm_service_time mlx5_core sd_mod t10_pi ghash_s390 sg prng des_s390 libdes sha3_512_s390 sha3_256_s390 mlxfw tls scm_block psample eadm_sch qeth_l2 bridge stp llc dasd_eckd_mod zfcp qeth dasd_mod scsi_transport_fc ccwgroup qdio dm_multipath dm_mirror dm_region_hash dm_log dm_mod pkey zcrypt kvm aes_s390
[   73.989825] CPU: 9 PID: 4503 Comm: worker Kdump: loaded Not tainted 6.7.0-clg-dirty #52
[   73.989827] Hardware name: IBM 3931 LA1 400 (LPAR)
[   73.989829] Krnl PSW : 0704c0018000 03ff7fcd2198 (kvm_s390_gisa_destroy+0xd8/0xe8 [kvm])
[   73.989845]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
[   73.989847] Krnl GPRS: c000fffe 00070027 0023 0007df4249c8
[   73.989849]            03800649b858 03800649b850 0007fcb9db00
[   73.989851]            8ebae8c8 83a8c4f0 00b69900 8ebac000
[   73.989853]            03ff903aef68 03800649bd98 03ff7fcd2194 03800649b9f8
[   73.989859] Krnl Code: 03ff7fcd2188: c0224f88         larl  %r2,03ff7fd1c098
                          03ff7fcd218e: c0e5fffea360     brasl %r14,03ff7fca684e
                         #03ff7fcd2194: af00             mc    0,0
                         >03ff7fcd2198: e310b7680204     lg    %r1,10088(%r11)
                          03ff7fcd219e: a7f4ffae         brc   15,03ff7fcd20fa
                          03ff7fcd21a2: 0707             bcr   0,%r7
                          03ff7fcd21a4: 0707             bcr   0,%r7
                          03ff7fcd21a6: 0707             bcr   0,%r7
[   73.989929] Call Trace:
[   73.989931]  [<03ff7fcd2198>] kvm_s390_gisa_destroy+0xd8/0xe8 [kvm]
[   73.989946] ([<03ff7fcd2194>] kvm_s390_gisa_destroy+0xd4/0xe8 [kvm])
[   73.989960]  [<03ff7fcc1578>] kvm_arch_destroy_vm+0x50/0x118 [kvm]
[   73.989974]  [<03ff7fcb00a2>] kvm_destroy_vm+0x15a/0x260 [kvm]
[   73.989985]  [<03ff7fcb021e>] kvm_vm_release+0x36/0x48 [kvm]
[   73.989996]  [<0007de4f830c>] __fput+0x94/0x2d0
[   73.990009]  [<0007de20d838>] task_work_run+0x88/0xe8
[   73.990013]  [<0007de1e75e0>] do_exit+0x2e0/0x4e0
[   73.990016]  [<0007de1e79c0>] do_group_exit+0x40/0xb8
[   73.990017]  [<0007de1f96e8>] send_sig_info+0x0/0xa8
[   73.990021]  [<0007de194b26>] arch_do_signal_or_restart+0x56/0x318
[   73.990025]  [<0007de28bf12>] exit_to_user_mode_prepare+0x10a/0x1a0
[   73.990028]  [<0007deb607d2>] __do_syscall+0x152/0x1f8
[   73.990032]  [<0007deb70ac8>] system_call+0x70/0x98
[   73.990036] Last Breaking-Event-Address:
[   73.990037]  [<0007de1e0c58>] __warn_printk+0x78/0xe8
Re: [PATCH 3/4] hw/s390x/ipl: Remove unused 'exec/exec-all.h' included header
Am 12.12.23 um 16:28 schrieb Eric Farman:
So why do you think exec-all.h is unused?

I think because that got moved out of exec-all.h a few months ago, via

commit 3549118b498873c84b442bc280a5edafbb61e0a4
Author: Philippe Mathieu-Daudé
Date:   Thu Sep 14 20:57:08 2023 +0200

    exec: Move cpu_loop_foo() target agnostic functions to 'cpu-common.h'

    While these functions are not TCG specific, they are not target specific. Move them to "exec/cpu-common.h" so their callers don't have to be tainted as target specific.

Ah right, I was looking at an old QEMU version
Re: [PATCH 3/4] hw/s390x/ipl: Remove unused 'exec/exec-all.h' included header
Am 12.12.23 um 12:36 schrieb Philippe Mathieu-Daudé:
Signed-off-by: Philippe Mathieu-Daudé
---
 hw/s390x/ipl.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/hw/s390x/ipl.c b/hw/s390x/ipl.c
index 515dcf51b5..62182d81a0 100644
--- a/hw/s390x/ipl.c
+++ b/hw/s390x/ipl.c
@@ -35,7 +35,6 @@
 #include "qemu/cutils.h"
 #include "qemu/option.h"
 #include "standard-headers/linux/virtio_ids.h"
-#include "exec/exec-all.h"

Philippe,

This include came with

commit a30fb811cbe940020a498d2cdac9326cac38b4d9
Author:     David Hildenbrand
AuthorDate: Tue Apr 24 12:18:59 2018 +0200
Commit:     Cornelia Huck
CommitDate: Mon May 14 17:10:02 2018 +0200

    s390x: refactor reset/reipl handling

And I think one reason was cpu_loop_exit. This is still part of ipl.c:

a30fb811cbe (David Hildenbrand 2018-04-24 12:18:59 +0200 664)     /* as this is triggered by a CPU, make sure to exit the loop */
a30fb811cbe (David Hildenbrand 2018-04-24 12:18:59 +0200 665)     if (tcg_enabled()) {
a30fb811cbe (David Hildenbrand 2018-04-24 12:18:59 +0200 666)         cpu_loop_exit(cs);
a30fb811cbe (David Hildenbrand 2018-04-24 12:18:59 +0200 667)     }

So why do you think exec-all.h is unused?
Re: [PATCH 00/32] kmsan: Enable on s390
Am 16.11.23 um 11:13 schrieb Ilya Leoshkevich: It's also possible to get a free s390 machine at [1]. [1] https://linuxone.cloud.marist.edu/oss I think the URL for registration is this one https://linuxone.cloud.marist.edu/#/register?flag=VM
Re: [PATCH v2 06/13] migration: Use VMSTATE_INSTANCE_ID_ANY for s390 devices
Am 20.10.23 um 11:07 schrieb Juan Quintela:
Just with make check I can see that we can have more than one of these devices, so use ANY.

ok 5 /s390x/device/introspect/abstract-interfaces
...
Broken pipe
../../../../../mnt/code/qemu/full/tests/qtest/libqtest.c:195: kill_qemu() tried to terminate QEMU process but encountered exit status 1 (expected 0)
Aborted (core dumped)

Reviewed-by: Stefan Berger
Signed-off-by: Juan Quintela
---
 hw/s390x/s390-skeys.c    | 3 ++-
 hw/s390x/s390-stattrib.c | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

Actually both devices should be there only once - I think.

diff --git a/hw/s390x/s390-skeys.c b/hw/s390x/s390-skeys.c
index 5024faf411..ef089e1967 100644
--- a/hw/s390x/s390-skeys.c
+++ b/hw/s390x/s390-skeys.c
@@ -22,6 +22,7 @@
 #include "sysemu/kvm.h"
 #include "migration/qemu-file-types.h"
 #include "migration/register.h"
+#include "migration/vmstate.h"

 #define S390_SKEYS_BUFFER_SIZE (128 * KiB) /* Room for 128k storage keys */
 #define S390_SKEYS_SAVE_FLAG_EOS 0x01
@@ -457,7 +458,7 @@ static inline void s390_skeys_set_migration_enabled(Object *obj, bool value,
     ss->migration_enabled = value;

     if (ss->migration_enabled) {
-        register_savevm_live(TYPE_S390_SKEYS, 0, 1,
+        register_savevm_live(TYPE_S390_SKEYS, VMSTATE_INSTANCE_ID_ANY, 1,
                              &savevm_s390_storage_keys, ss);
     } else {
         unregister_savevm(VMSTATE_IF(ss), TYPE_S390_SKEYS, ss);
diff --git a/hw/s390x/s390-stattrib.c b/hw/s390x/s390-stattrib.c
index 220e845d12..055d382c3c 100644
--- a/hw/s390x/s390-stattrib.c
+++ b/hw/s390x/s390-stattrib.c
@@ -13,6 +13,7 @@
 #include "qemu/units.h"
 #include "migration/qemu-file.h"
 #include "migration/register.h"
+#include "migration/vmstate.h"
 #include "hw/s390x/storage-attributes.h"
 #include "qemu/error-report.h"
 #include "exec/ram_addr.h"
@@ -380,7 +381,7 @@ static void s390_stattrib_instance_init(Object *obj)
 {
     S390StAttribState *sas = S390_STATTRIB(obj);

-    register_savevm_live(TYPE_S390_STATTRIB, 0, 0,
+    register_savevm_live(TYPE_S390_STATTRIB, VMSTATE_INSTANCE_ID_ANY, 0,
                          &savevm_s390_stattrib_handlers, sas);

     object_property_add_bool(obj, "migration-enabled",
Re: [PATCH v2 2/2] target/s390x/kvm: Simplify the GPRs, ACRs, CRs and prefix synchronization code
Am 09.10.23 um 19:07 schrieb Thomas Huth:
[...]

@@ -483,20 +482,14 @@ int kvm_arch_put_registers(CPUState *cs, int level)
     cs->kvm_run->psw_addr = env->psw.addr;
     cs->kvm_run->psw_mask = env->psw.mask;

-    if (can_sync_regs(cs, KVM_SYNC_GPRS)) {
-        for (i = 0; i < 16; i++) {
-            cs->kvm_run->s.regs.gprs[i] = env->regs[i];
-            cs->kvm_run->kvm_dirty_regs |= KVM_SYNC_GPRS;
-        }
-    } else {
-        for (i = 0; i < 16; i++) {
-            regs.gprs[i] = env->regs[i];
-        }
-        r = kvm_vcpu_ioctl(cs, KVM_SET_REGS, &regs);
-        if (r < 0) {
-            return r;
-        }
-    }
+    g_assert((cs->kvm_run->kvm_valid_regs & KVM_SYNC_REQUIRED_BITS) ==
+             KVM_SYNC_REQUIRED_BITS);
+    cs->kvm_run->kvm_dirty_regs |= KVM_SYNC_REQUIRED_BITS;
+    memcpy(cs->kvm_run->s.regs.gprs, env->regs, sizeof(cs->kvm_run->s.regs.gprs));
+    memcpy(cs->kvm_run->s.regs.acrs, env->aregs, sizeof(cs->kvm_run->s.regs.acrs));
+    memcpy(cs->kvm_run->s.regs.crs, env->cregs, sizeof(cs->kvm_run->s.regs.crs));
+
+    cs->kvm_run->s.regs.prefix = env->psa;

These 3 have only been saved for level > KVM_PUT_RUNTIME_STATE. By moving the memcpy into this position you would always write them. Probably ok but a change in behaviour. Was this intentional?
Re: [PATCH v2 1/2] target/s390x/kvm: Turn KVM_CAP_SYNC_REGS into a hard requirement
Am 10.10.23 um 13:12 schrieb Thomas Huth:
On 10/10/2023 13.02, Christian Borntraeger wrote:
Am 09.10.23 um 19:07 schrieb Thomas Huth:
Since we already require at least kernel 3.15 in the s390x KVM code, we can assume that the KVM_CAP_SYNC_REGS capability is always there. Thus turn this into a hard requirement now.

Signed-off-by: Thomas Huth
---
 target/s390x/kvm/kvm.c | 20 ++--
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/target/s390x/kvm/kvm.c b/target/s390x/kvm/kvm.c
index bc5c56a305..b3e2eaa2eb 100644
--- a/target/s390x/kvm/kvm.c
+++ b/target/s390x/kvm/kvm.c
@@ -337,21 +337,29 @@ int kvm_arch_get_default_type(MachineState *ms)

 int kvm_arch_init(MachineState *ms, KVMState *s)
 {
+    int required_caps[] = {
+        KVM_CAP_DEVICE_CTRL,
+        KVM_CAP_SYNC_REGS,
+    };
+
+    for (int i = 0; i < ARRAY_SIZE(required_caps); i++) {
+        if (!kvm_check_extension(s, required_caps[i])) {
+            error_report("KVM is missing capability #%d - "
+                         "please use kernel 3.15 or newer", required_caps[i]);
+            return -1;
+        }
+    }
+
     object_class_foreach(ccw_machine_class_foreach, TYPE_S390_CCW_MACHINE,
                          false, NULL);

-    if (!kvm_check_extension(kvm_state, KVM_CAP_DEVICE_CTRL)) {
-        error_report("KVM is missing capability KVM_CAP_DEVICE_CTRL - "
-                     "please use kernel 3.15 or newer");
-        return -1;
-    }
     if (!kvm_check_extension(s, KVM_CAP_S390_COW)) {
         error_report("KVM is missing capability KVM_CAP_S390_COW - "
                      "unsupported environment");
         return -1;
     }

Not sure if we also want to move KVM_CAP_S390_COW somehow. The message would be different.

IIRC that error could happen when you ran KVM within an older version of z/VM, so the "please use kernel 3.15 or newer" message would be completely misleading there.

Yes, that's what I was trying to say, we would need a different message. Let's go with this patch.

Apart from that:
Reviewed-by: Christian Borntraeger

Thanks,
 Thomas
Re: [PATCH v2 1/2] target/s390x/kvm: Turn KVM_CAP_SYNC_REGS into a hard requirement
Am 09.10.23 um 19:07 schrieb Thomas Huth:
Since we already require at least kernel 3.15 in the s390x KVM code, we can assume that the KVM_CAP_SYNC_REGS capability is always there. Thus turn this into a hard requirement now.

Signed-off-by: Thomas Huth
---
 target/s390x/kvm/kvm.c | 20 ++--
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/target/s390x/kvm/kvm.c b/target/s390x/kvm/kvm.c
index bc5c56a305..b3e2eaa2eb 100644
--- a/target/s390x/kvm/kvm.c
+++ b/target/s390x/kvm/kvm.c
@@ -337,21 +337,29 @@ int kvm_arch_get_default_type(MachineState *ms)

 int kvm_arch_init(MachineState *ms, KVMState *s)
 {
+    int required_caps[] = {
+        KVM_CAP_DEVICE_CTRL,
+        KVM_CAP_SYNC_REGS,
+    };
+
+    for (int i = 0; i < ARRAY_SIZE(required_caps); i++) {
+        if (!kvm_check_extension(s, required_caps[i])) {
+            error_report("KVM is missing capability #%d - "
+                         "please use kernel 3.15 or newer", required_caps[i]);
+            return -1;
+        }
+    }
+
     object_class_foreach(ccw_machine_class_foreach, TYPE_S390_CCW_MACHINE,
                          false, NULL);

-    if (!kvm_check_extension(kvm_state, KVM_CAP_DEVICE_CTRL)) {
-        error_report("KVM is missing capability KVM_CAP_DEVICE_CTRL - "
-                     "please use kernel 3.15 or newer");
-        return -1;
-    }
     if (!kvm_check_extension(s, KVM_CAP_S390_COW)) {
         error_report("KVM is missing capability KVM_CAP_S390_COW - "
                      "unsupported environment");
         return -1;
     }

Not sure if we also want to move KVM_CAP_S390_COW somehow. The message would be different. Apart from that:

Reviewed-by: Christian Borntraeger

-    cap_sync_regs = kvm_check_extension(s, KVM_CAP_SYNC_REGS);
+    cap_sync_regs = true;
     cap_async_pf = kvm_check_extension(s, KVM_CAP_ASYNC_PF);
     cap_mem_op = kvm_check_extension(s, KVM_CAP_S390_MEM_OP);
     cap_mem_op_extension = kvm_check_extension(s, KVM_CAP_S390_MEM_OP_EXTENSION);
Re: [PATCH] s390x: do a subsystem reset before the unprotect on reboot
Am 01.09.23 um 13:48 schrieb Janosch Frank:
Bound APQNs have to be reset before tearing down the secure config via s390_machine_unprotect(). Otherwise the Ultravisor will return an error code. So let's do a subsystem_reset() which includes an AP reset before the unprotect call.

We'll do a full device_reset() afterwards which will reset some devices twice. That's ok since we can't move the device_reset() before the unprotect as it includes a CPU clear reset which the Ultravisor does not expect at that point in time.

Signed-off-by: Janosch Frank

Makes sense.
Acked-by: Christian Borntraeger

---
I'm not able to test this for the PV AP case right now, that has to wait to early next week. However Marc told me that the non-AP PV test works fine now.
---
 hw/s390x/s390-virtio-ccw.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
index 3dd0b2372d..2d75f2131f 100644
--- a/hw/s390x/s390-virtio-ccw.c
+++ b/hw/s390x/s390-virtio-ccw.c
@@ -438,10 +438,20 @@ static void s390_machine_reset(MachineState *machine, ShutdownCause reason)
     switch (reset_type) {
     case S390_RESET_EXTERNAL:
     case S390_RESET_REIPL:
+        /*
+         * Reset the subsystem which includes a AP reset. If a PV
+         * guest had APQNs attached the AP reset is a prerequisite to
+         * unprotecting since the UV checks if all APQNs are reset.
+         */
+        subsystem_reset();
         if (s390_is_pv()) {
             s390_machine_unprotect(ms);
         }

+        /*
+         * Device reset includes CPU clear resets so this has to be
+         * done AFTER the unprotect call above.
+         */
         qemu_devices_reset(reason);
         s390_crypto_reset();
Re: SLES15SP4 IPL failure: stage2 boot loader
Am 03.08.23 um 18:31 schrieb Clovis Pereira:
Looks like a problem with the host (KVM) not with the guest (SLES).

Right. There was at least one bug in QEMU where in (seldom) cases the boot loader failed:
https://gitlab.com/qemu-project/qemu/-/commit/5f97ba0c74ccace0a4014460de9751ff3c6f454a
And maybe the guest update to SP4 just triggered that corner case and the update to SP5 fixed it. In any case qemu 5.2 from SLES15SP3 (or newer) should contain that fix so if you see that again, updating the host should fix it.

Christian

Clovis Pereira
IT Mainframe Specialist - SW services
TLS - IBM Technology Lifecycle Services
55 11 99925-6242 (Mobile)
gclo...@br.ibm.com
IBM

From: Linux on 390 Port on behalf of Alan Haff
Sent: Thursday, August 3, 2023 12:31
To: LINUX-390@VM.MARIST.EDU
Subject: [EXTERNAL] Re: SLES15SP4 IPL failure: stage2 boot loader

What QEMU do you have in the host? (rpm -qa | grep qemu). Is that up2date?

qemu-s390-4.2.1-11.16.3.s390x

I know, I'm pretty far backlevel. Strange that it worked ok with SP3, though. The good news is that I've been able to work around the problem for now by extracting the kernel and initrd from the guest disk and storing it on the host to IPL the guest with a direct kernel boot.

-----Original Message-----
From: Linux on 390 Port On Behalf Of Christian Borntraeger
Sent: Thursday, August 3, 2023 03:08
To: LINUX-390@VM.MARIST.EDU
Subject: [EXTERNAL] - Re: SLES15SP4 IPL failure: stage2 boot loader

Am 02.08.23 um 21:11 schrieb Alan Haff:
I have a SLES15 SP4 system running under KVM on a z13s that refuses to IPL after upgrading from SP3. The failure message is "! Cannot read stage2 boot loader !". I've IPLed into a rescue shell from the SP4 ISO, set up a chroot environment, and re-ran grub2-install with no obvious errors. Still no joy. Not sure what to try next. Any ideas/suggestions/assistance would be greatly appreciated.

This looks like an error of the QEMU loader not being able to find the right things on the disk. What QEMU do you have in the host? (rpm -qa | grep qemu). Is that up2date?

--
For LINUX-390 subscribe / signoff / archive access instructions, send email to lists...@vm.marist.edu with the message: INFO LINUX-390 or visit http://www2.marist.edu/htbin/wlvindex?LINUX-390
Re: SLES15SP4 IPL failure: stage2 boot loader
Am 02.08.23 um 21:11 schrieb Alan Haff:
I have a SLES15 SP4 system running under KVM on a z13s that refuses to IPL after upgrading from SP3. The failure message is "! Cannot read stage2 boot loader !". I've IPLed into a rescue shell from the SP4 ISO, set up a chroot environment, and re-ran grub2-install with no obvious errors. Still no joy. Not sure what to try next. Any ideas/suggestions/assistance would be greatly appreciated.

This looks like an error of the QEMU loader not being able to find the right things on the disk. What QEMU do you have in the host? (rpm -qa | grep qemu). Is that up2date?
Re: s390 intermittent test failure in qemu:block / io-qcow2-copy-before-write
Am 25.07.23 um 18:45 schrieb Peter Maydell:
There seems to be an intermittent failure on the s390 host in the qemu:block / io-qcow2-copy-before-write test:
https://gitlab.com/qemu-project/qemu/-/jobs/4737819873
The log says the test was expecting to do some reading and writing but got an unexpected 'permission denied' error on the read. Any idea why this might happen ?

768/835 qemu:block / io-qcow2-copy-before-write ERROR 12.05s exit status 1
PYTHON=/home/gitlab-runner/builds/-LCfcJ2T/0/qemu-project/qemu/build/pyvenv/bin/python3 MALLOC_PERTURB_=101 /home/gitlab-runner/builds/-LCfcJ2T/0/qemu-project/qemu/build/pyvenv/bin/python3 /home/gitlab-runner/builds/-LCfcJ2T/0/qemu-project/qemu/build/../tests/qemu-iotests/check -tap -qcow2 copy-before-write --source-dir /home/gitlab-runner/builds/-LCfcJ2T/0/qemu-project/qemu/tests/qemu-iotests --build-dir /home/gitlab-runner/builds/-LCfcJ2T/0/qemu-project/qemu/build/tests/qemu-iotests
― ✀ ―
stderr:
--- /home/gitlab-runner/builds/-LCfcJ2T/0/qemu-project/qemu/tests/qemu-iotests/tests/copy-before-write.out
+++ /home/gitlab-runner/builds/-LCfcJ2T/0/qemu-project/qemu/build/scratch/qcow2-file-copy-before-write/copy-before-write.out.bad
@@ -1,5 +1,21 @@
-
+...F
+==
+FAIL: test_timeout_break_snapshot (__main__.TestCbwError)
+--
+Traceback (most recent call last):
+  File "/home/gitlab-runner/builds/-LCfcJ2T/0/qemu-project/qemu/tests/qemu-iotests/tests/copy-before-write", line 210, in test_timeout_break_snapshot
+    self.assertEqual(log, """\
+AssertionError: 'wrot[195 chars]read 1048576/1048576 bytes at offset 0\n1 MiB,[46 chars]c)\n' != 'wrot[195 chars]read failed: Permission denied\n'
+  wrote 524288/524288 bytes at offset 0
+  512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+  wrote 524288/524288 bytes at offset 524288
+  512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
++ read failed: Permission denied
+- read 1048576/1048576 bytes at offset 0
+- 1 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+--
 Ran 4 tests
-OK
+FAILED (failures=1)
(test program exited with status code 1)
――

Same failure, previous job:
https://gitlab.com/qemu-project/qemu/-/jobs/4736463062
This one's a "Failed to get write lock" in io-qcow2-161:
https://gitlab.com/qemu-project/qemu/-/jobs/4734846533

See https://lore.kernel.org/qemu-devel/028059df-eaf4-9e65-a195-4733b708a...@linux.ibm.com/
I was not yet able to understand that. And I can only reproduce with Ubuntu (RHEL,SLES,Fedora are fine). Upstream kernel on Ubuntu does not help, so I assume it must be something in the libraries or config.
Re: [PATCH] pc-bios/s390-ccw: Get rid of the the __u* types
Am 28.06.23 um 10:32 schrieb Thomas Huth:
On 27/06/2023 14.19, Christian Borntraeger wrote:
Am 27.06.23 um 13:41 schrieb Thomas Huth:
Using types starting with double underscores should be avoided since these names are marked as reserved by the C standard. The corresponding Linux

In general I think this change is fine, but this is kind of interesting, as /usr/include/linux/types.h does have __u64 and friends. In fact there is __u64 but not u64 in /usr/include. And yes a google search for double underscore has:

"The use of two underscores (`__') in identifiers is reserved for the compiler's internal use according to the ANSI-C standard. Underscores (`_') are often used in names of library functions (such as "_main" and "_exit"). In order to avoid collisions, do not begin an identifier with an underscore."

kernel header file has also been changed accordingly a long time ago:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/diff/drivers/s390/cio/cio.h?id=cd6b4f27b9bb2a

but IIRC from a kernel perspective u64 is for kernel internal uint64_t and __u64 is for uapi, e.g. see
https://lkml.indiana.edu/hypermail/linux/kernel/1401.2/02851.html

So in essence we (QEMU/s390-ccw) have to decide what to use for our internal purposes. And yes, u64 and this patch is certainly ok. But we might need to change the patch description.

Ok, agreed, the patch description could be better. Maybe just something like this:

"The types starting with double underscores have likely been introduced into the s390-ccw bios to be able to re-use structs from the Linux kernel in the past, but the corresponding structs in cio.h have been changed there a long time ago already to not use the variants with the double underscores anymore:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/diff/drivers/s390/cio/cio.h?id=cd6b4f27b9bb2a
So it would be good to replace these in the s390-ccw bios now, too."

Yes, looks good.
Re: [PATCH] pc-bios/s390-ccw: Get rid of the __u* types
Am 27.06.23 um 13:41 schrieb Thomas Huth: Using types starting with double underscores should be avoided since these names are marked as reserved by the C standard. The corresponding Linux In general I think this change is fine, but this is kind of interesting, as /usr/include/linux/types.h does have __u64 and friends. In fact there is __u64 but not u64 in /usr/include. And yes a google search for double underscore has The use of two underscores (` __ ') in identifiers is reserved for the compiler's internal use according to the ANSI-C standard. Underscores (` _ ') are often used in names of library functions (such as " _main " and " _exit "). In order to avoid collisions, do not begin an identifier with an underscore. kernel header file has also been changed accordingly a long time ago: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/diff/drivers/s390/cio/cio.h?id=cd6b4f27b9bb2a but IIRC from a kernel perspective u64 is for kernel internal uint64_t and __u64 is for uapi, e.g. see https://lkml.indiana.edu/hypermail/linux/kernel/1401.2/02851.html So in essence we (QEMU/s390-ccw) have to decide what to use for our internal purposes. And yes, u64 and this patch is certainly ok. But we might need to change the patch description So we should get rid of the __u* in the s390-ccw bios now finally, too. 
Signed-off-by: Thomas Huth --- Based-on: <20230510143925.4094-4-quint...@redhat.com> pc-bios/s390-ccw/cio.h | 232 ++-- pc-bios/s390-ccw/s390-ccw.h | 4 - 2 files changed, 116 insertions(+), 120 deletions(-) diff --git a/pc-bios/s390-ccw/s390-ccw.h b/pc-bios/s390-ccw/s390-ccw.h index efeb449572..c977a52b50 100644 --- a/pc-bios/s390-ccw/s390-ccw.h +++ b/pc-bios/s390-ccw/s390-ccw.h @@ -17,10 +17,6 @@ typedef unsigned char u8; typedef unsigned short u16; typedef unsigned int u32; typedef unsigned long long u64; -typedef unsigned char __u8; -typedef unsigned short __u16; -typedef unsigned int __u32; -typedef unsigned long long __u64; #define true 1 #define false 0 diff --git a/pc-bios/s390-ccw/cio.h b/pc-bios/s390-ccw/cio.h index 88a88adfd2..8b18153deb 100644 --- a/pc-bios/s390-ccw/cio.h +++ b/pc-bios/s390-ccw/cio.h @@ -17,32 +17,32 @@ * path management control word */ struct pmcw { -__u32 intparm; /* interruption parameter */ -__u32 qf:1; /* qdio facility */ -__u32 w:1; -__u32 isc:3;/* interruption subclass */ -__u32 res5:3; /* reserved zeros */ -__u32 ena:1;/* enabled */ -__u32 lm:2; /* limit mode */ -__u32 mme:2;/* measurement-mode enable */ -__u32 mp:1; /* multipath mode */ -__u32 tf:1; /* timing facility */ -__u32 dnv:1;/* device number valid */ -__u32 dev:16; /* device number */ -__u8 lpm; /* logical path mask */ -__u8 pnom; /* path not operational mask */ -__u8 lpum; /* last path used mask */ -__u8 pim; /* path installed mask */ -__u16 mbi; /* measurement-block index */ -__u8 pom; /* path operational mask */ -__u8 pam; /* path available mask */ -__u8 chpid[8]; /* CHPID 0-7 (if available) */ -__u32 unused1:8;/* reserved zeros */ -__u32 st:3; /* subchannel type */ -__u32 unused2:18; /* reserved zeros */ -__u32 mbfc:1; /* measurement block format control */ -__u32 xmwme:1; /* extended measurement word mode enable */ -__u32 csense:1; /* concurrent sense; can be enabled ...*/ +u32 intparm;/* interruption parameter */ +u32 qf:1; /* qdio facility */ +u32 w:1; +u32 isc:3; 
/* interruption subclass */ +u32 res5:3; /* reserved zeros */ +u32 ena:1; /* enabled */ +u32 lm:2; /* limit mode */ +u32 mme:2; /* measurement-mode enable */ +u32 mp:1; /* multipath mode */ +u32 tf:1; /* timing facility */ +u32 dnv:1; /* device number valid */ +u32 dev:16; /* device number */ +u8 lpm;/* logical path mask */ +u8 pnom; /* path not operational mask */ +u8 lpum; /* last path used mask */ +u8 pim;/* path installed mask */ +u16 mbi;/* measurement-block index */ +u8 pom;/* path operational mask */ +u8 pam;/* path available mask */ +u8 chpid[8]; /* CHPID 0-7 (if available) */ +u32 unused1:8; /* reserved zeros */ +u32 st:3; /* subchannel type */ +u32 unused2:18; /* reserved zeros */ +u32 mbfc:1; /* measurement block format control */ +u32 xmwme:1;/* extended measurement word mode enable */ +u32 csense:1; /* concurrent sense; can be enabled ...*/ /* ... per MSCH,
Re: [PATCH 3/4] pc-bios/s390-ccw: Move the stack array into start.S
Am 26.06.23 um 15:21 schrieb Thomas Huth: diff --git a/pc-bios/s390-ccw/start.S b/pc-bios/s390-ccw/start.S index 29b0a9ece0..47ef6e8aa8 100644 --- a/pc-bios/s390-ccw/start.S +++ b/pc-bios/s390-ccw/start.S @@ -120,3 +120,8 @@ external_new_mask: .quad 0x00018000 io_new_mask: .quad 0x00018000 + +.bss + +.align 16 +.lcomm stack,STACK_SIZE IIRC, the ELF ABI defines the stack to be 8 byte aligned, but 16 certainly does not hurt. Acked-by: Christian Borntraeger
Re: [PATCH 2/4] pc-bios/s390-ccw: Provide space for initial stack frame in start.S
Am 26.06.23 um 15:21 schrieb Thomas Huth: Providing the space of a stack frame is the duty of the caller, so we should reserve 160 bytes before jumping into the main function. Otherwise the main() function might write past the stack array. While we're at it, add a proper STACK_SIZE macro for the stack size instead of using magic numbers (this is also required for the following patch). Signed-off-by: Thomas Huth --- pc-bios/s390-ccw/start.S | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/pc-bios/s390-ccw/start.S b/pc-bios/s390-ccw/start.S index d29de09cc6..29b0a9ece0 100644 --- a/pc-bios/s390-ccw/start.S +++ b/pc-bios/s390-ccw/start.S @@ -10,10 +10,12 @@ * directory. */ +#define STACK_SIZE 0x8000 + .globl _start _start: -larl%r15,stack + 0x8000 /* Set up stack */ +larl%r15,stack + STACK_SIZE - 160 /* Set up stack */ Reviewed-by: Christian Borntraeger
Re: [PATCH] pc-bios/s390-ccw/Makefile: Use -z noexecstack to silence linker warning
Am 22.06.23 um 15:08 schrieb Thomas Huth: Recent versions of ld complain when linking the s390-ccw bios: /usr/bin/ld: warning: start.o: missing .note.GNU-stack section implies executable stack /usr/bin/ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker We can silence the warning by telling the linker to mark the stack as not executable. Signed-off-by: Thomas Huth --- pc-bios/s390-ccw/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/pc-bios/s390-ccw/Makefile b/pc-bios/s390-ccw/Makefile index 2e8cc015aa..acfcd1e71a 100644 --- a/pc-bios/s390-ccw/Makefile +++ b/pc-bios/s390-ccw/Makefile @@ -55,7 +55,7 @@ config-cc.mak: Makefile $(call cc-option,-march=z900,-march=z10)) 3> config-cc.mak -include config-cc.mak -LDFLAGS += -Wl,-pie -nostdlib +LDFLAGS += -Wl,-pie -nostdlib -z noexecstack build-all: s390-ccw.img s390-netboot.img In the end this should not matter as the resulting binary is not loaded by an elf loader so this should be fine Acked-by: Christian Borntraeger
Re: s390 private runner CI job timing out
Am 06.04.23 um 14:39 schrieb Peter Maydell: On Thu, 6 Apr 2023 at 13:30, Thomas Huth wrote: The thing is: it shouldn't take that long to build QEMU and run the tests here, theoretically. Some days ago, the job was finishing in 39 minutes: https://gitlab.com/qemu-project/qemu/-/jobs/3973481571 The recent run took 74 minutes: https://gitlab.com/qemu-project/qemu/-/jobs/4066136770 That's almost a factor of two! So there is definitely something strange going on. So that 39 minute run was about 22 minutes compile, 17 minutes test. The 74 minute run was 45 minutes compile, 30 minutes test. The number of compile steps in meson was pretty much the same (10379 vs 10384) in each case. So the most plausible conclusion seems like "the VM mysteriously got slower by nearly a factor of 2", given that the slowdown seems to affect the compile and test stages about equally. The VM has been up for 44 days, so we can rule out "rebooted into a new kernel with some kind of perf bug". Looks like the team is looking into some storage delays at the moment. So it would be good to get some numbers for io wait and steal on the next run to see if this is related to storage or CPU consumption. Maybe collect /proc/stat before and after a run. Or login and use top.
Re: s390 private runner CI job timing out
Am 06.04.23 um 14:39 schrieb Peter Maydell: On Thu, 6 Apr 2023 at 13:30, Thomas Huth wrote: The thing is: it shouldn't take that long to build QEMU and run the tests here, theoretically. Some days ago, the job was finishing in 39 minutes: https://gitlab.com/qemu-project/qemu/-/jobs/3973481571 The recent run took 74 minutes: https://gitlab.com/qemu-project/qemu/-/jobs/4066136770 That's almost a factor of two! So there is definitely something strange going on. So that 39 minute run was about 22 minutes compile, 17 minutes test. The 74 minute run was 45 minutes compile, 30 minutes test. The number of compile steps in meson was pretty much the same (10379 vs 10384) in each case. So the most plausible conclusion seems like "the VM mysteriously got slower by nearly a factor of 2", given that the slowdown seems to affect the compile and test stages about equally. The VM has been up for 44 days, so we can rule out "rebooted into a new kernel with some kind of perf bug". I will ask around if we can get a higher share of the system.
Re: s390 private runner CI job timing out
Am 06.04.23 um 14:30 schrieb Thomas Huth: On 06/04/2023 14.13, Christian Borntraeger wrote: Am 06.04.23 um 14:05 schrieb Peter Maydell: On Thu, 6 Apr 2023 at 12:17, Christian Borntraeger wrote: Am 06.04.23 um 12:44 schrieb Peter Maydell: On Thu, 6 Apr 2023 at 11:40, Christian Borntraeger wrote: Am 06.04.23 um 11:21 schrieb Peter Maydell: Christian, does our S390X machine get a guaranteed amount of CPU, or does it depend on what else is running on the hardware? I think its a shared system with shared CPUs. Can you check the steal time in top or proc? If this is far too high we could ask to give you more weight for that VM. It's idle at the moment and steal time seems to be low (0.0 .. 0.3); I'll try to remember to check next time it's running a job. Do you have /proc/stat ? Yes; hopefully it means more to you than it does to me :-) linux1@qemu01:~$ cat /proc/stat cpu 60904459 604975 15052194 1435958176 17128179 351949 758578 22218760 0 0 cpu0 15022535 146734 3786909 358774818 4283172 98313 237156 5894809 0 0 cpu1 15306890 151164 3746024 358968957 4378864 85629 172492 5434255 0 0 cpu2 15307709 157180 3762691 359141276 4138714 85736 176367 5474594 0 0 cpu3 15267324 149895 3756569 359073124 4327428 82269 172562 5415101 0 0 This is user,nice,system,idle,iowait,irq,softirq,steal,guest,guest_nice So overall there is some (20-30% since the last reboot) steal going on. Not sure if this is the real problem since it is only Ubuntu 20.04. Does a higher timeout make the problem go away? The thing is: it shouldn't take that long to build QEMU and run the tests here, theoretically. Some days ago, the job was finishing in 39 minutes: https://gitlab.com/qemu-project/qemu/-/jobs/3973481571 The recent run took 74 minutes: https://gitlab.com/qemu-project/qemu/-/jobs/4066136770 That's almost a factor of two! So there is definitely something strange going on. But this has 786 instead of 659 tests, no?
Re: s390 private runner CI job timing out
Am 06.04.23 um 14:05 schrieb Peter Maydell: On Thu, 6 Apr 2023 at 12:17, Christian Borntraeger wrote: Am 06.04.23 um 12:44 schrieb Peter Maydell: On Thu, 6 Apr 2023 at 11:40, Christian Borntraeger wrote: Am 06.04.23 um 11:21 schrieb Peter Maydell: Christian, does our S390X machine get a guaranteed amount of CPU, or does it depend on what else is running on the hardware? I think its a shared system with shared CPUs. Can you check the steal time in top or proc? If this is far too high we could ask to give you more weight for that VM. It's idle at the moment and steal time seems to be low (0.0 .. 0.3); I'll try to remember to check next time it's running a job. Do you have /proc/stat ? Yes; hopefully it means more to you than it does to me :-) linux1@qemu01:~$ cat /proc/stat cpu 60904459 604975 15052194 1435958176 17128179 351949 758578 22218760 0 0 cpu0 15022535 146734 3786909 358774818 4283172 98313 237156 5894809 0 0 cpu1 15306890 151164 3746024 358968957 4378864 85629 172492 5434255 0 0 cpu2 15307709 157180 3762691 359141276 4138714 85736 176367 5474594 0 0 cpu3 15267324 149895 3756569 359073124 4327428 82269 172562 5415101 0 0 This is user,nice,system,idle,iowait,irq,softirq,steal,guest,guest_nice So overall there is some (20-30% since the last reboot) steal going on. Not sure if this is the real problem since it is only Ubuntu 20.04. Does a higher timeout make the problem go away?
Re: s390 private runner CI job timing out
Am 06.04.23 um 12:44 schrieb Peter Maydell: On Thu, 6 Apr 2023 at 11:40, Christian Borntraeger wrote: Am 06.04.23 um 11:21 schrieb Peter Maydell: Christian, does our S390X machine get a guaranteed amount of CPU, or does it depend on what else is running on the hardware? I think its a shared system with shared CPUs. Can you check the steal time in top or proc? If this is far too high we could ask to give you more weight for that VM. It's idle at the moment and steal time seems to be low (0.0 .. 0.3); I'll try to remember to check next time it's running a job. Do you have /proc/stat ?
Re: s390 private runner CI job timing out
Am 06.04.23 um 11:21 schrieb Peter Maydell: On Thu, 6 Apr 2023 at 07:57, Thomas Huth wrote: On 05/04/2023 17.15, Peter Maydell wrote: The s390 private runner CI job ubuntu-20.04-s390x-all seems to have started timing out a lot recently. Here's an example where it passed, but with only 53 seconds left on the clock before it would have been killed: https://gitlab.com/qemu-project/qemu/-/jobs/4066136770 It looks like 'make check' was about 30 minutes of the 75 minutes total, and compilation was 45 minutes. Any suggestions for how we can trim this down? (Presumably we could also raise the time limit given that this is a private runner job...) I don't have access to that system, so I can only guess: Did you check whether there are any other processes still running (leftovers from earlier test runs)? I did check for that, as it's been a problem in the past, but no, in this case no other jobs are running in the VM. If not, it's maybe because it is a VM that is running with other VMs in parallel that hog the CPU? In that case, you could contact the owner of the machine and ask whether there is anything that could be done about it. Or simply increase the timeout a little bit more... (our highest timeout in another job is 90 minutes, so I think that would still be OK?). Christian, does our S390X machine get a guaranteed amount of CPU, or does it depend on what else is running on the hardware? I think its a shared system with shared CPUs. Can you check the steal time in top or proc? If this is far too high we could ask to give you more weight for that VM.
Re: [PATCH] s390x/gdb: Split s390-virt.xml
Am 13.03.23 um 22:16 schrieb Ilya Leoshkevich: TCG emulates ckc, cputm, last_break and prefix, and it's quite useful to have them during debugging. KVM provides those as well so I dont get what you are trying to do here. (I would understand moving out the pfault things into a KVM section) So move them into the new s390-virt-tcg.xml file. pp, pfault_token, pfault_select and pfault_compare are not emulated, so keep them in s390-virt.xml. Signed-off-by: Ilya Leoshkevich --- configs/targets/s390x-linux-user.mak | 2 +- configs/targets/s390x-softmmu.mak| 2 +- gdb-xml/s390-virt-tcg.xml| 14 + gdb-xml/s390-virt.xml| 4 -- target/s390x/gdbstub.c | 82 ++-- 5 files changed, 69 insertions(+), 35 deletions(-) create mode 100644 gdb-xml/s390-virt-tcg.xml diff --git a/configs/targets/s390x-linux-user.mak b/configs/targets/s390x-linux-user.mak index e2978248ede..fb3e2b73be7 100644 --- a/configs/targets/s390x-linux-user.mak +++ b/configs/targets/s390x-linux-user.mak @@ -2,4 +2,4 @@ TARGET_ARCH=s390x TARGET_SYSTBL_ABI=common,64 TARGET_SYSTBL=syscall.tbl TARGET_BIG_ENDIAN=y -TARGET_XML_FILES= gdb-xml/s390x-core64.xml gdb-xml/s390-acr.xml gdb-xml/s390-fpr.xml gdb-xml/s390-vx.xml gdb-xml/s390-cr.xml gdb-xml/s390-virt.xml gdb-xml/s390-gs.xml +TARGET_XML_FILES= gdb-xml/s390x-core64.xml gdb-xml/s390-acr.xml gdb-xml/s390-fpr.xml gdb-xml/s390-vx.xml gdb-xml/s390-cr.xml gdb-xml/s390-virt.xml gdb-xml/s390-virt-tcg.xml gdb-xml/s390-gs.xml diff --git a/configs/targets/s390x-softmmu.mak b/configs/targets/s390x-softmmu.mak index 258b4cf3582..554330d7c85 100644 --- a/configs/targets/s390x-softmmu.mak +++ b/configs/targets/s390x-softmmu.mak @@ -1,4 +1,4 @@ TARGET_ARCH=s390x TARGET_BIG_ENDIAN=y TARGET_SUPPORTS_MTTCG=y -TARGET_XML_FILES= gdb-xml/s390x-core64.xml gdb-xml/s390-acr.xml gdb-xml/s390-fpr.xml gdb-xml/s390-vx.xml gdb-xml/s390-cr.xml gdb-xml/s390-virt.xml gdb-xml/s390-gs.xml +TARGET_XML_FILES= gdb-xml/s390x-core64.xml gdb-xml/s390-acr.xml gdb-xml/s390-fpr.xml gdb-xml/s390-vx.xml 
gdb-xml/s390-cr.xml gdb-xml/s390-virt.xml gdb-xml/s390-virt-tcg.xml gdb-xml/s390-gs.xml diff --git a/gdb-xml/s390-virt-tcg.xml b/gdb-xml/s390-virt-tcg.xml new file mode 100644 index 000..0f77c9b48c6 --- /dev/null +++ b/gdb-xml/s390-virt-tcg.xml @@ -0,0 +1,14 @@ + + + + + + + + + + diff --git a/gdb-xml/s390-virt.xml b/gdb-xml/s390-virt.xml index e2e9a7ad3cc..a79c0307682 100644 --- a/gdb-xml/s390-virt.xml +++ b/gdb-xml/s390-virt.xml @@ -7,10 +7,6 @@ - - - - diff --git a/target/s390x/gdbstub.c b/target/s390x/gdbstub.c index a5d69d0e0bc..111b695dc85 100644 --- a/target/s390x/gdbstub.c +++ b/target/s390x/gdbstub.c @@ -200,61 +200,81 @@ static int cpu_write_c_reg(CPUS390XState *env, uint8_t *mem_buf, int n) } } -/* the values represent the positions in s390-virt.xml */ -#define S390_VIRT_CKC_REGNUM0 -#define S390_VIRT_CPUTM_REGNUM 1 -#define S390_VIRT_BEA_REGNUM2 -#define S390_VIRT_PREFIX_REGNUM 3 -#define S390_VIRT_PP_REGNUM 4 -#define S390_VIRT_PFT_REGNUM5 -#define S390_VIRT_PFS_REGNUM6 -#define S390_VIRT_PFC_REGNUM7 -/* total number of registers in s390-virt.xml */ -#define S390_NUM_VIRT_REGS 8 +/* the values represent the positions in s390-virt-tcg.xml */ +#define S390_VIRT_TCG_CKC_REGNUM0 +#define S390_VIRT_TCG_CPUTM_REGNUM 1 +#define S390_VIRT_TCG_BEA_REGNUM2 +#define S390_VIRT_TCG_PREFIX_REGNUM 3 +/* total number of registers in s390-virt-tcg.xml */ +#define S390_NUM_VIRT_TCG_REGS 4 -static int cpu_read_virt_reg(CPUS390XState *env, GByteArray *mem_buf, int n) +static int cpu_read_virt_tcg_reg(CPUS390XState *env, GByteArray *mem_buf, int n) { switch (n) { -case S390_VIRT_CKC_REGNUM: +case S390_VIRT_TCG_CKC_REGNUM: return gdb_get_regl(mem_buf, env->ckc); -case S390_VIRT_CPUTM_REGNUM: +case S390_VIRT_TCG_CPUTM_REGNUM: return gdb_get_regl(mem_buf, env->cputm); -case S390_VIRT_BEA_REGNUM: +case S390_VIRT_TCG_BEA_REGNUM: return gdb_get_regl(mem_buf, env->gbea); -case S390_VIRT_PREFIX_REGNUM: +case S390_VIRT_TCG_PREFIX_REGNUM: return gdb_get_regl(mem_buf, env->psa); 
-case S390_VIRT_PP_REGNUM: -return gdb_get_regl(mem_buf, env->pp); -case S390_VIRT_PFT_REGNUM: -return gdb_get_regl(mem_buf, env->pfault_token); -case S390_VIRT_PFS_REGNUM: -return gdb_get_regl(mem_buf, env->pfault_select); -case S390_VIRT_PFC_REGNUM: -return gdb_get_regl(mem_buf, env->pfault_compare); default: return 0; } } -static int cpu_write_virt_reg(CPUS390XState *env, uint8_t *mem_buf, int n) +static int cpu_write_virt_tcg_reg(CPUS390XState *env, uint8_t *mem_buf, int n) { switch (n) { -case S390_VIRT_CKC_REGNUM: +case S390_VIRT_TCG_CKC_REGNUM: env->ckc = ldtul_p(mem_buf);
Re: Taking some time
Thank you Mark for all your work! I think we met first at a zExpo or SHARE (around 2005?) and you have always been a person that did a lot of work helping others. Very much appreciated! All the best for your future life :-) Christian Am 15.02.23 um 16:55 schrieb Mark Post: All, I'm retiring from SUSE. Today is my last day on the job. I intend to stay involved with Linux on the mainframe, including this mailing list, but I'm not sure just how involved. I'll know better after I've had some time to adjust to my new life. When I left EDS in 2007, the environment at the company had become very toxic, with layoffs every quarter of good people with decades of experience. Joining Novell/SUSE at that time was the best possible thing that could have happened to me professionally. It's been very satisfying to contribute to the creation of SUSE's products. It has literally been a dream job for me. I'm thankful to the many people that have helped me over the years. Without them, I might never have been able to make working with mainframe Linux my career for 20+ years. Mark Post -- For LINUX-390 subscribe / signoff / archive access instructions, send email to lists...@vm.marist.edu with the message: INFO LINUX-390 or visit http://www2.marist.edu/htbin/wlvindex?LINUX-390
Re: [PATCH] Makefile: allow 'make uninstall'
Am 10.01.23 um 16:13 schrieb Peter Maydell: Meson supports an "uninstall", so we can easily allow it to work by not suppressing the forwarding of it from Make to meson. We originally suppressed this because Meson's 'uninstall' has a hole in it: it will remove everything that is installed by a mechanism meson knows about, but not things installed by "custom install scripts", and there is no "custom uninstall script" mechanism. For QEMU, though, the only thing that was being installed by a custom install script was the LC_MESSAGES files handled by Meson's i18n module, and that code was fixed in Meson commit 487d45c1e5bfff0fbdb4, which is present in Meson 0.60.0 and later. Since we already require a Meson version newer than that, we're now safe to enable 'uninstall', as it will now correctly uninstall everything that was installed. Resolves: https://gitlab.com/qemu-project/qemu/-/issues/109 Signed-off-by: Peter Maydell Always missed that functionality. Thanks.
Re: [PATCH v13 0/7] s390x: CPU Topology
Am 13.12.22 um 14:50 schrieb Christian Borntraeger: Am 12.12.22 um 11:01 schrieb Pierre Morel: On 12/9/22 15:45, Cédric Le Goater wrote: On 12/8/22 10:44, Pierre Morel wrote: Hi, Implementation discussions == CPU models -- Since the S390_FEAT_CONFIGURATION_TOPOLOGY is already in the CPU model for old QEMU we could not activate it as usual from KVM but needed a KVM capability: KVM_CAP_S390_CPU_TOPOLOGY. Checking and enabling this capability enables S390_FEAT_CONFIGURATION_TOPOLOGY. Migration - Once the S390_FEAT_CONFIGURATION_TOPOLOGY is enabled in the source host the STFL(11) is provided to the guest. Since the feature is already in the CPU model of older QEMU, a migration from a new QEMU enabling the topology to an old QEMU will keep STFL(11) enabled making the guest get an exception for illegal operation as soon as it uses the PTF instruction. A VMState keeping track of the S390_FEAT_CONFIGURATION_TOPOLOGY allows to forbid the migration in such a case. Note that the VMState will be used to hold information on the topology once we implement topology change for a running guest. Topology Until we introduce bookss and drawers, polarization and dedication the topology is kept very simple and is specified uniquely by the core_id of the vCPU which is also the vCPU address. Testing === To use the QEMU patches, you will need Linux V6-rc1 or newer, or use the following Linux mainline patches: f5ecfee94493 2022-07-20 KVM: s390: resetting the Topology-Change-Report 24fe0195bc19 2022-07-20 KVM: s390: guest support for topology function 0130337ec45b 2022-07-20 KVM: s390: Cleanup ipte lock access and SIIF fac.. Currently this code is for KVM only, I have no idea if it is interesting to provide a TCG patch. If ever it will be done in another series. Documentation = To have a better understanding of the S390x CPU Topology and its implementation in QEMU you can have a look at the documentation in the last patch of this series. 
The admin will want to match the host and the guest topology, taking into account that the guest does not recognize multithreading. Consequently, two vCPU assigned to threads of the same real CPU should preferably be assigned to the same socket of the guest machine. Future developments === Two series are actively prepared: - Adding drawers, book, polarization and dedication to the vCPU. - changing the topology with a running guest Since we have time before the next QEMU 8.0 release, could you please send the whole patchset ? Having the full picture would help in taking decisions for downstream also. I am still uncertain about the usefulness of S390Topology object because, as for now, the state can be computed on the fly, the reset can be handled at the machine level directly under s390_machine_reset() and so could migration if the machine had a vmstate (not the case today but quite easy to add). So before merging anything that could be difficult to maintain and/or backport, I would prefer to see it all ! The goal of dedicating an object for topology was to ease the maintenance and portability by using the QEMU object framework. If on contrary it is a problem for maintenance or backport we surely better not use it. Any other opinion? I agree with Cedric. There is no point in a topology object. The state is calculated on the fly depending on the command line. This would change if we ever implement the PTF horizontal/vertical state. But this can then be another CPU state. So I think we could go forward with this patch as a simple patch set that allows to sets a static topology. This makes sense on its own, e.g. if you plan to pin your vCPUs to given host CPUs. Dynamic changes do come with CPU hotplug, not sure what libvirt does with new CPUs in that case during migration. I assume those are re-generated on the target with whatever topology was created on the source. So I guess this will just work, but it would be good if we could test that. 
A more fine-grained topology (drawer, book) could be added later or upfront. It does require common code and libvirt enhancements, though. Now I have discussed with Pierre and there IS a state that we want to migrate. The topology change state is a guest-wide bit that might still be set at migration time, e.g. when the topology is changed (so the bit is set) but the guest has not yet been started. Two options: 1. add a vmstate with that bit and migrate the state, or 2. always set the topology change bit after migration.
Re: [PATCH v13 0/7] s390x: CPU Topology
Am 13.12.22 um 14:57 schrieb Janis Schoetterl-Glausch: On Tue, 2022-12-13 at 14:41 +0100, Christian Borntraeger wrote: Am 12.12.22 um 11:17 schrieb Thomas Huth: On 12/12/2022 11.10, Pierre Morel wrote: On 12/12/22 10:07, Thomas Huth wrote: On 12/12/2022 09.51, Pierre Morel wrote: On 12/9/22 14:32, Thomas Huth wrote: On 08/12/2022 10.44, Pierre Morel wrote: Hi, Implementation discussions == CPU models -- Since the S390_FEAT_CONFIGURATION_TOPOLOGY is already in the CPU model for old QEMU we could not activate it as usual from KVM but needed a KVM capability: KVM_CAP_S390_CPU_TOPOLOGY. Checking and enabling this capability enables S390_FEAT_CONFIGURATION_TOPOLOGY. Migration - Once the S390_FEAT_CONFIGURATION_TOPOLOGY is enabled in the source host the STFL(11) is provided to the guest. Since the feature is already in the CPU model of older QEMU, a migration from a new QEMU enabling the topology to an old QEMU will keep STFL(11) enabled making the guest get an exception for illegal operation as soon as it uses the PTF instruction. I now thought that it is not possible to enable "ctop" on older QEMUs since they don't enable the KVM capability? ... or is it still somehow possible? What did I miss? Thomas Enabling ctop with ctop=on on old QEMU is not possible, this is right. But, if STFL(11) is enabled in the source KVM by a new QEMU, I can see that even with -ctop=off the STFL(11) is migrated to the destination. This does not make sense. The cpu model and stfle values are not migrated. This is re-created during startup depending on the command line parameters of -cpu. That's why source and host have the same command lines for -cpu. And STFLE.11 must not be set on the SOURCE if ctop is off. What about linux? I didn't look too thoroughly at it, but it caches the stfle bits, doesn't it? So if the migration succeeds, even though it should not, it will look to the guest like the facility is enabled.
That is true, but the migration should not succeed in that case unless you use an unsafe way of migrating. And the cpu model was added exactly to block those unsafe ways. One thing: -cpu host is unsafe (host-passthrough in libvirt speak). Either use host-model or a fully specified model like z14.2,ctop=on
Re: [PATCH v13 0/7] s390x: CPU Topology
Am 12.12.22 um 11:01 schrieb Pierre Morel: On 12/9/22 15:45, Cédric Le Goater wrote: On 12/8/22 10:44, Pierre Morel wrote: Hi, Implementation discussions == CPU models -- Since the S390_FEAT_CONFIGURATION_TOPOLOGY is already in the CPU model for old QEMU we could not activate it as usual from KVM but needed a KVM capability: KVM_CAP_S390_CPU_TOPOLOGY. Checking and enabling this capability enables S390_FEAT_CONFIGURATION_TOPOLOGY. Migration - Once the S390_FEAT_CONFIGURATION_TOPOLOGY is enabled in the source host the STFL(11) is provided to the guest. Since the feature is already in the CPU model of older QEMU, a migration from a new QEMU enabling the topology to an old QEMU will keep STFL(11) enabled making the guest get an exception for illegal operation as soon as it uses the PTF instruction. A VMState keeping track of the S390_FEAT_CONFIGURATION_TOPOLOGY allows to forbid the migration in such a case. Note that the VMState will be used to hold information on the topology once we implement topology change for a running guest. Topology Until we introduce bookss and drawers, polarization and dedication the topology is kept very simple and is specified uniquely by the core_id of the vCPU which is also the vCPU address. Testing === To use the QEMU patches, you will need Linux V6-rc1 or newer, or use the following Linux mainline patches: f5ecfee94493 2022-07-20 KVM: s390: resetting the Topology-Change-Report 24fe0195bc19 2022-07-20 KVM: s390: guest support for topology function 0130337ec45b 2022-07-20 KVM: s390: Cleanup ipte lock access and SIIF fac.. Currently this code is for KVM only, I have no idea if it is interesting to provide a TCG patch. If ever it will be done in another series. Documentation = To have a better understanding of the S390x CPU Topology and its implementation in QEMU you can have a look at the documentation in the last patch of this series. 
The admin will want to match the host and the guest topology, taking into account that the guest does not recognize multithreading. Consequently, two vCPU assigned to threads of the same real CPU should preferably be assigned to the same socket of the guest machine. Future developments === Two series are actively prepared: - Adding drawers, book, polarization and dedication to the vCPU. - changing the topology with a running guest Since we have time before the next QEMU 8.0 release, could you please send the whole patchset ? Having the full picture would help in taking decisions for downstream also. I am still uncertain about the usefulness of S390Topology object because, as for now, the state can be computed on the fly, the reset can be handled at the machine level directly under s390_machine_reset() and so could migration if the machine had a vmstate (not the case today but quite easy to add). So before merging anything that could be difficult to maintain and/or backport, I would prefer to see it all ! The goal of dedicating an object for topology was to ease the maintenance and portability by using the QEMU object framework. If on contrary it is a problem for maintenance or backport we surely better not use it. Any other opinion? I agree with Cedric. There is no point in a topology object. The state is calculated on the fly depending on the command line. This would change if we ever implement the PTF horizontal/vertical state. But this can then be another CPU state. So I think we could go forward with this patch as a simple patch set that allows to sets a static topology. This makes sense on its own, e.g. if you plan to pin your vCPUs to given host CPUs. Dynamic changes do come with CPU hotplug, not sure what libvirt does with new CPUs in that case during migration. I assume those are re-generated on the target with whatever topology was created on the source. So I guess this will just work, but it would be good if we could test that. 
A more fine-grained topology (drawer, book) could be added later or upfront. It does require common code and libvirt enhancements, though.
Re: [PATCH v13 0/7] s390x: CPU Topology
Am 12.12.22 um 11:17 schrieb Thomas Huth: On 12/12/2022 11.10, Pierre Morel wrote: On 12/12/22 10:07, Thomas Huth wrote: On 12/12/2022 09.51, Pierre Morel wrote: On 12/9/22 14:32, Thomas Huth wrote: On 08/12/2022 10.44, Pierre Morel wrote:

Hi,

Implementation discussions
==

CPU models
--

Since S390_FEAT_CONFIGURATION_TOPOLOGY is already in the CPU model for old QEMU, we could not activate it as usual from KVM but needed a KVM capability: KVM_CAP_S390_CPU_TOPOLOGY. Checking and enabling this capability enables S390_FEAT_CONFIGURATION_TOPOLOGY.

Migration
-

Once S390_FEAT_CONFIGURATION_TOPOLOGY is enabled on the source host, STFL(11) is provided to the guest. Since the feature is already in the CPU model of older QEMU, a migration from a new QEMU enabling the topology to an old QEMU will keep STFL(11) enabled, making the guest get an exception for an illegal operation as soon as it uses the PTF instruction.

I now thought that it is not possible to enable "ctop" on older QEMUs since they don't enable the KVM capability? ... or is it still somehow possible? What did I miss? Thomas

Enabling ctop with ctop=on on old QEMU is not possible, this is right. But if STFL(11) is enabled in the source KVM by a new QEMU, I can see that even with ctop=off the STFL(11) is migrated to the destination.

This does not make sense. The CPU model and STFLE values are not migrated; they are re-created during startup depending on the command line parameters of -cpu. That's why source and destination have the same command lines for -cpu. And STFLE.11 must not be set on the SOURCE if ctop is off.

Is this with the "host" CPU model or another one? And did you explicitly specify "ctop=off" at the command line, or are you just using the default setting by not specifying it?

With explicit cpumodel and using ctop=off like in

sudo /usr/local/bin/qemu-system-s390x_master \
 -m 512M \
 -enable-kvm -smp 4,sockets=4,cores=2,maxcpus=8 \
 -cpu z14,ctop=off \
 -machine s390-ccw-virtio-7.2,accel=kvm \
 ...

Ok ...
that sounds like a bug somewhere in your patches or in the kernel code ... the guest should never see STFL bit 11 if ctop=off, should it? Correct. If ctop=off then QEMU should disable STFLE.11 for the CPU model.
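One way to check what the guest actually observes is the facility list that Linux exposes in /proc/cpuinfo on s390x: if STFLE bit 11 shows up there despite ctop=off, the bit leaked through. A minimal parser sketch, run here on a hypothetical sample line so it works on any host (illustrative only, not part of the patches):

```python
def has_facility(cpuinfo_text: str, bit: int) -> bool:
    """Return True if the given STFLE bit appears in a /proc/cpuinfo
    'facilities' line (s390x format: 'facilities : 0 1 2 ...')."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("facilities"):
            bits = {int(b) for b in line.split(":", 1)[1].split()}
            return bit in bits
    return False

# Hypothetical sample; on a real guest, read open('/proc/cpuinfo').read()
sample = "facilities : 0 1 2 3 7 9 11 16 17"
print(has_facility(sample, 11))  # -> True (the sample line contains bit 11)
```

With ctop=off on both ends of the migration, bit 11 should never be reported; seeing True in that setup would reproduce the bug discussed here.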
Re: [PATCH v13 4/7] s390x/cpu_topology: CPU topology migration
Am 08.12.22 um 10:44 schrieb Pierre Morel: The migration can only take place if source and destination both use, or both do not use, the CPU topology facility. We indicate a change in topology during migration postload for the case the topology changed between source and destination.

I don't get why we need this. If the target QEMU has topology, it should already create this according to the configuration. Why do we need a trigger?

Signed-off-by: Pierre Morel
---
 target/s390x/cpu.h        |  1 +
 hw/s390x/cpu-topology.c   | 49 +++
 target/s390x/cpu-sysemu.c |  8 +++
 3 files changed, 58 insertions(+)

diff --git a/target/s390x/cpu.h b/target/s390x/cpu.h
index bc1a7de932..284c708a6c 100644
--- a/target/s390x/cpu.h
+++ b/target/s390x/cpu.h
@@ -854,6 +854,7 @@ void s390_do_cpu_set_diag318(CPUState *cs, run_on_cpu_data arg);
 int s390_assign_subch_ioeventfd(EventNotifier *notifier, uint32_t sch_id,
                                 int vq, bool assign);
 void s390_cpu_topology_reset(void);
+int s390_cpu_topology_mtcr_set(void);
 #ifndef CONFIG_USER_ONLY
 unsigned int s390_cpu_set_state(uint8_t cpu_state, S390CPU *cpu);
 #else
diff --git a/hw/s390x/cpu-topology.c b/hw/s390x/cpu-topology.c
index f54afcf550..8a2fe041d4 100644
--- a/hw/s390x/cpu-topology.c
+++ b/hw/s390x/cpu-topology.c
@@ -18,6 +18,7 @@
 #include "target/s390x/cpu.h"
 #include "hw/s390x/s390-virtio-ccw.h"
 #include "hw/s390x/cpu-topology.h"
+#include "migration/vmstate.h"

 /**
  * s390_has_topology
@@ -129,6 +130,53 @@ static void s390_topology_reset(DeviceState *dev)
     s390_cpu_topology_reset();
 }

+/**
+ * cpu_topology_postload
+ * @opaque: a pointer to the S390Topology
+ * @version_id: version identifier
+ *
+ * We check that the topology is used or is not used
+ * on both sides identically.
+ *
+ * If the topology is in use we set the Modified Topology Change Report
+ * on the destination host.
+ */
+static int cpu_topology_postload(void *opaque, int version_id)
+{
+    int ret;
+
+    /* We do not support CPU Topology, all is good */
+    if (!s390_has_topology()) {
+        return 0;
+    }
+
+    /* We support CPU Topology, set the MTCR unconditionally */
+    ret = s390_cpu_topology_mtcr_set();
+    if (ret) {
+        error_report("Failed to set MTCR: %s", strerror(-ret));
+    }
+    return ret;
+}
+
+/**
+ * cpu_topology_needed:
+ * @opaque: The pointer to the S390Topology
+ *
+ * We always need to know if source and destination use the topology.
+ */
+static bool cpu_topology_needed(void *opaque)
+{
+    return s390_has_topology();
+}
+
+const VMStateDescription vmstate_cpu_topology = {
+    .name = "cpu_topology",
+    .version_id = 1,
+    .post_load = cpu_topology_postload,
+    .minimum_version_id = 1,
+    .needed = cpu_topology_needed,
+};
+
 /**
  * topology_class_init:
  * @oc: Object class
@@ -145,6 +193,7 @@ static void topology_class_init(ObjectClass *oc, void *data)
     device_class_set_props(dc, s390_topology_properties);
     set_bit(DEVICE_CATEGORY_MISC, dc->categories);
     dc->reset = s390_topology_reset;
+    dc->vmsd = &vmstate_cpu_topology;
 }

 static const TypeInfo cpu_topology_info = {
diff --git a/target/s390x/cpu-sysemu.c b/target/s390x/cpu-sysemu.c
index e27864c5f5..a8e3e6219d 100644
--- a/target/s390x/cpu-sysemu.c
+++ b/target/s390x/cpu-sysemu.c
@@ -319,3 +319,11 @@ void s390_cpu_topology_reset(void)
         }
     }
 }
+
+int s390_cpu_topology_mtcr_set(void)
+{
+    if (kvm_enabled()) {
+        return kvm_s390_topology_set_mtcr(1);
+    }
+    return -ENOENT;
+}
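The intended effect of the vmstate subsection can be sketched with a toy model (illustrative Python, not QEMU code): the subsection is only emitted when the source uses the topology facility, a destination that does not know the subsection aborts the migration, and a topology-aware destination sets MTCR in post_load so the guest re-reads the topology after migration:

```python
def migrate(source_has_topology: bool, dest_has_topology: bool) -> str:
    """Toy model of subsection-gated migration (not QEMU's real machinery)."""
    # cpu_topology_needed(): the subsection is only sent when the
    # source actually uses the topology facility.
    subsection_sent = source_has_topology
    if not subsection_sent:
        return "ok"            # nothing topology-related to check
    if not dest_has_topology:
        return "fail"          # unknown subsection -> migration aborts
    # cpu_topology_postload(): the destination sets MTCR unconditionally
    # so the guest notices a (possibly changed) topology.
    return "ok, MTCR set"

print(migrate(True, False))    # -> 'fail'
print(migrate(True, True))     # -> 'ok, MTCR set'
```

This also illustrates the question raised in the reply: the "trigger" is not about recreating the topology (the command line does that) but about forcing the guest to re-query it via MTCR.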
Re: [PATCH 1/1] qemu-iotests/stream-under-throttle: do not shutdown QEMU
Am 07.12.22 um 14:14 schrieb Christian Borntraeger: Without a kernel or boot disk, a QEMU on s390 will exit (usually with a disabled wait state). This breaks the stream-under-throttle test case. Do not exit qemu if on s390.

Signed-off-by: Christian Borntraeger
---
 tests/qemu-iotests/tests/stream-under-throttle | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tests/qemu-iotests/tests/stream-under-throttle b/tests/qemu-iotests/tests/stream-under-throttle
index 8d2d9e16840d..c24dfbcaa2f2 100755
--- a/tests/qemu-iotests/tests/stream-under-throttle
+++ b/tests/qemu-iotests/tests/stream-under-throttle
@@ -88,6 +88,8 @@ class TestStreamWithThrottle(iotests.QMPTestCase):
                                  'x-iops-total=1,x-bps-total=104857600')
         self.vm.add_blockdev(self.vm.qmp_to_opts(blockdev))
         self.vm.add_device('virtio-blk,iothread=iothr0,drive=throttled-node')
+        if iotests.qemu_default_machine == 's390-ccw-virtio':
+            self.vm.add_args('-no-shutdown')
         self.vm.launch()

     def tearDown(self) -> None:

Ping. I guess this will come after the release?
Re: z/OS equivalent to z/VM's Link Aggregation
Am 08.12.22 um 17:44 schrieb Jim Elliott: I am sure that z/OS must provide a function equivalent to z/VM's Link Aggregation (where you can share physical OSA ports in a VSwitch), but lots of Google searches are not helping me. Any suggestions?

Link aggregation is on Layer 2 (Ethernet) and used by Linux/z/VM, while z/OS talks to OSA in Layer 3. So I guess the equivalent would be VIPA.

--
For IBM-MAIN subscribe / signoff / archive access instructions, send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN
Re: [PATCH v3 02/13] tcg/s390x: Remove TCG_REG_TB
Am 07.12.22 um 21:40 schrieb Ilya Leoshkevich: On Wed, 2022-12-07 at 08:55 -0600, Richard Henderson wrote: On 12/7/22 01:45, Thomas Huth wrote: On 06/12/2022 23.22, Richard Henderson wrote: On 12/6/22 13:29, Ilya Leoshkevich wrote:

This change doesn't seem to affect that, but what is the minimum supported s390x qemu host? z900?

Possibly z990, if I'm reading the gcc processor_flags_table[] correctly; the long-displacement facility is definitely a minimum. We probably should revisit what the minimum for TCG should be, assert those features at startup, and drop the corresponding runtime tests.

If we consider the official IBM support statement: https://www.ibm.com/support/pages/system/files/inline-files/IBM%20Mainframe%20Life%20Cycle%20History%20V2.10%20-%20Sept%2013%202022_1.pdf ... that would mean that the z10 and all older machines are not supported anymore.

Thanks for the pointer. It would appear that z114 exits support at the end of this month, which would leave z12 as the minimum supported CPU. Even assuming z196 gets us extended-immediate, general-insn-extension, load-on-condition, and distinct-operands, which are all quite important to TCG, and constitute almost all of the current runtime checks. The other metric would be matching the set of supported CPUs from the set of supported OS distributions, but I would be ready to believe z196 is below the minimum there too. r~

I think it should be safe to raise the minimum required hardware for TCG to z196: We recently raised the minimum hardware for the Linux kernel to be z10, so that would be super-safe, but z196 is certainly a sane minimum.
* The oldest supported RHEL is v7, it requires z196:
  https://access.redhat.com/product-life-cycles/
  https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/installation_guide/chap-installation-planning-s390
* The oldest supported SLES is v12, it requires z196:
  https://www.suse.com/de-de/lifecycle/
  https://documentation.suse.com/sles/12-SP5/html/SLES-all/cha-zseries.html
* The oldest supported Ubuntu is v16.04, it requires zEC12+:
  https://wiki.ubuntu.com/S390X

Best regards,
Ilya
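The conclusion follows from taking the weakest requirement across the still-supported distros: any of them must keep running under TCG, so the oldest distro minimum wins. A throwaway sketch of that reasoning (the generation ordering and table are transcribed from this thread, not from an official source):

```python
# Machine generations in release order (illustrative subset).
GEN_ORDER = ["z10", "z114", "z196", "zEC12", "z13", "z14"]

# Minimum machine required by the oldest supported release of each
# distro, as listed in the thread.
DISTRO_MINIMUM = {
    "RHEL 7": "z196",
    "SLES 12": "z196",
    "Ubuntu 16.04": "zEC12",
}

def tcg_safe_minimum(mins: dict) -> str:
    # TCG may only assume hardware that *every* supported distro still
    # runs on, i.e. the oldest (smallest) of the distro minimums.
    return min(mins.values(), key=GEN_ORDER.index)

print(tcg_safe_minimum(DISTRO_MINIMUM))  # -> z196
```

With this table the answer is z196, matching the proposal above to raise the TCG minimum to z196.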
Re: [PATCH 1/1] qemu-iotests/stream-under-throttle: do not shutdown QEMU
Am 07.12.22 um 14:23 schrieb Thomas Huth: On 07/12/2022 14.14, Christian Borntraeger wrote: Without a kernel or boot disk, a QEMU on s390 will exit (usually with a disabled wait state). This breaks the stream-under-throttle test case. Do not exit qemu if on s390.

Signed-off-by: Christian Borntraeger
---
 tests/qemu-iotests/tests/stream-under-throttle | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tests/qemu-iotests/tests/stream-under-throttle b/tests/qemu-iotests/tests/stream-under-throttle
index 8d2d9e16840d..c24dfbcaa2f2 100755
--- a/tests/qemu-iotests/tests/stream-under-throttle
+++ b/tests/qemu-iotests/tests/stream-under-throttle
@@ -88,6 +88,8 @@ class TestStreamWithThrottle(iotests.QMPTestCase):
                                  'x-iops-total=1,x-bps-total=104857600')
         self.vm.add_blockdev(self.vm.qmp_to_opts(blockdev))
         self.vm.add_device('virtio-blk,iothread=iothr0,drive=throttled-node')
+        if iotests.qemu_default_machine == 's390-ccw-virtio':
+            self.vm.add_args('-no-shutdown')
         self.vm.launch()

I guess you could even add that unconditionally for all architectures?

Maybe. It might even fix other architectures with the same problem, but I don't know if that's the case. So we can start with this fix and then remove the if at a later point in time if necessary/useful. Anyway:

Reviewed-by: Thomas Huth
[PATCH 1/1] qemu-iotests/stream-under-throttle: do not shutdown QEMU
Without a kernel or boot disk a QEMU on s390 will exit (usually with a disabled wait state). This breaks the stream-under-throttle test case. Do not exit qemu if on s390.

Signed-off-by: Christian Borntraeger
---
 tests/qemu-iotests/tests/stream-under-throttle | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tests/qemu-iotests/tests/stream-under-throttle b/tests/qemu-iotests/tests/stream-under-throttle
index 8d2d9e16840d..c24dfbcaa2f2 100755
--- a/tests/qemu-iotests/tests/stream-under-throttle
+++ b/tests/qemu-iotests/tests/stream-under-throttle
@@ -88,6 +88,8 @@ class TestStreamWithThrottle(iotests.QMPTestCase):
                                  'x-iops-total=1,x-bps-total=104857600')
         self.vm.add_blockdev(self.vm.qmp_to_opts(blockdev))
         self.vm.add_device('virtio-blk,iothread=iothr0,drive=throttled-node')
+        if iotests.qemu_default_machine == 's390-ccw-virtio':
+            self.vm.add_args('-no-shutdown')
         self.vm.launch()

     def tearDown(self) -> None:
--
2.38.1
Re: [PATCH v2] tests/stream-under-throttle: New test
Am 07.12.22 um 13:56 schrieb Christian Borntraeger: Am 07.12.22 um 11:31 schrieb Christian Borntraeger: Am 14.11.22 um 10:52 schrieb Hanna Reitz: Test streaming a base image into the top image underneath two throttle nodes. This was reported to make qemu 7.1 hang (https://gitlab.com/qemu-project/qemu/-/issues/1215), so this serves as a regression test. Signed-off-by: Hanna Reitz --- Based-on: <20221107151321.211175-1-hre...@redhat.com> v1: https://lists.nongnu.org/archive/html/qemu-block/2022-11/msg00368.html v2: - Replace `asyncio.exceptions.TimeoutError` by `asyncio.TimeoutError`: Stefan reported that the CI does not recognize the former: https://lists.nongnu.org/archive/html/qemu-block/2022-11/msg00424.html As far as I understand, the latter was basically moved to become the former in Python 3.11, and an alias remains, so both are basically equivalent. I only have 3.10, though, where the documentation says that both are different, even though using either seems to work fine (i.e. both catch the timeout there). Not sure about previous versions, but the CI seems pretty certain about not knowing `asyncio.exceptions.TimeoutError`, so use `asyncio.TimeoutError` instead. (Even though that is deprecated in 3.11, but this is not the first place in the tree to use it, so it should not be too bad.) --- .../qemu-iotests/tests/stream-under-throttle | 121 ++ .../tests/stream-under-throttle.out | 5 + 2 files changed, 126 insertions(+) create mode 100755 tests/qemu-iotests/tests/stream-under-throttle create mode 100644 tests/qemu-iotests/tests/stream-under-throttle.out As a heads up, I do get the following on s390. I have not yet looked into that: +EE +== +ERROR: test_stream (__main__.TestStreamWithThrottle) +Do a simple stream beneath the two throttle nodes. 
Should complete +-- +Traceback (most recent call last): + File "qemu/tests/qemu-iotests/tests/stream-under-throttle", line 110, in test_stream + self.vm.run_job('stream') + File "qemu/tests/qemu-iotests/iotests.py", line 986, in run_job + result = self.qmp('query-jobs') + File "qemu/python/qemu/machine/machine.py", line 646, in qmp + ret = self._qmp.cmd(cmd, args=qmp_args) + File "qemu/python/qemu/qmp/legacy.py", line 204, in cmd + return self.cmd_obj(qmp_cmd) + File "qemu/python/qemu/qmp/legacy.py", line 184, in cmd_obj + self._qmp._raw(qmp_cmd, assign_id=False), + File "qemu/python/qemu/qmp/protocol.py", line 154, in _wrapper + raise StateError(emsg, proto.runstate, required_state) +qemu.qmp.protocol.StateError: QMPClient is disconnecting. Call disconnect() to return to IDLE state. + +== +ERROR: test_stream (__main__.TestStreamWithThrottle) +Do a simple stream beneath the two throttle nodes. Should complete +-- +Traceback (most recent call last): + File "qemu/python/qemu/machine/machine.py", line 533, in _soft_shutdown + self.qmp('quit') + File "qemu/python/qemu/machine/machine.py", line 646, in qmp + ret = self._qmp.cmd(cmd, args=qmp_args) + File "qemu/python/qemu/qmp/legacy.py", line 204, in cmd + return self.cmd_obj(qmp_cmd) + File "qemu/python/qemu/qmp/legacy.py", line 184, in cmd_obj + self._qmp._raw(qmp_cmd, assign_id=False), + File "qemu/python/qemu/qmp/protocol.py", line 154, in _wrapper + raise StateError(emsg, proto.runstate, required_state) +qemu.qmp.protocol.StateError: QMPClient is disconnecting. Call disconnect() to return to IDLE state. 
+ +During handling of the above exception, another exception occurred: + +Traceback (most recent call last): + File "qemu/python/qemu/machine/machine.py", line 554, in _do_shutdown + self._soft_shutdown(timeout) + File "qemu/python/qemu/machine/machine.py", line 536, in _soft_shutdown + self._close_qmp_connection() + File "qemu/python/qemu/machine/machine.py", line 476, in _close_qmp_connection + self._qmp.close() + File "qemu/python/qemu/qmp/legacy.py", line 277, in close + self._sync( + File "qemu/python/qemu/qmp/legacy.py", line 94, in _sync + return self._aloop.run_until_complete( + File "/usr/lib64/python3.10/asyncio/base_events.py", line 649, in run_until_complete + return future.result() + File "/usr/lib64/python3.10/asyncio/tasks.py", line 408, in wait_for + return await fut + File "qemu/python/qemu/qmp/protocol.py", line 398, in disconnect + await self._wait_disconnect() + File "qemu/python/qemu/qmp/protocol.py", line 710, in _wait_disconnect + await al
Re: [PATCH v2] tests/stream-under-throttle: New test
Am 07.12.22 um 11:31 schrieb Christian Borntraeger: Am 14.11.22 um 10:52 schrieb Hanna Reitz: Test streaming a base image into the top image underneath two throttle nodes. This was reported to make qemu 7.1 hang (https://gitlab.com/qemu-project/qemu/-/issues/1215), so this serves as a regression test. Signed-off-by: Hanna Reitz --- Based-on: <20221107151321.211175-1-hre...@redhat.com> v1: https://lists.nongnu.org/archive/html/qemu-block/2022-11/msg00368.html v2: - Replace `asyncio.exceptions.TimeoutError` by `asyncio.TimeoutError`: Stefan reported that the CI does not recognize the former: https://lists.nongnu.org/archive/html/qemu-block/2022-11/msg00424.html As far as I understand, the latter was basically moved to become the former in Python 3.11, and an alias remains, so both are basically equivalent. I only have 3.10, though, where the documentation says that both are different, even though using either seems to work fine (i.e. both catch the timeout there). Not sure about previous versions, but the CI seems pretty certain about not knowing `asyncio.exceptions.TimeoutError`, so use `asyncio.TimeoutError` instead. (Even though that is deprecated in 3.11, but this is not the first place in the tree to use it, so it should not be too bad.) --- .../qemu-iotests/tests/stream-under-throttle | 121 ++ .../tests/stream-under-throttle.out | 5 + 2 files changed, 126 insertions(+) create mode 100755 tests/qemu-iotests/tests/stream-under-throttle create mode 100644 tests/qemu-iotests/tests/stream-under-throttle.out As a heads up, I do get the following on s390. I have not yet looked into that: +EE +== +ERROR: test_stream (__main__.TestStreamWithThrottle) +Do a simple stream beneath the two throttle nodes. 
Should complete +-- +Traceback (most recent call last): + File "qemu/tests/qemu-iotests/tests/stream-under-throttle", line 110, in test_stream + self.vm.run_job('stream') + File "qemu/tests/qemu-iotests/iotests.py", line 986, in run_job + result = self.qmp('query-jobs') + File "qemu/python/qemu/machine/machine.py", line 646, in qmp + ret = self._qmp.cmd(cmd, args=qmp_args) + File "qemu/python/qemu/qmp/legacy.py", line 204, in cmd + return self.cmd_obj(qmp_cmd) + File "qemu/python/qemu/qmp/legacy.py", line 184, in cmd_obj + self._qmp._raw(qmp_cmd, assign_id=False), + File "qemu/python/qemu/qmp/protocol.py", line 154, in _wrapper + raise StateError(emsg, proto.runstate, required_state) +qemu.qmp.protocol.StateError: QMPClient is disconnecting. Call disconnect() to return to IDLE state. + +== +ERROR: test_stream (__main__.TestStreamWithThrottle) +Do a simple stream beneath the two throttle nodes. Should complete +-- +Traceback (most recent call last): + File "qemu/python/qemu/machine/machine.py", line 533, in _soft_shutdown + self.qmp('quit') + File "qemu/python/qemu/machine/machine.py", line 646, in qmp + ret = self._qmp.cmd(cmd, args=qmp_args) + File "qemu/python/qemu/qmp/legacy.py", line 204, in cmd + return self.cmd_obj(qmp_cmd) + File "qemu/python/qemu/qmp/legacy.py", line 184, in cmd_obj + self._qmp._raw(qmp_cmd, assign_id=False), + File "qemu/python/qemu/qmp/protocol.py", line 154, in _wrapper + raise StateError(emsg, proto.runstate, required_state) +qemu.qmp.protocol.StateError: QMPClient is disconnecting. Call disconnect() to return to IDLE state. 
+ +During handling of the above exception, another exception occurred: + +Traceback (most recent call last): + File "qemu/python/qemu/machine/machine.py", line 554, in _do_shutdown + self._soft_shutdown(timeout) + File "qemu/python/qemu/machine/machine.py", line 536, in _soft_shutdown + self._close_qmp_connection() + File "qemu/python/qemu/machine/machine.py", line 476, in _close_qmp_connection + self._qmp.close() + File "qemu/python/qemu/qmp/legacy.py", line 277, in close + self._sync( + File "qemu/python/qemu/qmp/legacy.py", line 94, in _sync + return self._aloop.run_until_complete( + File "/usr/lib64/python3.10/asyncio/base_events.py", line 649, in run_until_complete + return future.result() + File "/usr/lib64/python3.10/asyncio/tasks.py", line 408, in wait_for + return await fut + File "qemu/python/qemu/qmp/protocol.py", line 398, in disconnect + await self._wait_disconnect() + File "qemu/python/qemu/qmp/protocol.py", line 710, in _wait_disconnect + await all_defined_tasks # Raise Exceptions from the bottom half. +
Re: [PATCH v2] tests/stream-under-throttle: New test
Am 14.11.22 um 10:52 schrieb Hanna Reitz: Test streaming a base image into the top image underneath two throttle nodes. This was reported to make qemu 7.1 hang (https://gitlab.com/qemu-project/qemu/-/issues/1215), so this serves as a regression test. Signed-off-by: Hanna Reitz --- Based-on: <20221107151321.211175-1-hre...@redhat.com> v1: https://lists.nongnu.org/archive/html/qemu-block/2022-11/msg00368.html v2: - Replace `asyncio.exceptions.TimeoutError` by `asyncio.TimeoutError`: Stefan reported that the CI does not recognize the former: https://lists.nongnu.org/archive/html/qemu-block/2022-11/msg00424.html As far as I understand, the latter was basically moved to become the former in Python 3.11, and an alias remains, so both are basically equivalent. I only have 3.10, though, where the documentation says that both are different, even though using either seems to work fine (i.e. both catch the timeout there). Not sure about previous versions, but the CI seems pretty certain about not knowing `asyncio.exceptions.TimeoutError`, so use `asyncio.TimeoutError` instead. (Even though that is deprecated in 3.11, but this is not the first place in the tree to use it, so it should not be too bad.) --- .../qemu-iotests/tests/stream-under-throttle | 121 ++ .../tests/stream-under-throttle.out | 5 + 2 files changed, 126 insertions(+) create mode 100755 tests/qemu-iotests/tests/stream-under-throttle create mode 100644 tests/qemu-iotests/tests/stream-under-throttle.out As a heads up, I do get the following on s390. I have not yet looked into that: +EE +== +ERROR: test_stream (__main__.TestStreamWithThrottle) +Do a simple stream beneath the two throttle nodes. 
Should complete +-- +Traceback (most recent call last): + File "qemu/tests/qemu-iotests/tests/stream-under-throttle", line 110, in test_stream +self.vm.run_job('stream') + File "qemu/tests/qemu-iotests/iotests.py", line 986, in run_job +result = self.qmp('query-jobs') + File "qemu/python/qemu/machine/machine.py", line 646, in qmp +ret = self._qmp.cmd(cmd, args=qmp_args) + File "qemu/python/qemu/qmp/legacy.py", line 204, in cmd +return self.cmd_obj(qmp_cmd) + File "qemu/python/qemu/qmp/legacy.py", line 184, in cmd_obj +self._qmp._raw(qmp_cmd, assign_id=False), + File "qemu/python/qemu/qmp/protocol.py", line 154, in _wrapper +raise StateError(emsg, proto.runstate, required_state) +qemu.qmp.protocol.StateError: QMPClient is disconnecting. Call disconnect() to return to IDLE state. + +== +ERROR: test_stream (__main__.TestStreamWithThrottle) +Do a simple stream beneath the two throttle nodes. Should complete +-- +Traceback (most recent call last): + File "qemu/python/qemu/machine/machine.py", line 533, in _soft_shutdown +self.qmp('quit') + File "qemu/python/qemu/machine/machine.py", line 646, in qmp +ret = self._qmp.cmd(cmd, args=qmp_args) + File "qemu/python/qemu/qmp/legacy.py", line 204, in cmd +return self.cmd_obj(qmp_cmd) + File "qemu/python/qemu/qmp/legacy.py", line 184, in cmd_obj +self._qmp._raw(qmp_cmd, assign_id=False), + File "qemu/python/qemu/qmp/protocol.py", line 154, in _wrapper +raise StateError(emsg, proto.runstate, required_state) +qemu.qmp.protocol.StateError: QMPClient is disconnecting. Call disconnect() to return to IDLE state. 
+ +During handling of the above exception, another exception occurred: + +Traceback (most recent call last): + File "qemu/python/qemu/machine/machine.py", line 554, in _do_shutdown +self._soft_shutdown(timeout) + File "qemu/python/qemu/machine/machine.py", line 536, in _soft_shutdown +self._close_qmp_connection() + File "qemu/python/qemu/machine/machine.py", line 476, in _close_qmp_connection +self._qmp.close() + File "qemu/python/qemu/qmp/legacy.py", line 277, in close +self._sync( + File "qemu/python/qemu/qmp/legacy.py", line 94, in _sync +return self._aloop.run_until_complete( + File "/usr/lib64/python3.10/asyncio/base_events.py", line 649, in run_until_complete +return future.result() + File "/usr/lib64/python3.10/asyncio/tasks.py", line 408, in wait_for +return await fut + File "qemu/python/qemu/qmp/protocol.py", line 398, in disconnect +await self._wait_disconnect() + File "qemu/python/qemu/qmp/protocol.py", line 710, in _wait_disconnect +await all_defined_tasks # Raise Exceptions from the bottom half. + File "qemu/python/qemu/qmp/protocol.py", line 861, in _bh_loop_forever +await async_fn() + File "qemu/python/qemu/qmp/protocol.py", line 899, in _bh_recv_message +msg = await self._recv() + File "qemu/python/qemu/qmp/protocol.py", line 1000, in _recv +
Re: qemu iotest 161 and make check
Am 27.10.22 um 07:54 schrieb Christian Borntraeger: [...] diff --git a/tests/qemu-iotests/common.qemu b/tests/qemu-iotests/common.qemu index 0f1fecc68e..01bdb05575 100644 --- a/tests/qemu-iotests/common.qemu +++ b/tests/qemu-iotests/common.qemu @@ -388,7 +388,7 @@ _cleanup_qemu() kill -KILL ${QEMU_PID} 2>/dev/null fi if [ -n "${QEMU_PID}" ]; then - wait ${QEMU_PID} 2>/dev/null # silent kill + wait 2>/dev/null # silent kill fi fi And this also helps. Still trying to find out what clone/fork happens here. As new information: the problem only exists on Ubuntu; I cannot reproduce it with Fedora or RHEL. I also changed the kernel; it's not the reason. As soon as I add tracing, the different timing makes the problem go away.
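The change above drops the PID argument from `wait`. A minimal sketch (not the QEMU test itself) of why that matters: POSIX `wait PID` returns 127 when the PID is not a child of the invoking shell, e.g. when the process was forked off inside a subshell, while a bare `wait` simply reaps whatever children the shell actually has. The temp-file plumbing below is only illustration.

```shell
#!/bin/sh
# Process started inside a subshell: its parent is the subshell,
# not this script, so this script cannot `wait` on it by PID.
pidfile=$(mktemp)
( sleep 0.2 & echo $! > "$pidfile" )
QEMU_PID=$(cat "$pidfile"); rm -f "$pidfile"

wait "$QEMU_PID" 2>/dev/null   # not a child of this shell
rc_pid=$?

sleep 0.2 &                    # a real child of this shell
wait                           # bare wait: reaps our own children
rc_bare=$?

echo "wait \$PID -> $rc_pid, bare wait -> $rc_bare"
```

Under this reading, the bare `wait` sidesteps the question of which shell actually forked QEMU, which would fit the clone/fork timing dependence described above.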
Re: [PATCH] virtiofsd: Add `sigreturn` to the seccomp whitelist
Am 29.11.22 um 10:42 schrieb Dr. David Alan Gilbert: * Marc Hartmayer (mhart...@linux.ibm.com) wrote: "Dr. David Alan Gilbert" writes: * Marc Hartmayer (mhart...@linux.ibm.com) wrote: The virtiofsd currently crashes on s390x. This is because of a `sigreturn` system call. See audit log below: type=SECCOMP msg=audit(1669382477.611:459): auid=4294967295 uid=0 gid=0 ses=4294967295 subj=system_u:system_r:virtd_t:s0-s0:c0.c1023 pid=6649 comm="virtiofsd" exe="/usr/libexec/virtiofsd" sig=31 arch=8016 syscall=119 compat=0 ip=0x3fff15f748a code=0x8000AUID="unset" UID="root" GID="root" ARCH=s390x SYSCALL=sigreturn I'm curious; doesn't that mean that some signal is being delivered and you're returning? Which one? code=0x8000 means that the seccomp action SECCOMP_RET_KILL_PROCESS is taken => process is killed by a SIGSYS signal (31) [1]. At least, that’s my understanding of this log message. [1] https://man7.org/linux/man-pages/man2/seccomp.2.html But isn't that the fallout rather than the cause ? i.e. seccomp is sending a SIGSYS because the process used sigreturn, my question is why did the process call sigreturn in the first place - it must have received a signal to return from? Good question. virtiofsd seems to prepare itself for int fuse_set_signal_handlers(struct fuse_session *se) { /* * If we used SIG_IGN instead of the do_nothing function, * then we would be unable to tell if we set SIG_IGN (and * thus should reset to SIG_DFL in fuse_remove_signal_handlers) * or if it was already set to SIG_IGN (and should be left * untouched. */ if (set_one_signal_handler(SIGHUP, exit_handler, 0) == -1 || set_one_signal_handler(SIGINT, exit_handler, 0) == -1 || set_one_signal_handler(SIGTERM, exit_handler, 0) == -1 || set_one_signal_handler(SIGPIPE, do_nothing, 0) == -1) { return -1; } Given that rt_sigreturn was already on the seccomp list it seems to be expected that those handlers are called.
Re: [PATCH] virtiofsd: Add `sigreturn` to the seccomp whitelist
Am 29.11.22 um 10:52 schrieb Christian Borntraeger: Am 29.11.22 um 10:42 schrieb Dr. David Alan Gilbert: * Marc Hartmayer (mhart...@linux.ibm.com) wrote: "Dr. David Alan Gilbert" writes: * Marc Hartmayer (mhart...@linux.ibm.com) wrote: The virtiofsd currently crashes on s390x. This is because of a `sigreturn` system call. See audit log below: type=SECCOMP msg=audit(1669382477.611:459): auid=4294967295 uid=0 gid=0 ses=4294967295 subj=system_u:system_r:virtd_t:s0-s0:c0.c1023 pid=6649 comm="virtiofsd" exe="/usr/libexec/virtiofsd" sig=31 arch=8016 syscall=119 compat=0 ip=0x3fff15f748a code=0x8000AUID="unset" UID="root" GID="root" ARCH=s390x SYSCALL=sigreturn I'm curious; doesn't that mean that some signal is being delivered and you're returning? Which one? code=0x8000 means that the seccomp action SECCOMP_RET_KILL_PROCESS is taken => process is killed by a SIGSYS signal (31) [1]. At least, that’s my understanding of this log message. [1] https://man7.org/linux/man-pages/man2/seccomp.2.html But isn't that the fallout rather than the cause ? i.e. seccomp is sending a SIGSYS because the process used sigreturn, my question is why did the process call sigreturn in the first place - it must have received a signal to return from? Good question. virtiofsd seems to prepare itself for int fuse_set_signal_handlers(struct fuse_session *se) { /* * If we used SIG_IGN instead of the do_nothing function, * then we would be unable to tell if we set SIG_IGN (and * thus should reset to SIG_DFL in fuse_remove_signal_handlers) * or if it was already set to SIG_IGN (and should be left * untouched. */ if (set_one_signal_handler(SIGHUP, exit_handler, 0) == -1 || set_one_signal_handler(SIGINT, exit_handler, 0) == -1 || set_one_signal_handler(SIGTERM, exit_handler, 0) == -1 || set_one_signal_handler(SIGPIPE, do_nothing, 0) == -1) { return -1; } Given that rt_sigreturn was already on the seccomp list it seems to be expected that those handlers are called. 
For me, it seems to happen on shutdown: Stack trace of thread 1: #0 0x03ffc06f348a __kernel_sigreturn (linux-vdso64.so.1 + 0x48a) #1 0x03ffc06f3488 __kernel_sigreturn (linux-vdso64.so.1 + 0x488) #2 0x03ff9af1be96 __GI___futex_abstimed_wait_cancelable64 (libc.so.6 + 0x9be96) #3 0x03ff9af211b4 __pthread_clockjoin_ex (libc.so.6 + 0xa11b4) #4 0x03ff9af2106e pthread_join@GLIBC_2.2 (libc.so.6 + 0xa106e) #5 0x02aa35d2fe36 fv_queue_cleanup_thread (virtiofsd + 0x2fe36) #6 0x02aa35d3152c stop_all_queues (virtiofsd + 0x3152c) #7 0x02aa35d2869c main (virtiofsd + 0x2869c) #8 0x03ff9aeb4872 __libc_start_call_main (libc.so.6 + 0x34872) #9 0x03ff9aeb4950 __libc_start_main@@GLIBC_2.34 (libc.so.6 + 0x34950) #10 0x02aa35d290a0 .annobin_libvhost_user.c_end.startup (virtiofsd + 0x290a0)
Re: [Virtio-fs] [PATCH] virtiofsd: Add `sigreturn` to the seccomp whitelist
Am 25.11.22 um 17:32 schrieb German Maglione: On Fri, Nov 25, 2022 at 3:40 PM Marc Hartmayer wrote: The virtiofsd currently crashes on s390x. This is because of a `sigreturn` system call. See audit log below: type=SECCOMP msg=audit(1669382477.611:459): auid=4294967295 uid=0 gid=0 ses=4294967295 subj=system_u:system_r:virtd_t:s0-s0:c0.c1023 pid=6649 comm="virtiofsd" exe="/usr/libexec/virtiofsd" sig=31 arch=8016 syscall=119 compat=0 ip=0x3fff15f748a code=0x8000AUID="unset" UID="root" GID="root" ARCH=s390x SYSCALL=sigreturn Signed-off-by: Marc Hartmayer --- tools/virtiofsd/passthrough_seccomp.c | 1 + 1 file changed, 1 insertion(+) diff --git a/tools/virtiofsd/passthrough_seccomp.c b/tools/virtiofsd/passthrough_seccomp.c index 888295c073de..0033dab4939e 100644 --- a/tools/virtiofsd/passthrough_seccomp.c +++ b/tools/virtiofsd/passthrough_seccomp.c @@ -110,6 +110,7 @@ static const int syscall_allowlist[] = { #endif SCMP_SYS(set_robust_list), SCMP_SYS(setxattr), +SCMP_SYS(sigreturn), SCMP_SYS(symlinkat), SCMP_SYS(syncfs), SCMP_SYS(time), /* Rarely needed, except on static builds */ -- 2.34.1 ___ Virtio-fs mailing list virtio...@redhat.com https://listman.redhat.com/mailman/listinfo/virtio-fs Reviewed-by: German Maglione Should we add this also in the rust version?, I see we don't have it enabled either. this is probably a good idea.
Re: [PATCH v1 5/9] hw/virtio: introduce virtio_device_should_start
Am 21.11.22 um 23:37 schrieb Michael S. Tsirkin: [...] qemu-system-x86_64: ../hw/virtio/vhost-vsock-common.c:203: vhost_vsock_common_pre_save: Assertion `!vhost_dev_is_started(>vhost_dev)' failed. 2022-11-15 16:38:46.096+: shutting down, reason=crashed Alex were you able to replicate? Just curious. Ping?
Re: [PATCH v9 00/10] s390x: CPU Topology
Am 02.09.22 um 09:55 schrieb Pierre Morel: Hi, The implementation of the CPU Topology in QEMU has been drastically modified since the last patch series and the number of LOCs has been greatly reduced. Unnecessary objects have been removed, only a single S390Topology object is created to support migration and reset. Also a documentation has been added to the series. To use these patches, you will need Linux V6-rc1 or newer. Mainline patches needed are: f5ecfee94493 2022-07-20 KVM: s390: resetting the Topology-Change-Report 24fe0195bc19 2022-07-20 KVM: s390: guest support for topology function 0130337ec45b 2022-07-20 KVM: s390: Cleanup ipte lock access and SIIF fac.. Currently this code is for KVM only, I have no idea if it is interesting to provide a TCG patch. If ever it will be done in another series. To have a better understanding of the S390x CPU Topology and its implementation in QEMU you can have a look at the documentation in the last patch. New in this series == s390x/cpus: Make absence of multithreading clear This patch makes clear that CPU-multithreading is not supported in the guest. s390x/cpu topology: core_id sets s390x CPU topology This patch uses the core_id to build the container topology and the placement of the CPU inside the container. s390x/cpu topology: reporting the CPU topology to the guest This patch is based on the fact that the CPU type for guests is always IFL, CPUs are always dedicated and the polarity is always horizontal. This may change in the future. hw/core: introducing drawer and books for s390x s390x/cpu: reporting drawers and books topology to the guest These two patches extend the topology handling to add two new containers levels above sockets: books and drawers. The subject of the last patches is clear enough (I hope). 
Regards, Pierre Pierre Morel (10): s390x/cpus: Make absence of multithreading clear s390x/cpu topology: core_id sets s390x CPU topology s390x/cpu topology: reporting the CPU topology to the guest hw/core: introducing drawer and books for s390x s390x/cpu: reporting drawers and books topology to the guest s390x/cpu_topology: resetting the Topology-Change-Report s390x/cpu_topology: CPU topology migration target/s390x: interception of PTF instruction s390x/cpu_topology: activating CPU topology Do we really need a machine property? As far as I can see, old QEMU cannot activate the ctop facility with old and new kernel unless it enables CAP_S390_CPU_TOPOLOGY. I do get oldqemu -cpu z14,ctop=on qemu-system-s390x: Some features requested in the CPU model are not available in the configuration: ctop With the newer QEMU we can. So maybe we can simply have a topology (and then a cpu model feature) in new QEMUs and not in old ones. The cpu model would then also fence migration from enabled to disabled.
Re: [PATCH v1 5/9] hw/virtio: introduce virtio_device_should_start
Am 15.11.22 um 17:40 schrieb Christian Borntraeger: Am 15.11.22 um 17:05 schrieb Alex Bennée: Christian Borntraeger writes: Am 15.11.22 um 15:31 schrieb Alex Bennée: "Michael S. Tsirkin" writes: On Mon, Nov 14, 2022 at 06:15:30PM +0100, Christian Borntraeger wrote: Am 14.11.22 um 18:10 schrieb Michael S. Tsirkin: On Mon, Nov 14, 2022 at 05:55:09PM +0100, Christian Borntraeger wrote: Am 14.11.22 um 17:37 schrieb Michael S. Tsirkin: On Mon, Nov 14, 2022 at 05:18:53PM +0100, Christian Borntraeger wrote: Am 08.11.22 um 10:23 schrieb Alex Bennée: The previous fix to virtio_device_started revealed a problem in its use by both the core and the device code. The core code should be able to handle the device "starting" while the VM isn't running to handle the restoration of migration state. To solve this dual use introduce a new helper for use by the vhost-user backends who all use it to feed a should_start variable. We can also pick up a change vhost_user_blk_set_status while we are at it which follows the same pattern. Fixes: 9f6bcfd99f (hw/virtio: move vm_running check to virtio_device_started) Fixes: 27ba7b027f (hw/virtio: add boilerplate for vhost-user-gpio device) Signed-off-by: Alex Bennée Cc: "Michael S. Tsirkin" Hmmm, is this commit 259d69c00b67c02a67f3bdbeeea71c2c0af76c35 Author: Alex Bennée AuthorDate: Mon Nov 7 12:14:07 2022 + Commit: Michael S. Tsirkin CommitDate: Mon Nov 7 14:08:18 2022 -0500 hw/virtio: introduce virtio_device_should_start and older version? This is what got merged: https://lore.kernel.org/r/20221107121407.1010913-1-alex.bennee%40linaro.org This patch was sent after I merged the RFC. I think the only difference is the commit log but I might be missing something. This does not seem to fix the regression that I have reported. This was applied on top of 9f6bcfd99f which IIUC does, right? 
QEMU master still fails for me for suspend/resume to disk: #0 0x03ff8e3980a6 in __pthread_kill_implementation () at /lib64/libc.so.6 #1 0x03ff8e348580 in raise () at /lib64/libc.so.6 #2 0x03ff8e32b5c0 in abort () at /lib64/libc.so.6 #3 0x03ff8e3409da in __assert_fail_base () at /lib64/libc.so.6 #4 0x03ff8e340a4e in () at /lib64/libc.so.6 #5 0x02aa1ffa8966 in vhost_vsock_common_pre_save (opaque=) at ../hw/virtio/vhost-vsock-common.c:203 #6 0x02aa1fe5e0ee in vmstate_save_state_v (f=f@entry=0x2aa21bdc170, vmsd=0x2aa204ac5f0 , opaque=0x2aa21bac9f8, vmdesc=vmdesc@entry=0x3fddc08eb30, version_id=version_id@entry=0) at ../migration/vmstate.c:329 #7 0x02aa1fe5ebf8 in vmstate_save_state (f=f@entry=0x2aa21bdc170, vmsd=, opaque=, vmdesc_id=vmdesc_id@entry=0x3fddc08eb30) at ../migration/vmstate.c:317 #8 0x02aa1fe75bd0 in vmstate_save (f=f@entry=0x2aa21bdc170, se=se@entry=0x2aa21bdbe90, vmdesc=vmdesc@entry=0x3fddc08eb30) at ../migration/savevm.c:908 #9 0x02aa1fe79584 in qemu_savevm_state_complete_precopy_non_iterable (f=f@entry=0x2aa21bdc170, in_postcopy=in_postcopy@entry=false, inactivate_disks=inactivate_disks@entry=true) at ../migration/savevm.c:1393 #10 0x02aa1fe79a96 in qemu_savevm_state_complete_precopy (f=0x2aa21bdc170, iterable_only=iterable_only@entry=false, inactivate_disks=inactivate_disks@entry=true) at ../migration/savevm.c:1459 #11 0x02aa1fe6d6ee in migration_completion (s=0x2aa218ef600) at ../migration/migration.c:3314 #12 migration_iteration_run (s=0x2aa218ef600) at ../migration/migration.c:3761 #13 migration_thread (opaque=opaque@entry=0x2aa218ef600) at ../migration/migration.c:3989 #14 0x02aa201f0b8c in qemu_thread_start (args=) at ../util/qemu-thread-posix.c:505 #15 0x03ff8e396248 in start_thread () at /lib64/libc.so.6 #16 0x03ff8e41183e in thread_start () at /lib64/libc.so.6 Michael, your previous branch did work if I recall correctly. 
That one was failing under github CI though (for reasons we didn't really address, such as disconnect during stop causing a recursive call to stop, but there you are). Even the double revert of everything? I don't remember at this point. So how do we proceed now? I'm hopeful Alex will come up with a fix. I need to replicate the failing test for that. Which test is failing? Pretty much the same as before. guest with vsock, managedsave and restore. If this isn't in our test suite I'm going to need exact steps. Just get any libvirt guest, add to your libvirt xml. Start the guest (with the new xml). Run virsh managedsave - qemu crashes. On x86 and s390. the libvirt log: /home/cborntra/REPOS/qemu/build/x86_64-softmmu/qemu-system-x86_64 \ -name guest=f36,debug-threads=on \ -S \ -object '{"qom-type":"secret","id":"masterKey0","format":"raw","file":"/var/lib/libvirt/qemu/domain-1-f36/master-key.aes"}' \ -machine pc-i440fx-7.2,usb=off,dump-guest-core=off,memory-backend=pc.ra
Re: [PATCH v1 5/9] hw/virtio: introduce virtio_device_should_start
On 15.11.22 at 17:40, Christian Borntraeger wrote:
On 15.11.22 at 17:05, Alex Bennée wrote:
Christian Borntraeger writes:
On 15.11.22 at 15:31, Alex Bennée wrote:
"Michael S. Tsirkin" writes:
On Mon, Nov 14, 2022 at 06:15:30PM +0100, Christian Borntraeger wrote:
On 14.11.22 at 18:10, Michael S. Tsirkin wrote:
On Mon, Nov 14, 2022 at 05:55:09PM +0100, Christian Borntraeger wrote:
On 14.11.22 at 17:37, Michael S. Tsirkin wrote:
On Mon, Nov 14, 2022 at 05:18:53PM +0100, Christian Borntraeger wrote:
On 08.11.22 at 10:23, Alex Bennée wrote:

The previous fix to virtio_device_started revealed a problem in its use by both the core and the device code. The core code should be able to handle the device "starting" while the VM isn't running, to handle the restoration of migration state. To solve this dual use, introduce a new helper for use by the vhost-user backends, who all use it to feed a should_start variable. We can also pick up a change to vhost_user_blk_set_status while we are at it, which follows the same pattern.

Fixes: 9f6bcfd99f (hw/virtio: move vm_running check to virtio_device_started)
Fixes: 27ba7b027f (hw/virtio: add boilerplate for vhost-user-gpio device)
Signed-off-by: Alex Bennée
Cc: "Michael S. Tsirkin"

Hmmm, is this commit
259d69c00b67c02a67f3bdbeeea71c2c0af76c35
Author: Alex Bennée
AuthorDate: Mon Nov 7 12:14:07 2022 +
Commit: Michael S. Tsirkin
CommitDate: Mon Nov 7 14:08:18 2022 -0500
hw/virtio: introduce virtio_device_should_start
an older version?

This is what got merged: https://lore.kernel.org/r/20221107121407.1010913-1-alex.bennee%40linaro.org

This patch was sent after I merged the RFC. I think the only difference is the commit log, but I might be missing something.

This does not seem to fix the regression that I have reported.

This was applied on top of 9f6bcfd99f, which IIUC does, right?
Re: [PATCH v1 5/9] hw/virtio: introduce virtio_device_should_start
Am 15.11.22 um 12:25 schrieb Michael S. Tsirkin: On Tue, Nov 15, 2022 at 09:18:27AM +0100, Christian Borntraeger wrote: Am 14.11.22 um 18:20 schrieb Michael S. Tsirkin: On Mon, Nov 14, 2022 at 06:15:30PM +0100, Christian Borntraeger wrote: Am 14.11.22 um 18:10 schrieb Michael S. Tsirkin: On Mon, Nov 14, 2022 at 05:55:09PM +0100, Christian Borntraeger wrote: Am 14.11.22 um 17:37 schrieb Michael S. Tsirkin: On Mon, Nov 14, 2022 at 05:18:53PM +0100, Christian Borntraeger wrote: Am 08.11.22 um 10:23 schrieb Alex Bennée: The previous fix to virtio_device_started revealed a problem in its use by both the core and the device code. The core code should be able to handle the device "starting" while the VM isn't running to handle the restoration of migration state. To solve this dual use introduce a new helper for use by the vhost-user backends who all use it to feed a should_start variable. We can also pick up a change vhost_user_blk_set_status while we are at it which follows the same pattern. Fixes: 9f6bcfd99f (hw/virtio: move vm_running check to virtio_device_started) Fixes: 27ba7b027f (hw/virtio: add boilerplate for vhost-user-gpio device) Signed-off-by: Alex Bennée Cc: "Michael S. Tsirkin" Hmmm, is this commit 259d69c00b67c02a67f3bdbeeea71c2c0af76c35 Author: Alex Bennée AuthorDate: Mon Nov 7 12:14:07 2022 + Commit: Michael S. Tsirkin CommitDate: Mon Nov 7 14:08:18 2022 -0500 hw/virtio: introduce virtio_device_should_start and older version? This is what got merged: https://lore.kernel.org/r/20221107121407.1010913-1-alex.bennee%40linaro.org This patch was sent after I merged the RFC. I think the only difference is the commit log but I might be missing something. This does not seem to fix the regression that I have reported. This was applied on top of 9f6bcfd99f which IIUC does, right? 
QEMU master still fails for me for suspend/resume to disk: #0 0x03ff8e3980a6 in __pthread_kill_implementation () at /lib64/libc.so.6 #1 0x03ff8e348580 in raise () at /lib64/libc.so.6 #2 0x03ff8e32b5c0 in abort () at /lib64/libc.so.6 #3 0x03ff8e3409da in __assert_fail_base () at /lib64/libc.so.6 #4 0x03ff8e340a4e in () at /lib64/libc.so.6 #5 0x02aa1ffa8966 in vhost_vsock_common_pre_save (opaque=) at ../hw/virtio/vhost-vsock-common.c:203 #6 0x02aa1fe5e0ee in vmstate_save_state_v (f=f@entry=0x2aa21bdc170, vmsd=0x2aa204ac5f0 , opaque=0x2aa21bac9f8, vmdesc=vmdesc@entry=0x3fddc08eb30, version_id=version_id@entry=0) at ../migration/vmstate.c:329 #7 0x02aa1fe5ebf8 in vmstate_save_state (f=f@entry=0x2aa21bdc170, vmsd=, opaque=, vmdesc_id=vmdesc_id@entry=0x3fddc08eb30) at ../migration/vmstate.c:317 #8 0x02aa1fe75bd0 in vmstate_save (f=f@entry=0x2aa21bdc170, se=se@entry=0x2aa21bdbe90, vmdesc=vmdesc@entry=0x3fddc08eb30) at ../migration/savevm.c:908 #9 0x02aa1fe79584 in qemu_savevm_state_complete_precopy_non_iterable (f=f@entry=0x2aa21bdc170, in_postcopy=in_postcopy@entry=false, inactivate_disks=inactivate_disks@entry=true) at ../migration/savevm.c:1393 #10 0x02aa1fe79a96 in qemu_savevm_state_complete_precopy (f=0x2aa21bdc170, iterable_only=iterable_only@entry=false, inactivate_disks=inactivate_disks@entry=true) at ../migration/savevm.c:1459 #11 0x02aa1fe6d6ee in migration_completion (s=0x2aa218ef600) at ../migration/migration.c:3314 #12 migration_iteration_run (s=0x2aa218ef600) at ../migration/migration.c:3761 #13 migration_thread (opaque=opaque@entry=0x2aa218ef600) at ../migration/migration.c:3989 #14 0x02aa201f0b8c in qemu_thread_start (args=) at ../util/qemu-thread-posix.c:505 #15 0x03ff8e396248 in start_thread () at /lib64/libc.so.6 #16 0x03ff8e41183e in thread_start () at /lib64/libc.so.6 Michael, your previous branch did work if I recall correctly. 
That one was failing under github CI though (for reasons we didn't really address, such as disconnect during stop causing a recursive call to stop, but there you are). Even the double revert of everything? I don't remember at this point. So how do we proceed now? I'm hopeful Alex will come up with a fix. The initial fix changed to qemu/master does still work for me diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h index a973811cbfc6..fb3072838119 100644 --- a/include/hw/virtio/virtio.h +++ b/include/hw/virtio/virtio.h @@ -411,14 +411,14 @@ static inline bool virtio_device_started(VirtIODevice *vdev, uint8_t status) */ static inline bool virtio_device_should_start(VirtIODevice *vdev, uint8_t status) { -if (vdev->use_started) { -return vdev->started; -} - if (!vdev->vm_running) { return false; } +if (vdev->use_started) { +return vdev->started; +} + return status & VIRTIO_CONFIG_S_DRIVER_OK; } Triggers failure on gitlab unfortunately: https://gitlab.com/mstredhat/qemu/-/jobs/3323768122 S
Re: zLinux 3270 console support
Am 14.11.22 um 22:11 schrieb Alan Altmark: Sven asked about this on an internal IBM discussion channel and I mentioned that there are two issues that I can see in the Linux 3270 support: 1. Linux is definitely doing it wrong on z/VM. Linux infers the dimensions of the screen from a model number. (Bzzt!) If you have a standard-size screen for model 2, 3, 4, or 5, then the model number from DIAG 0x210 will be correct. Otherwise it will reflect model 2. I'm not sure why CP does that or why Linux is using the model number at all. Even at the turn of the century it was irrelevant. 2. Linux doesn't detect a reconnect event, so it doesn't know to re-query the device and adjust any 'views' that have been created. To be honest, while someone could invest time here, I think the proper solution for Linux would be to improve z/VM's ASCII console support (e.g. make it virtual instead of attaching the LPAR one, and provide a console server). -- For LINUX-390 subscribe / signoff / archive access instructions, send email to lists...@vm.marist.edu with the message: INFO LINUX-390 or visit http://www2.marist.edu/htbin/wlvindex?LINUX-390
Re: [PATCH v1 5/9] hw/virtio: introduce virtio_device_should_start
Am 14.11.22 um 18:20 schrieb Michael S. Tsirkin: On Mon, Nov 14, 2022 at 06:15:30PM +0100, Christian Borntraeger wrote: Am 14.11.22 um 18:10 schrieb Michael S. Tsirkin: On Mon, Nov 14, 2022 at 05:55:09PM +0100, Christian Borntraeger wrote: Am 14.11.22 um 17:37 schrieb Michael S. Tsirkin: On Mon, Nov 14, 2022 at 05:18:53PM +0100, Christian Borntraeger wrote: Am 08.11.22 um 10:23 schrieb Alex Bennée: The previous fix to virtio_device_started revealed a problem in its use by both the core and the device code. The core code should be able to handle the device "starting" while the VM isn't running to handle the restoration of migration state. To solve this dual use introduce a new helper for use by the vhost-user backends who all use it to feed a should_start variable. We can also pick up a change vhost_user_blk_set_status while we are at it which follows the same pattern. Fixes: 9f6bcfd99f (hw/virtio: move vm_running check to virtio_device_started) Fixes: 27ba7b027f (hw/virtio: add boilerplate for vhost-user-gpio device) Signed-off-by: Alex Bennée Cc: "Michael S. Tsirkin" Hmmm, is this commit 259d69c00b67c02a67f3bdbeeea71c2c0af76c35 Author: Alex Bennée AuthorDate: Mon Nov 7 12:14:07 2022 + Commit: Michael S. Tsirkin CommitDate: Mon Nov 7 14:08:18 2022 -0500 hw/virtio: introduce virtio_device_should_start and older version? This is what got merged: https://lore.kernel.org/r/20221107121407.1010913-1-alex.bennee%40linaro.org This patch was sent after I merged the RFC. I think the only difference is the commit log but I might be missing something. This does not seem to fix the regression that I have reported. This was applied on top of 9f6bcfd99f which IIUC does, right? 
QEMU master still fails for me for suspend/resume to disk: #0 0x03ff8e3980a6 in __pthread_kill_implementation () at /lib64/libc.so.6 #1 0x03ff8e348580 in raise () at /lib64/libc.so.6 #2 0x03ff8e32b5c0 in abort () at /lib64/libc.so.6 #3 0x03ff8e3409da in __assert_fail_base () at /lib64/libc.so.6 #4 0x03ff8e340a4e in () at /lib64/libc.so.6 #5 0x02aa1ffa8966 in vhost_vsock_common_pre_save (opaque=) at ../hw/virtio/vhost-vsock-common.c:203 #6 0x02aa1fe5e0ee in vmstate_save_state_v (f=f@entry=0x2aa21bdc170, vmsd=0x2aa204ac5f0 , opaque=0x2aa21bac9f8, vmdesc=vmdesc@entry=0x3fddc08eb30, version_id=version_id@entry=0) at ../migration/vmstate.c:329 #7 0x02aa1fe5ebf8 in vmstate_save_state (f=f@entry=0x2aa21bdc170, vmsd=, opaque=, vmdesc_id=vmdesc_id@entry=0x3fddc08eb30) at ../migration/vmstate.c:317 #8 0x02aa1fe75bd0 in vmstate_save (f=f@entry=0x2aa21bdc170, se=se@entry=0x2aa21bdbe90, vmdesc=vmdesc@entry=0x3fddc08eb30) at ../migration/savevm.c:908 #9 0x02aa1fe79584 in qemu_savevm_state_complete_precopy_non_iterable (f=f@entry=0x2aa21bdc170, in_postcopy=in_postcopy@entry=false, inactivate_disks=inactivate_disks@entry=true) at ../migration/savevm.c:1393 #10 0x02aa1fe79a96 in qemu_savevm_state_complete_precopy (f=0x2aa21bdc170, iterable_only=iterable_only@entry=false, inactivate_disks=inactivate_disks@entry=true) at ../migration/savevm.c:1459 #11 0x02aa1fe6d6ee in migration_completion (s=0x2aa218ef600) at ../migration/migration.c:3314 #12 migration_iteration_run (s=0x2aa218ef600) at ../migration/migration.c:3761 #13 migration_thread (opaque=opaque@entry=0x2aa218ef600) at ../migration/migration.c:3989 #14 0x02aa201f0b8c in qemu_thread_start (args=) at ../util/qemu-thread-posix.c:505 #15 0x03ff8e396248 in start_thread () at /lib64/libc.so.6 #16 0x03ff8e41183e in thread_start () at /lib64/libc.so.6 Michael, your previous branch did work if I recall correctly. 
That one was failing under github CI though (for reasons we didn't really address, such as disconnect during stop causing a recursive call to stop, but there you are). Even the double revert of everything? I don't remember at this point. So how do we proceed now? I'm hopeful Alex will come up with a fix. The initial fix changed to qemu/master does still work for me diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h index a973811cbfc6..fb3072838119 100644 --- a/include/hw/virtio/virtio.h +++ b/include/hw/virtio/virtio.h @@ -411,14 +411,14 @@ static inline bool virtio_device_started(VirtIODevice *vdev, uint8_t status) */ static inline bool virtio_device_should_start(VirtIODevice *vdev, uint8_t status) { -if (vdev->use_started) { -return vdev->started; -} - if (!vdev->vm_running) { return false; } +if (vdev->use_started) { +return vdev->started; +} + return status & VIRTIO_CONFIG_S_DRIVER_OK; }
Re: [PATCH v1 5/9] hw/virtio: introduce virtio_device_should_start
Am 14.11.22 um 18:20 schrieb Michael S. Tsirkin: [...] This does not seem to fix the regression that I have reported. This was applied on top of 9f6bcfd99f which IIUC does, right? Just double-checked: 9f6bcfd99f was the patch that created the original problem, no?
Re: [PATCH v1 5/9] hw/virtio: introduce virtio_device_should_start
Am 08.11.22 um 10:23 schrieb Alex Bennée: The previous fix to virtio_device_started revealed a problem in its use by both the core and the device code. The core code should be able to handle the device "starting" while the VM isn't running to handle the restoration of migration state. To solve this dual use introduce a new helper for use by the vhost-user backends who all use it to feed a should_start variable. We can also pick up a change vhost_user_blk_set_status while we are at it which follows the same pattern. Fixes: 9f6bcfd99f (hw/virtio: move vm_running check to virtio_device_started) Fixes: 27ba7b027f (hw/virtio: add boilerplate for vhost-user-gpio device) Signed-off-by: Alex Bennée Cc: "Michael S. Tsirkin"

Hmmm, is this

commit 259d69c00b67c02a67f3bdbeeea71c2c0af76c35
Author: Alex Bennée
AuthorDate: Mon Nov 7 12:14:07 2022 +
Commit: Michael S. Tsirkin
CommitDate: Mon Nov 7 14:08:18 2022 -0500

    hw/virtio: introduce virtio_device_should_start

an older version? This does not seem to fix the regression that I have reported.
Re: [PATCH v1 5/9] hw/virtio: introduce virtio_device_should_start
Am 14.11.22 um 18:10 schrieb Michael S. Tsirkin: On Mon, Nov 14, 2022 at 05:55:09PM +0100, Christian Borntraeger wrote: Am 14.11.22 um 17:37 schrieb Michael S. Tsirkin: On Mon, Nov 14, 2022 at 05:18:53PM +0100, Christian Borntraeger wrote: Am 08.11.22 um 10:23 schrieb Alex Bennée: The previous fix to virtio_device_started revealed a problem in its use by both the core and the device code. The core code should be able to handle the device "starting" while the VM isn't running to handle the restoration of migration state. To solve this dual use introduce a new helper for use by the vhost-user backends who all use it to feed a should_start variable. We can also pick up a change vhost_user_blk_set_status while we are at it which follows the same pattern. Fixes: 9f6bcfd99f (hw/virtio: move vm_running check to virtio_device_started) Fixes: 27ba7b027f (hw/virtio: add boilerplate for vhost-user-gpio device) Signed-off-by: Alex Bennée Cc: "Michael S. Tsirkin" Hmmm, is this commit 259d69c00b67c02a67f3bdbeeea71c2c0af76c35 Author: Alex Bennée AuthorDate: Mon Nov 7 12:14:07 2022 + Commit: Michael S. Tsirkin CommitDate: Mon Nov 7 14:08:18 2022 -0500 hw/virtio: introduce virtio_device_should_start an older version? This is what got merged: https://lore.kernel.org/r/20221107121407.1010913-1-alex.bennee%40linaro.org This patch was sent after I merged the RFC. I think the only difference is the commit log but I might be missing something. This does not seem to fix the regression that I have reported. This was applied on top of 9f6bcfd99f which IIUC does, right? 
Re: [PATCH v1 5/9] hw/virtio: introduce virtio_device_should_start
Am 14.11.22 um 17:37 schrieb Michael S. Tsirkin: On Mon, Nov 14, 2022 at 05:18:53PM +0100, Christian Borntraeger wrote: Am 08.11.22 um 10:23 schrieb Alex Bennée: The previous fix to virtio_device_started revealed a problem in its use by both the core and the device code. The core code should be able to handle the device "starting" while the VM isn't running to handle the restoration of migration state. To solve this dual use introduce a new helper for use by the vhost-user backends who all use it to feed a should_start variable. We can also pick up a change vhost_user_blk_set_status while we are at it which follows the same pattern. Fixes: 9f6bcfd99f (hw/virtio: move vm_running check to virtio_device_started) Fixes: 27ba7b027f (hw/virtio: add boilerplate for vhost-user-gpio device) Signed-off-by: Alex Bennée Cc: "Michael S. Tsirkin" Hmmm, is this commit 259d69c00b67c02a67f3bdbeeea71c2c0af76c35 Author: Alex Bennée AuthorDate: Mon Nov 7 12:14:07 2022 + Commit: Michael S. Tsirkin CommitDate: Mon Nov 7 14:08:18 2022 -0500 hw/virtio: introduce virtio_device_should_start an older version? This is what got merged: https://lore.kernel.org/r/20221107121407.1010913-1-alex.bennee%40linaro.org This patch was sent after I merged the RFC. I think the only difference is the commit log but I might be missing something. This does not seem to fix the regression that I have reported. This was applied on top of 9f6bcfd99f which IIUC does, right? 
Re: [PATCH/RFC] virtio_test: A module for testing virtio via userspace
Am 12.11.22 um 17:19 schrieb Dmitry Vyukov: Hi, The original email is from 2009, so I assume you don't have it in your inboxes already. Here is the original email: https://lore.kernel.org/all/200906190927.34831.borntrae...@de.ibm.com/ This patch introduces a prototype for a virtio_test module. This module can be bound to any virtio device via the sysfs bind/unbind feature, e.g.:

$ echo virtio1 > /sys/bus/virtio/drivers/virtio_rng/unbind
$ modprobe virtio_test

On probe this module registers to all virtqueues and creates a character device for every virtio device (/dev/viotest). The character device offers ioctls to allow a userspace application to submit virtio operations like addbuf, kick and getbuf. It also offers ioctls to get information about the device and to query the amount of occurred callbacks (or wait synchronously on callbacks). As far as I understand the test driver was never merged and I can't find any similar testing drivers. I am looking for a test module that allows to create a transient virtio device that can be used to activate a virtio driver and communicate with it as if from the host. Does such a thing exist already? Or how are virtio transports/drivers tested/fuzzed nowadays? Right, the driver was never merged. Adding Michael as today's virtio maintainer for ideas how to proceed.
Re: [RFC PATCH] virtio: re-order vm_running and use_started checks
Am 04.11.22 um 17:51 schrieb Christian Borntraeger: Am 04.11.22 um 17:14 schrieb Michael S. Tsirkin: On Fri, Nov 04, 2022 at 04:59:35PM +0100, Christian Borntraeger wrote: Am 04.11.22 um 16:56 schrieb Michael S. Tsirkin: On Fri, Oct 14, 2022 at 02:21:08PM +0100, Alex Bennée wrote: During migration the virtio device state can be restored before we restart the VM. As no devices can be running while the VM is paused it makes sense to bail out early in that case. This returns the order introduced in: 9f6bcfd99f (hw/virtio: move vm_running check to virtio_device_started) to what virtio-sock was doing longhand. Signed-off-by: Alex Bennée Cc: Christian Borntraeger What happens now: with this applied I get: https://gitlab.com/mitsirkin/qemu/-/pipelines/685829158/failures ― ✀ ― stderr: qemu-system-arm: Failed to write msg. Wrote -1 instead of 20. qemu-system-arm: vhost VQ 0 ring restore failed: -22: Invalid argument (22) qemu-system-arm: Failed to set msg fds. qemu-system-arm: vhost VQ 1 ring restore failed: -22: Invalid argument (22) qemu-system-arm: -chardev socket,id=chr-reconnect,path=/tmp/vhost-test-QLKXU1/reconnect.sock,server=on: info: QEMU waiting for connection on: disconnected:unix:/tmp/vhost-test-QLKXU1/reconnect.sock,server=on qemu-system-arm: Failed to write msg. Wrote -1 instead of 20. qemu-system-arm: vhost VQ 0 ring restore failed: -22: Invalid argument (22) qemu-system-arm: Failed to set msg fds. qemu-system-arm: vhost VQ 1 ring restore failed: -22: Invalid argument (22) qemu-system-arm: -chardev socket,id=chr-connect-fail,path=/tmp/vhost-test-L9Q6U1/connect-fail.sock,server=on: info: QEMU waiting for connection on: disconnected:unix:/tmp/vhost-test-L9Q6U1/connect-fail.sock,server=on qemu-system-arm: -netdev vhost-user,id=hs0,chardev=chr-connect-fail,vhostforce=on: Failed to read msg header. Read 0 instead of 12. Original request 1. 
qemu-system-arm: -netdev vhost-user,id=hs0,chardev=chr-connect-fail,vhostforce=on: vhost_backend_init failed: Protocol error qemu-system-arm: -netdev vhost-user,id=hs0,chardev=chr-connect-fail,vhostforce=on: failed to init vhost_net for queue 0 qemu-system-arm: -netdev vhost-user,id=hs0,chardev=chr-connect-fail,vhostforce=on: info: QEMU waiting for connection on: disconnected:unix:/tmp/vhost-test-L9Q6U1/connect-fail.sock,server=on qemu-system-arm: Failed to write msg. Wrote -1 instead of 20. qemu-system-arm: vhost VQ 0 ring restore failed: -22: Invalid argument (22) qemu-system-arm: Failed to set msg fds. qemu-system-arm: vhost VQ 1 ring restore failed: -22: Invalid argument (22) qemu-system-arm: -chardev socket,id=chr-flags-mismatch,path=/tmp/vhost-test-3MO5U1/flags-mismatch.sock,server=on: info: QEMU waiting for connection on: disconnected:unix:/tmp/vhost-test-3MO5U1/flags-mismatch.sock,server=on qemu-system-arm: Failed to write msg. Wrote -1 instead of 52. qemu-system-arm: vhost_set_mem_table failed: Invalid argument (22) qemu-system-arm: unable to start vhost net: 22: falling back on userspace virtio vhost lacks feature mask 0x4000 for backend qemu-system-arm: failed to init vhost_net for queue 0 qemu-system-arm: Failed to set msg fds. qemu-system-arm: vhost VQ 0 ring restore failed: -22: Invalid argument (22) qemu-system-arm: Failed to set msg fds. qemu-system-arm: vhost VQ 1 ring restore failed: -22: Invalid argument (22) qemu-system-arm: Failed to write msg. Wrote -1 instead of 20. qemu-system-arm: vhost VQ 0 ring restore failed: -22: Invalid argument (22) qemu-system-arm: Failed to set msg fds. qemu-system-arm: vhost VQ 1 ring restore failed: -22: Invalid argument (22) qemu-system-arm: Failed to set msg fds. qemu-system-arm: vhost VQ 2 ring restore failed: -22: Invalid argument (22) qemu-system-arm: Failed to set msg fds. qemu-system-arm: vhost VQ 3 ring restore failed: -22: Invalid argument (22) qemu-system-arm: Failed to write msg. Wrote -1 instead of 20. 
qemu-system-arm: vhost VQ 0 ring restore failed: -22: Invalid argument (22) qemu-system-arm: Failed to set msg fds. qemu-system-arm: vhost VQ 1 ring restore failed: -22: Invalid argument (22) qemu-system-arm: Failed to write msg. Wrote -1 instead of 20. qemu-system-arm: vhost VQ 0 ring restore failed: -22: Invalid argument (22) qemu-system-arm: Failed to set msg fds. qemu-system-arm: vhost VQ 1 ring restore failed: -22: Invalid argument (22) qemu-system-arm: Failed to set msg fds. qemu-system-arm: vhost VQ 0 ring restore failed: -22: Invalid argument (22) qemu-system-arm: Failed to set msg fds. qemu-system-arm: vhost VQ 1 ring restore failed: -22: Invalid argument (22) qemu-system-arm: Failed to set msg fds. qemu-system-arm: vhost_set_vring_call failed: Invalid argument (22) qemu-system-arm: Failed to set msg fds. qemu-system-arm: vhost_set_vring_call failed: Invalid argument (22) qemu-system-arm: Failed to write msg. Wrote -1 instead of 20. qemu-system-arm: vhost VQ 0 ring restore failed
Re: [RFC PATCH] virtio: re-order vm_running and use_started checks
Am 04.11.22 um 17:14 schrieb Michael S. Tsirkin: On Fri, Nov 04, 2022 at 04:59:35PM +0100, Christian Borntraeger wrote: Am 04.11.22 um 16:56 schrieb Michael S. Tsirkin: On Fri, Oct 14, 2022 at 02:21:08PM +0100, Alex Bennée wrote: During migration the virtio device state can be restored before we restart the VM. As no devices can be running while the VM is paused it makes sense to bail out early in that case. This returns the order introduced in: 9f6bcfd99f (hw/virtio: move vm_running check to virtio_device_started) to what virtio-sock was doing longhand. Signed-off-by: Alex Bennée Cc: Christian Borntraeger What happens now: with this applied I get: https://gitlab.com/mitsirkin/qemu/-/pipelines/685829158/failures ― ✀ ― stderr: qemu-system-arm: Failed to write msg. Wrote -1 instead of 20. qemu-system-arm: vhost VQ 0 ring restore failed: -22: Invalid argument (22) qemu-system-arm: Failed to set msg fds. qemu-system-arm: vhost VQ 1 ring restore failed: -22: Invalid argument (22) qemu-system-arm: -chardev socket,id=chr-reconnect,path=/tmp/vhost-test-QLKXU1/reconnect.sock,server=on: info: QEMU waiting for connection on: disconnected:unix:/tmp/vhost-test-QLKXU1/reconnect.sock,server=on qemu-system-arm: Failed to write msg. Wrote -1 instead of 20. qemu-system-arm: vhost VQ 0 ring restore failed: -22: Invalid argument (22) qemu-system-arm: Failed to set msg fds. qemu-system-arm: vhost VQ 1 ring restore failed: -22: Invalid argument (22) qemu-system-arm: -chardev socket,id=chr-connect-fail,path=/tmp/vhost-test-L9Q6U1/connect-fail.sock,server=on: info: QEMU waiting for connection on: disconnected:unix:/tmp/vhost-test-L9Q6U1/connect-fail.sock,server=on qemu-system-arm: -netdev vhost-user,id=hs0,chardev=chr-connect-fail,vhostforce=on: Failed to read msg header. Read 0 instead of 12. Original request 1. 
qemu-system-arm: -netdev vhost-user,id=hs0,chardev=chr-connect-fail,vhostforce=on: vhost_backend_init failed: Protocol error qemu-system-arm: -netdev vhost-user,id=hs0,chardev=chr-connect-fail,vhostforce=on: failed to init vhost_net for queue 0 qemu-system-arm: -netdev vhost-user,id=hs0,chardev=chr-connect-fail,vhostforce=on: info: QEMU waiting for connection on: disconnected:unix:/tmp/vhost-test-L9Q6U1/connect-fail.sock,server=on qemu-system-arm: Failed to write msg. Wrote -1 instead of 20. qemu-system-arm: vhost VQ 0 ring restore failed: -22: Invalid argument (22) qemu-system-arm: Failed to set msg fds. qemu-system-arm: vhost VQ 1 ring restore failed: -22: Invalid argument (22) qemu-system-arm: -chardev socket,id=chr-flags-mismatch,path=/tmp/vhost-test-3MO5U1/flags-mismatch.sock,server=on: info: QEMU waiting for connection on: disconnected:unix:/tmp/vhost-test-3MO5U1/flags-mismatch.sock,server=on qemu-system-arm: Failed to write msg. Wrote -1 instead of 52. qemu-system-arm: vhost_set_mem_table failed: Invalid argument (22) qemu-system-arm: unable to start vhost net: 22: falling back on userspace virtio vhost lacks feature mask 0x4000 for backend qemu-system-arm: failed to init vhost_net for queue 0 qemu-system-arm: Failed to set msg fds. qemu-system-arm: vhost VQ 0 ring restore failed: -22: Invalid argument (22) qemu-system-arm: Failed to set msg fds. qemu-system-arm: vhost VQ 1 ring restore failed: -22: Invalid argument (22) qemu-system-arm: Failed to write msg. Wrote -1 instead of 20. qemu-system-arm: vhost VQ 0 ring restore failed: -22: Invalid argument (22) qemu-system-arm: Failed to set msg fds. qemu-system-arm: vhost VQ 1 ring restore failed: -22: Invalid argument (22) qemu-system-arm: Failed to set msg fds. qemu-system-arm: vhost VQ 2 ring restore failed: -22: Invalid argument (22) qemu-system-arm: Failed to set msg fds. qemu-system-arm: vhost VQ 3 ring restore failed: -22: Invalid argument (22) qemu-system-arm: Failed to write msg. Wrote -1 instead of 20. 
qemu-system-arm: vhost VQ 0 ring restore failed: -22: Invalid argument (22) qemu-system-arm: Failed to set msg fds. qemu-system-arm: vhost VQ 1 ring restore failed: -22: Invalid argument (22) qemu-system-arm: Failed to write msg. Wrote -1 instead of 20. qemu-system-arm: vhost VQ 0 ring restore failed: -22: Invalid argument (22) qemu-system-arm: Failed to set msg fds. qemu-system-arm: vhost VQ 1 ring restore failed: -22: Invalid argument (22) qemu-system-arm: Failed to set msg fds. qemu-system-arm: vhost VQ 0 ring restore failed: -22: Invalid argument (22) qemu-system-arm: Failed to set msg fds. qemu-system-arm: vhost VQ 1 ring restore failed: -22: Invalid argument (22) qemu-system-arm: Failed to set msg fds. qemu-system-arm: vhost_set_vring_call failed: Invalid argument (22) qemu-system-arm: Failed to set msg fds. qemu-system-arm: vhost_set_vring_call failed: Invalid argument (22) qemu-system-arm: Failed to write msg. Wrote -1 instead of 20. qemu-system-arm: vhost VQ 0 ring restore failed: -5: Input/output error (5) qemu-system-arm: ../hw
Re: [RFC PATCH] virtio: re-order vm_running and use_started checks
Am 04.11.22 um 16:56 schrieb Michael S. Tsirkin: On Fri, Oct 14, 2022 at 02:21:08PM +0100, Alex Bennée wrote: During migration the virtio device state can be restored before we restart the VM. As no devices can be running while the VM is paused it makes sense to bail out early in that case. This returns the order introduced in: 9f6bcfd99f (hw/virtio: move vm_running check to virtio_device_started) to what virtio-sock was doing longhand. Signed-off-by: Alex Bennée Cc: Christian Borntraeger What happens now: with this applied I get: https://gitlab.com/mitsirkin/qemu/-/pipelines/685829158/failures ― ✀ ― stderr: qemu-system-arm: Failed to write msg. Wrote -1 instead of 20. qemu-system-arm: vhost VQ 0 ring restore failed: -22: Invalid argument (22) qemu-system-arm: Failed to set msg fds. qemu-system-arm: vhost VQ 1 ring restore failed: -22: Invalid argument (22) qemu-system-arm: -chardev socket,id=chr-reconnect,path=/tmp/vhost-test-QLKXU1/reconnect.sock,server=on: info: QEMU waiting for connection on: disconnected:unix:/tmp/vhost-test-QLKXU1/reconnect.sock,server=on qemu-system-arm: Failed to write msg. Wrote -1 instead of 20. qemu-system-arm: vhost VQ 0 ring restore failed: -22: Invalid argument (22) qemu-system-arm: Failed to set msg fds. qemu-system-arm: vhost VQ 1 ring restore failed: -22: Invalid argument (22) qemu-system-arm: -chardev socket,id=chr-connect-fail,path=/tmp/vhost-test-L9Q6U1/connect-fail.sock,server=on: info: QEMU waiting for connection on: disconnected:unix:/tmp/vhost-test-L9Q6U1/connect-fail.sock,server=on qemu-system-arm: -netdev vhost-user,id=hs0,chardev=chr-connect-fail,vhostforce=on: Failed to read msg header. Read 0 instead of 12. Original request 1. 
qemu-system-arm: -netdev vhost-user,id=hs0,chardev=chr-connect-fail,vhostforce=on: vhost_backend_init failed: Protocol error qemu-system-arm: -netdev vhost-user,id=hs0,chardev=chr-connect-fail,vhostforce=on: failed to init vhost_net for queue 0 qemu-system-arm: -netdev vhost-user,id=hs0,chardev=chr-connect-fail,vhostforce=on: info: QEMU waiting for connection on: disconnected:unix:/tmp/vhost-test-L9Q6U1/connect-fail.sock,server=on qemu-system-arm: Failed to write msg. Wrote -1 instead of 20. qemu-system-arm: vhost VQ 0 ring restore failed: -22: Invalid argument (22) qemu-system-arm: Failed to set msg fds. qemu-system-arm: vhost VQ 1 ring restore failed: -22: Invalid argument (22) qemu-system-arm: -chardev socket,id=chr-flags-mismatch,path=/tmp/vhost-test-3MO5U1/flags-mismatch.sock,server=on: info: QEMU waiting for connection on: disconnected:unix:/tmp/vhost-test-3MO5U1/flags-mismatch.sock,server=on qemu-system-arm: Failed to write msg. Wrote -1 instead of 52. qemu-system-arm: vhost_set_mem_table failed: Invalid argument (22) qemu-system-arm: unable to start vhost net: 22: falling back on userspace virtio vhost lacks feature mask 0x4000 for backend qemu-system-arm: failed to init vhost_net for queue 0 qemu-system-arm: Failed to set msg fds. qemu-system-arm: vhost VQ 0 ring restore failed: -22: Invalid argument (22) qemu-system-arm: Failed to set msg fds. qemu-system-arm: vhost VQ 1 ring restore failed: -22: Invalid argument (22) qemu-system-arm: Failed to write msg. Wrote -1 instead of 20. qemu-system-arm: vhost VQ 0 ring restore failed: -22: Invalid argument (22) qemu-system-arm: Failed to set msg fds. qemu-system-arm: vhost VQ 1 ring restore failed: -22: Invalid argument (22) qemu-system-arm: Failed to set msg fds. qemu-system-arm: vhost VQ 2 ring restore failed: -22: Invalid argument (22) qemu-system-arm: Failed to set msg fds. qemu-system-arm: vhost VQ 3 ring restore failed: -22: Invalid argument (22) qemu-system-arm: Failed to write msg. Wrote -1 instead of 20. 
qemu-system-arm: vhost VQ 0 ring restore failed: -22: Invalid argument (22) qemu-system-arm: Failed to set msg fds. qemu-system-arm: vhost VQ 1 ring restore failed: -22: Invalid argument (22) qemu-system-arm: Failed to write msg. Wrote -1 instead of 20. qemu-system-arm: vhost VQ 0 ring restore failed: -22: Invalid argument (22) qemu-system-arm: Failed to set msg fds. qemu-system-arm: vhost VQ 1 ring restore failed: -22: Invalid argument (22) qemu-system-arm: Failed to set msg fds. qemu-system-arm: vhost VQ 0 ring restore failed: -22: Invalid argument (22) qemu-system-arm: Failed to set msg fds. qemu-system-arm: vhost VQ 1 ring restore failed: -22: Invalid argument (22) qemu-system-arm: Failed to set msg fds. qemu-system-arm: vhost_set_vring_call failed: Invalid argument (22) qemu-system-arm: Failed to set msg fds. qemu-system-arm: vhost_set_vring_call failed: Invalid argument (22) qemu-system-arm: Failed to write msg. Wrote -1 instead of 20. qemu-system-arm: vhost VQ 0 ring restore failed: -5: Input/output error (5) qemu-system-arm: ../hw/virtio/virtio-bus.c:211: void virtio_bus_release_ioeventfd(VirtioBusState *): Assertion `bus->ioeventfd_grabbed != 0' fai
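For readers following along, the check ordering being debated can be sketched roughly as follows. This is an editor's illustration, not the actual QEMU virtio_device_started() code: the struct layout, field names, and the DRIVER_OK constant are invented stand-ins for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DRIVER_OK 0x04  /* stand-in for VIRTIO_CONFIG_S_DRIVER_OK */

/* Hypothetical device state; QEMU's real VirtIODevice looks different. */
struct vdev_state {
    bool vm_running;   /* is the VM currently executing? */
    bool use_started;  /* does this device track an explicit started flag? */
    bool started;
    uint8_t status;
};

/* The patch's ordering: bail out while the VM is paused *before*
 * consulting use_started, so device state that was restored during
 * migration is not reported as started while the VM is still stopped. */
static bool device_started(const struct vdev_state *d)
{
    if (!d->vm_running) {
        return false;  /* early bail-out: nothing runs while paused */
    }
    if (d->use_started) {
        return d->started;
    }
    return (d->status & DRIVER_OK) != 0;
}
```

The test failures quoted above suggest that some vhost paths depend on the old ordering, which is presumably why the re-order is an RFC rather than a straightforward revert.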
Re: [PATCH 00/44] KVM: Rework kvm_init() and hardware enabling
Am 03.11.22 um 00:18 schrieb Sean Christopherson: Non-x86 folks, please test on hardware when possible. I made a _lot_ of mistakes when moving code around. Thankfully, x86 was the trickiest code to deal with, and I'm fairly confident that I found all the bugs I introduced via testing. But the number of mistakes I made and found on x86 makes me more than a bit worried that I screwed something up in other arch code. This is a continuation of Chao's series to do x86 CPU compatibility checks during virtualization hardware enabling[1], and of Isaku's series to try and clean up the hardware enabling paths so that x86 (Intel specifically) can temporarily enable hardware during module initialization without causing undue pain for other architectures[2]. It also includes one patch from another mini-series from Isaku that provides the less controversial patches[3]. The main theme of this series is to kill off kvm_arch_init(), kvm_arch_hardware_(un)setup(), and kvm_arch_check_processor_compat(), which all originated in x86 code from way back when, and needlessly complicate both common KVM code and architecture code. E.g. many architectures don't mark functions/data as __init/__ro_after_init purely because kvm_init() isn't marked __init to support x86's separate vendor modules. The idea/hope is that with those hooks gone (moved to arch code), it will be easier for x86 (and other architectures) to modify their module init sequences as needed without having to fight common KVM code. E.g. I'm hoping that ARM can build on this to simplify its hardware enabling logic, especially the pKVM side of things. There are bug fixes throughout this series. They are more scattered than I would usually prefer, but getting the sequencing correct was a gigantic pain for many of the x86 fixes due to needing to fix common code in order for the x86 fix to have any meaning. 
And while the bugs are often fatal, they aren't all that interesting for most users as they either require a malicious admin or broken hardware, i.e. aren't likely to be encountered by the vast majority of KVM users. So unless someone _really_ wants a particular fix isolated for backporting, I'm not planning on shuffling patches. Tested on x86. Lightly tested on arm64. Compile tested only on all other architectures. Some sniff tests seem to work ok on s390.
Re: [PATCH 1/1] tcg: add perfmap and jitdump
Am 12.10.22 um 07:18 schrieb Ilya Leoshkevich: Add ability to dump /tmp/perf-<pid>.map and jit-<pid>.dump. The first one allows the perf tool to map samples to each individual translation block. The second one adds the ability to resolve symbol names, line numbers and inspect JITed code. Example of use: perf record qemu-x86_64 -perfmap ./a.out perf report or perf record -k 1 qemu-x86_64 -jitdump ./a.out perf inject -j -i perf.data -o perf.data.jitted perf report -i perf.data.jitted Co-developed-by: Vanderson M. do Rosario Co-developed-by: Alex Bennée Signed-off-by: Ilya Leoshkevich I think this would be awesome to have. I know our performance people do use this for Java a lot.
Re: qemu iotest 161 and make check
Am 31.03.22 um 10:25 schrieb Christian Borntraeger: Am 31.03.22 um 09:44 schrieb Christian Borntraeger: Am 21.02.22 um 11:27 schrieb Christian Borntraeger: Am 10.02.22 um 18:44 schrieb Vladimir Sementsov-Ogievskiy: 10.02.2022 20:13, Thomas Huth wrote: On 10/02/2022 15.51, Christian Borntraeger wrote: Am 10.02.22 um 15:47 schrieb Vladimir Sementsov-Ogievskiy: 10.02.2022 10:57, Christian Borntraeger wrote: Hello, I do see spurious failures of 161 in our CI, but only when I use make check with parallelism (-j). I have not yet figured out which other testcase could interfere @@ -34,6 +34,8 @@ *** Commit and then change an option on the backing file Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=1048576 +qemu-img: TEST_DIR/t.IMGFMT.base: Failed to get "write" lock FWIW, qemu_lock_fd_test returns -11 (EAGAIN) and raw_check_lock_bytes spits this error. And it's coming from here (ret is 0) int qemu_lock_fd_test(int fd, int64_t start, int64_t len, bool exclusive) { int ret; struct flock fl = { .l_whence = SEEK_SET, .l_start = start, .l_len = len, .l_type = exclusive ? F_WRLCK : F_RDLCK, }; qemu_probe_lock_ops(); ret = fcntl(fd, fcntl_op_getlk, &fl); if (ret == -1) { return -errno; } else { -> return fl.l_type == F_UNLCK ? 0 : -EAGAIN; } } Is this just some overload situation that we do not recover from because we do not handle EAGAIN specially? Restarted my investigation. Looks like the file lock from qemu is not fully cleaned up when the process is gone. Something like diff --git a/tests/qemu-iotests/common.qemu b/tests/qemu-iotests/common.qemu index 0f1fecc68e..b28a6c187c 100644 --- a/tests/qemu-iotests/common.qemu +++ b/tests/qemu-iotests/common.qemu @@ -403,4 +403,5 @@ _cleanup_qemu() unset QEMU_IN[$i] unset QEMU_OUT[$i] done +sleep 0.5 } makes the problem go away. Looks like we do use the OFD variant of the file lock, so any clone, fork etc will keep the lock.
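The observation above — that OFD (open-file-description) locks survive fork()/clone() because they belong to the description rather than the process — can be demonstrated with a small standalone Linux program. This is an editor's illustration, not QEMU code; the temp-file path template is arbitrary.

```c
#define _GNU_SOURCE  /* for F_OFD_SETLK / F_OFD_GETLK (Linux-specific) */
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if an OFD lock taken by the parent is still visible through
 * a *separate* open() after fork — i.e. the lock lives on the shared
 * open file description, exactly the behaviour discussed above. */
static int ofd_lock_survives_fork(void)
{
    char path[] = "/tmp/ofd-demo-XXXXXX";  /* arbitrary template */
    int fd = mkstemp(path);
    if (fd < 0) {
        return 0;
    }

    /* Whole-file write lock (l_len = 0); l_pid must be 0 for OFD locks. */
    struct flock fl = { .l_type = F_WRLCK, .l_whence = SEEK_SET };
    if (fcntl(fd, F_OFD_SETLK, &fl) != 0) {
        unlink(path);
        return 0;
    }

    pid_t pid = fork();
    if (pid == 0) {
        /* The child inherited fd, so it shares the locked description.
         * Probe through an independent descriptor: the lock conflicts. */
        int fd2 = open(path, O_RDWR);
        struct flock probe = { .l_type = F_WRLCK, .l_whence = SEEK_SET };
        fcntl(fd2, F_OFD_GETLK, &probe);
        _exit(probe.l_type == F_UNLCK ? 1 : 0);  /* 0: lock still held */
    }

    int status;
    waitpid(pid, &status, 0);
    unlink(path);
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

This is why a forked helper that keeps the QEMU image fd open keeps the "write" lock alive after QEMU itself has exited, matching the spurious iotest 161 failures.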
So I tested the following: diff --git a/tests/qemu-iotests/common.qemu b/tests/qemu-iotests/common.qemu index 0f1fecc68e..01bdb05575 100644 --- a/tests/qemu-iotests/common.qemu +++ b/tests/qemu-iotests/common.qemu @@ -388,7 +388,7 @@ _cleanup_qemu() kill -KILL ${QEMU_PID} 2>/dev/null fi if [ -n "${QEMU_PID}" ]; then -wait ${QEMU_PID} 2>/dev/null # silent kill +wait 2>/dev/null # silent kill fi fi And this also helps. Still trying to find out what clone/fork happens here.
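The shell-level fix above (bare wait instead of wait ${QEMU_PID}) has a direct C analogue: waiting on a single PID reaps only that child, while looping on wait() reaps every outstanding child before proceeding. A minimal sketch, with the child count and sleep duration chosen arbitrarily:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a few short-lived children, then reap them all the way the
 * bare shell `wait` does, instead of waiting on a single PID. */
static int reap_all_children(void)
{
    for (int i = 0; i < 3; i++) {
        pid_t pid = fork();
        if (pid == 0) {
            usleep(50000);  /* child lingers briefly, like a stray clone */
            _exit(0);
        }
        if (pid < 0) {
            return -1;
        }
    }
    int reaped = 0;
    while (wait(NULL) > 0) {  /* loop until ECHILD: no children left */
        reaped++;
    }
    return reaped;
}
```

Only once every child is gone are all the file descriptors — and with them any OFD locks — guaranteed to be released, which is presumably why the broader wait makes the lock error disappear.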