On Tue, 11 Feb 2020 04:57:52 +0100
dftxbs3e <dftxb...@free.fr> wrote:

> Hello,
> 
> I took a snapshot of a ppc64 (big endian) VM from a ppc64 (little endian) 
> host using `virsh snapshot-create-as --domain <name> --name <name>`
> 

A big endian guest using XIVE?! I'm pretty sure we did little testing, if
any, on such a setup... Which distro is running in the VM?

> Then I restarted my system and tried restoring the snapshot:
> 
> # virsh snapshot-revert --domain <name> --snapshotname <name>
> error: internal error: process exited while connecting to monitor: 
> 2020-02-11T03:18:08.110582Z qemu-system-ppc64: KVM_SET_DEVICE_ATTR failed: 
> Group 3 attr 0x0000000000001309: Device or resource busy
> 2020-02-11T03:18:08.110605Z qemu-system-ppc64: error while loading state for 
> instance 0x0 of device 'spapr'
> 2020-02-11T03:18:08.112843Z qemu-system-ppc64: Error -1 while loading VM state
> 

This indicates that QEMU failed to configure the source targeting
for HW interrupt 0x1309, which is an MSI used by a PCI device
plugged into the default PHB. In particular, -EBUSY means:

    -EBUSY:  No CPU available to serve interrupt
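
For reference, the failing call can be decoded from the QEMU message itself. Below is a minimal Python sketch, not an official tool. It assumes the XIVE native device group numbers as recalled from arch/powerpc/include/uapi/asm/kvm.h and an sPAPR MSI base of 0x1300 as recalled from QEMU's spapr_irq.h; both should be checked against the trees actually in use. The helper name `decode_attr_error` is hypothetical.

```python
import re

# ASSUMPTION: KVM_DEV_XIVE_GRP_* values recalled from the kernel uapi header
# arch/powerpc/include/uapi/asm/kvm.h; verify against the host kernel.
XIVE_GROUPS = {
    1: "KVM_DEV_XIVE_GRP_CTRL",
    2: "KVM_DEV_XIVE_GRP_SOURCE",
    3: "KVM_DEV_XIVE_GRP_SOURCE_CONFIG",
    4: "KVM_DEV_XIVE_GRP_EQ_CONFIG",
    5: "KVM_DEV_XIVE_GRP_SOURCE_SYNC",
}

# ASSUMPTION: first MSI interrupt number in QEMU's sPAPR IRQ map (SPAPR_IRQ_MSI).
SPAPR_IRQ_MSI = 0x1300

ERR_RE = re.compile(
    r"KVM_SET_DEVICE_ATTR failed:\s*Group (\d+) attr (0x[0-9a-fA-F]+)"
)


def decode_attr_error(message):
    """Extract the device group and interrupt number from a QEMU error line.

    Returns None when the message does not match the expected pattern.
    """
    m = ERR_RE.search(message)
    if m is None:
        return None
    group = int(m.group(1))
    irq = int(m.group(2), 16)
    return {
        "group": XIVE_GROUPS.get(group, "unknown"),
        "irq": irq,
        # Offset into the MSI range, or None if the IRQ is below that range.
        "msi_index": irq - SPAPR_IRQ_MSI if irq >= SPAPR_IRQ_MSI else None,
    }


if __name__ == "__main__":
    msg = ("qemu-system-ppc64: KVM_SET_DEVICE_ATTR failed: "
           "Group 3 attr 0x0000000000001309: Device or resource busy")
    print(decode_attr_error(msg))
```

Under these assumptions, group 3 is the source configuration group and the attribute is the interrupt number, which is consistent with the reading above.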

> And dmesg shows each time the restore command is executed:
> 
> [  180.176606] WARNING: CPU: 16 PID: 5528 at 
> arch/powerpc/kvm/book3s_xive.c:345 xive_try_pick_queue+0x40/0xb8 [kvm]

This warning means that we have a vCPU without a configured event queue.

Since kvmppc_xive_select_target() is trying all vCPUs before bailing out
with -EBUSY, you might be seeing several WARNINGs (1 per vCPU) in dmesg,
correct ?
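
To illustrate why one warning per vCPU would be expected, here is a loose Python model of the selection logic. It is a sketch based on my reading of the description above, not the kernel code: the `VCpu` class, the boolean queue map and the priority value are all hypothetical simplifications.

```python
import errno
import warnings
from dataclasses import dataclass, field


@dataclass
class VCpu:
    server: int
    # Hypothetical stand-in for the per-priority event queues:
    # priority -> True once a queue has been configured.
    queues: dict = field(default_factory=dict)


def try_pick_queue(vcpu, prio):
    """Loose model of xive_try_pick_queue(): warn and fail without a queue."""
    if not vcpu.queues.get(prio, False):
        warnings.warn(f"vCPU {vcpu.server}: no event queue for priority {prio}")
        return -errno.ENXIO
    return 0


def select_target(vcpus, server, prio):
    """Loose model of kvmppc_xive_select_target().

    Returns (0, chosen_server) on success, or (negative errno, None).
    """
    preferred = next((v for v in vcpus if v.server == server), None)
    if preferred is None:
        return -errno.EINVAL, None
    if try_pick_queue(preferred, prio) == 0:
        return 0, preferred.server
    # The preferred target failed: try every vCPU before giving up.
    for v in vcpus:
        if try_pick_queue(v, prio) == 0:
            return 0, v.server
    return -errno.EBUSY, None


if __name__ == "__main__":
    # No vCPU has a queue configured, as in the failing restore.
    print(select_target([VCpu(0), VCpu(1)], server=0, prio=0))
```

In this model, a restore that runs before any queue is configured fails with -EBUSY and emits a warning for each attempt, which matches the pattern I would expect to see in dmesg.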

Anyway, this looks wrong since QEMU is supposed to have already configured
the event queues at this point... Not sure what's happening here...

> [  180.176608] Modules linked in: vhost_net vhost tap kvm_hv kvm xt_CHECKSUM 
> xt_MASQUERADE nf_nat_tftp nf_conntrack_tftp tun bridge 8021q garp mrp stp llc 
> rfkill nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT ip6t_REJECT 
> nf_reject_ipv6 ip6t_rpfilter ipt_REJECT nf_reject_ipv4 xt_conntrack 
> ebtable_nat ebtable_broute ip6table_nat ip6table_mangle ip6table_raw 
> ip6table_security iptable_nat nf_nat iptable_mangle iptable_raw 
> iptable_security nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nfnetlink 
> ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter sunrpc 
> raid1 at24 regmap_i2c snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg 
> joydev snd_hda_codec snd_hda_core ofpart snd_hwdep crct10dif_vpmsum snd_seq 
> ipmi_powernv powernv_flash ipmi_devintf snd_seq_device mtd ipmi_msghandler 
> rtc_opal snd_pcm opal_prd i2c_opal snd_timer snd soundcore lz4 lz4_compress 
> zram ip_tables xfs libcrc32c dm_crypt amdgpu ast drm_vram_helper mfd_core 
> i2c_algo_bit gpu_sched drm_kms_helper mpt3sas
> [  180.176652]  syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm 
> vmx_crypto tg3 crc32c_vpmsum nvme raid_class scsi_transport_sas nvme_core 
> drm_panel_orientation_quirks i2c_core fuse
> [  180.176663] CPU: 16 PID: 5528 Comm: qemu-system-ppc Not tainted 
> 5.4.17-200.fc31.ppc64le #1
> [  180.176665] NIP:  c00800000a883c80 LR: c00800000a886db8 CTR: 
> c00800000a88a9e0
> [  180.176667] REGS: c000000767a17890 TRAP: 0700   Not tainted  
> (5.4.17-200.fc31.ppc64le)
> [  180.176668] MSR:  9000000000029033 <SF,HV,EE,ME,IR,DR,RI,LE>  CR: 48224248 
>  XER: 20040000
> [  180.176673] CFAR: c00800000a886db4 IRQMASK: 0 
>                GPR00: c00800000a886db8 c000000767a17b20 c00800000a8aed00 
> c0002005468a4480 
>                GPR04: 0000000000000000 0000000000000000 0000000000000000 
> 0000000000000001 
>                GPR08: c0002007142b2400 c0002007142b2400 0000000000000000 
> c00800000a8910f0 
>                GPR12: c00800000a88a488 c0000007fffed000 0000000000000000 
> 0000000000000000 
>                GPR16: 0000000149524180 00007ffff39bda78 00007ffff39bda30 
> 000000000000025c 
>                GPR20: 0000000000000000 0000000000000003 c0002006f13a0000 
> 0000000000000000 
>                GPR24: 0000000000001359 0000000000000000 c0000002f8c96c38 
> c0000002f8c80000 
>                GPR28: 0000000000000000 c0002006f13a0000 c0002006f13a4038 
> c000000767a17be4 
> [  180.176688] NIP [c00800000a883c80] xive_try_pick_queue+0x40/0xb8 [kvm]
> [  180.176693] LR [c00800000a886db8] kvmppc_xive_select_target+0x100/0x210 
> [kvm]
> [  180.176694] Call Trace:
> [  180.176696] [c000000767a17b20] [c000000767a17b70] 0xc000000767a17b70 
> (unreliable)
> [  180.176701] [c000000767a17b70] [c00800000a88b420] 
> kvmppc_xive_native_set_attr+0xf98/0x1760 [kvm]
> [  180.176705] [c000000767a17cc0] [c00800000a86392c] 
> kvm_device_ioctl+0xf4/0x180 [kvm]
> [  180.176710] [c000000767a17d10] [c0000000005380b0] do_vfs_ioctl+0xaa0/0xd90
> [  180.176712] [c000000767a17dd0] [c000000000538464] sys_ioctl+0xc4/0x110
> [  180.176716] [c000000767a17e20] [c00000000000b9d0] system_call+0x5c/0x68
> [  180.176717] Instruction dump:
> [  180.176719] 794ad182 0b0a0000 2c290000 41820080 89490010 2c0a0000 41820074 
> 78883664 
> [  180.176723] 7d094214 e9480070 7d470074 78e7d182 <0b070000> 2c2a0000 
> 41820054 81480078 
> [  180.176727] ---[ end trace 056a6dd275e20684 ]---
> 
> Let me know if I can provide more information

Yes, the QEMU command line, QEMU version and guest kernel version would help.
Also, what kind of workload is running inside the guest? Is this easy to
reproduce?

Cheers,

--
Greg

> 
> Thanks
> 
