Should have replied to the list!
-------- Forwarded Message --------
Subject: Re: [qubes-users] Re: AppVms being killed on resume due to
clock skew too large
Date: Sat, 1 Feb 2020 11:49:29 +0000
From: Mike Keehan <m...@keehan.net>
To: mmo...@disroot.org
On 2/1/20 10:27 AM, mmo...@disroot.org wrote:
Same problem again, this time not related to any socket closure.
Apparently related to systemd:
[41911.199732] audit: type=1104 audit(1580516883.707:119): pid=4917
uid=0 auid=4294967295 ses=4294967295 msg='op=PAM:setcred
grantors=pam_rootok acct="root" exe="/usr/lib/qubes/qrexec-agent"
hostname=? addr=? terminal=? res=success'
[41920.252871] clocksource: timekeeping watchdog on CPU0: Marking
clocksource 'tsc' as unstable because the skew is too large:
[41920.252927] clocksource: 'xen' wd_now: 2a1620baf67a wd_last:
2a140e3c5f9f mask: ffffffffffffffff
[41920.252972] clocksource: 'tsc' cs_now: ffffff88779d4270 cs_last:
5083a288ea9a mask: ffffffffffffffff
[41920.253013] tsc: Marking TSC unstable due to clocksource watchdog
[41921.161370] audit: type=1100 audit(1580516893.670:120): pid=4955
uid=0 auid=4294967295 ses=4294967295 msg='op=PAM:authentication
grantors=pam_rootok acct="root" exe="/usr/lib/qubes/qrexec-agent"
hostname=? addr=? terminal=? res=success'
[41921.163039] audit: type=1103 audit(1580516893.672:121): pid=4955
uid=0 auid=4294967295 ses=4294967295 msg='op=PAM:setcred
grantors=pam_rootok acct="root" exe="/usr/lib/qubes/qrexec-agent"
hostname=? addr=? terminal=? res=success'
[41921.176874] audit: type=1105 audit(1580516893.686:122): pid=4955
uid=0 auid=4294967295 ses=4294967295 msg='op=PAM:session_open
grantors=pam_keyinit,pam_limits,pam_systemd,pam_unix,pam_umask,pam_lastlog
acct="root" exe="/usr/lib/qubes/qrexec-agent" hostname=? addr=?
terminal=? res=success'
[41922.205481] audit: type=1106 audit(1580552389.038:123): pid=4955
uid=0 auid=4294967295 ses=4294967295 msg='op=PAM:session_close
grantors=pam_keyinit,pam_limits,pam_systemd,pam_unix,pam_umask,pam_lastlog
acct="root" exe="/usr/lib/qubes/qrexec-agent" hostname=? addr=?
terminal=? res=success'
[41922.205554] audit: type=1104 audit(1580552389.038:124): pid=4955
uid=0 auid=4294967295 ses=4294967295 msg='op=PAM:setcred
grantors=pam_rootok acct="root" exe="/usr/lib/qubes/qrexec-agent"
hostname=? addr=? terminal=? res=success'
*[41932.321374] systemd[4919]: segfault at 640550f11920 ip
0000640550345cbd sp 00007ffd40e80440 error 6 in systemd[6405502f6000+b7000]
[41932.321420] Code: 24 28 02 00 00 48 85 c9 74 0f 48 89 81 28 02 00 00
49 8b 84 24 28 02 00 00 48 85 c0 0f 84 a0 07 00 00 49 8b 94 24 20 02 00
00 <48> 89 90 20 02 00 00 49 c7 84 24 28 02 00 00 00 00 00 00 49 c7 84*
[41932.321515] audit: type=1701 audit(1580552399.156:125): auid=0 uid=0
gid=0 ses=4 pid=4919 comm="systemd" exe="/usr/lib/systemd/systemd"
sig=11 res=1
[41932.336794] audit: type=1130 audit(1580552399.171:126): pid=1 uid=0
auid=4294967295 ses=4294967295 msg='unit=systemd-coredump@0-4990-0
comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=?
terminal=? res=success'
[41932.627105] audit: type=1131 audit(1580552399.456:127): pid=1 uid=0
auid=4294967295 ses=4294967295 msg='unit=user@0 comm="systemd"
exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
[41932.636551] audit: type=1131 audit(1580552399.471:128): pid=1 uid=0
auid=4294967295 ses=4294967295 msg='unit=user-runtime-dir@0
comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=?
terminal=? res=success'
[41932.661359] audit: type=1131 audit(1580552399.495:129): pid=1 uid=0
auid=4294967295 ses=4294967295 msg='unit=systemd-coredump@0-4990-0
comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=?
terminal=? res=success'
[41934.482123] BUG: unable to handle kernel NULL pointer dereference at
0000000000000080
[41934.482143] PGD 0 P4D 0
[41934.482150] Oops: 0000 [#1] SMP PTI
[41934.482159] CPU: 0 PID: 5002 Comm: Compositor Tainted: G O
4.19.94-1.pvops.qubes.x86_64 #1
[41934.482178] RIP: 0010:mem_cgroup_page_lruvec+0x28/0x50
[41934.482189] Code: 00 00 0f 1f 44 00 00 0f 1f 44 00 00 48 8b 47 38 48
8b 17 48 85 c0 48 0f 44 05 dc d1 0c 01 48 c1 ea 36 48 8b 84 d0 48 0a 00
00 <48> 3b b0 80 00 00 00 75 12 f3 c3 48 8d 86 a0 a1 02 00 48 3b b0 80
[41934.482222] RSP: 0018:ffffc900011d3aa8 EFLAGS: 00010046
[41934.482232] RAX: 0000000000000000 RBX: ffffffff82369cc0 RCX:
ffffc900011d3ae8
[41934.482246] RDX: 0000000000000000 RSI: ffff8880f9fd5000 RDI:
ffffea0002adec00
[41934.482265] RBP: ffff88802f7e6fb8 R08: ffffc900011d3ae8 R09:
000000000001eb39
[41934.482279] R10: 00000000000fa000 R11: ffffffffffffffff R12:
ffff8880f9fd5000
[41934.482294] R13: ffffea0002adec00 R14: 0000000000000014 R15:
ffff88802f7e7000
[41934.482308] FS: 0000000000000000(0000) GS:ffff8880f5a00000(0000)
knlGS:0000000000000000
[41934.482323] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[41934.482335] CR2: 0000000000000080 CR3: 000000003c9da001 CR4:
00000000003606f0
[41934.482351] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[41934.482365] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
0000000000000400
[41934.482380] Call Trace:
[41934.482388] release_pages+0x12c/0x4b0
[41934.482397] tlb_flush_mmu_free+0x36/0x50
[41934.482406] unmap_page_range+0x8f0/0xd00
[41934.482415] unmap_vmas+0x4c/0xa0
[41934.482423] exit_mmap+0xb5/0x1a0
[41934.482432] mmput+0x5f/0x140
[41934.482443] flush_old_exec+0x597/0x6c0
[41934.482451] ? load_elf_phdrs+0x97/0xb0
[41934.482460] load_elf_binary+0x3d9/0x1224
[41934.482468] ? get_acl+0x1a/0x100
[41934.482477] search_binary_handler+0xa6/0x1c0
[41934.482487] __do_execve_file.isra.34+0x587/0x7e0
[41934.482498] __x64_sys_execve+0x34/0x40
[41934.482506] do_syscall_64+0x5b/0x190
[41934.482515] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[41934.482526] RIP: 0033:0x7c1fb7d15acb
[41934.482535] Code: Bad RIP value.
[41934.482543] RSP: 002b:00007c1fa7361b18 EFLAGS: 00000246 ORIG_RAX:
000000000000003b
[41934.482557] RAX: ffffffffffffffda RBX: 00007c1fa7361b40 RCX:
00007c1fb7d15acb
[41934.482572] RDX: 00007c1fa9b5f800 RSI: 00007c1fa7361b20 RDI:
00007c1fb7a22cd0
[41934.482586] RBP: 00007c1fa7361ba0 R08: 00007c1fa7361b38 R09:
00007c1fa7361b60
[41934.482600] R10: 00007c1fa7361b20 R11: 0000000000000246 R12:
00007c1fa7361bd8
[41934.482615] R13: 0000000000000000 R14: 000000005e355001 R15:
00007c1fa7361bf0
[41934.482630] Modules linked in: ip6table_filter ip6_tables
xt_conntrack ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack
nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c intel_rapl crct10dif_pclmul
crc32_pclmul crc32c_intel xen_netfront ghash_clmulni_intel
intel_rapl_perf pcspkr u2mfn(O) xenfs xen_privcmd xen_gntdev
xen_gntalloc xen_blkback xen_evtchn overlay xen_blkfront
[41934.482694] CR2: 0000000000000080
[41934.482703] ---[ end trace f587889938477959 ]---
[41934.482714] RIP: 0010:mem_cgroup_page_lruvec+0x28/0x50
[41934.482724] Code: 00 00 0f 1f 44 00 00 0f 1f 44 00 00 48 8b 47 38 48
8b 17 48 85 c0 48 0f 44 05 dc d1 0c 01 48 c1 ea 36 48 8b 84 d0 48 0a 00
00 <48> 3b b0 80 00 00 00 75 12 f3 c3 48 8d 86 a0 a1 02 00 48 3b b0 80
[41934.482756] RSP: 0018:ffffc900011d3aa8 EFLAGS: 00010046
[41934.482766] RAX: 0000000000000000 RBX: ffffffff82369cc0 RCX:
ffffc900011d3ae8
[41934.482780] RDX: 0000000000000000 RSI: ffff8880f9fd5000 RDI:
ffffea0002adec00
[41934.482794] RBP: ffff88802f7e6fb8 R08: ffffc900011d3ae8 R09:
000000000001eb39
[41934.482808] R10: 00000000000fa000 R11: ffffffffffffffff R12:
ffff8880f9fd5000
[41934.482822] R13: ffffea0002adec00 R14: 0000000000000014 R15:
ffff88802f7e7000
[41934.482837] FS: 0000000000000000(0000) GS:ffff8880f5a00000(0000)
knlGS:0000000000000000
[41934.482851] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[41934.482863] CR2: 00007c1fb7d15aa1 CR3: 000000003c9da001 CR4:
00000000003606f0
[41934.482877] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[41934.482891] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
0000000000000400
[41934.482905] Kernel panic - not syncing: Fatal exception
[41936.108632] Shutting down cpus with NMI
[41936.108774] Kernel Offset: disabled
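As a sanity check on the clocksource warning above, here is a rough decode of the watchdog numbers (a sketch only; I'm assuming the 'xen' clocksource ticks in nanoseconds, which I believe is the case for Xen PV time):

# Deltas taken from the wd_now/wd_last and cs_now/cs_last values above.
wd_delta = 0x2a1620baf67a - 0x2a140e3c5f9f    # 'xen' watchdog advance
print(wd_delta / 1e9)                         # ~8.9 -> seconds between checks
tsc_delta = 0xffffff88779d4270 - 0x5083a288ea9a
print(hex(tsc_delta))                         # enormous -> TSC jumped, not just drifted

If I'm reading this right, about 8.9 seconds passed between watchdog checks, while the TSC reading leapt to a value near the top of its 64-bit range, which is why the kernel gave up on it.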
Any idea what might be causing this issue?
Thanks.
January 31, 2020 5:08 PM, mmo...@disroot.org wrote:
Many thanks for the suggestion!
I'm not using any proprietary modules of any sort; below are the
only modules that are loaded in the AppVM that was killed (as you
can see, nothing really special):
Module Size Used by
fuse 126976 3
ip6table_filter 16384 1
ip6_tables 32768 1 ip6table_filter
xt_conntrack 16384 2
ipt_MASQUERADE 16384 1
iptable_nat 16384 1
nf_nat_ipv4 16384 2 ipt_MASQUERADE,iptable_nat
nf_nat 36864 1 nf_nat_ipv4
nf_conntrack 163840 4 xt_conntrack,nf_nat,ipt_MASQUERADE,nf_nat_ipv4
nf_defrag_ipv6 20480 1 nf_conntrack
nf_defrag_ipv4 16384 1 nf_conntrack
libcrc32c 16384 2 nf_conntrack,nf_nat
intel_rapl 24576 0
crct10dif_pclmul 16384 0
crc32_pclmul 16384 0
crc32c_intel 24576 1
ghash_clmulni_intel 16384 0
xen_netfront 32768 0
intel_rapl_perf 16384 0
pcspkr 16384 0
xenfs 16384 1
u2mfn 16384 0
xen_privcmd 24576 17 xenfs
xen_gntdev 24576 1
xen_gntalloc 16384 5
xen_blkback 49152 0
xen_evtchn 16384 6
overlay 122880 1
xen_blkfront 45056 6
The closure of the socket is probably related to borgmatic (which
I'm using as my backup mechanism for the AppVMs). But I don't think
it's related, since I have this enabled only on a few machines, and
even the ones that are not using borgmatic are terminated on resume.
I'm running out of ideas on this. What I did notice, though, is that
if the resume is done immediately after the suspend, it works fine
without any AppVM being killed, which seems to indicate an issue
with the clock (that's the only thing that comes to mind, especially
given the warning above), but I'm not sure if this is the root cause.
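To test that theory, here is a minimal sketch I could run inside an AppVM across a suspend/resume cycle (just an idea; it only assumes Python 3 and standard Linux sysfs, nothing Qubes-specific):

#!/usr/bin/env python3
# CLOCK_MONOTONIC does not advance during suspend, so a change in the
# (REALTIME - MONOTONIC) offset across suspend/resume shows how far
# the VM's wall clock jumped relative to its running time.
import time

def offset():
    return (time.clock_gettime(time.CLOCK_REALTIME)
            - time.clock_gettime(time.CLOCK_MONOTONIC))

with open("/sys/devices/system/clocksource/clocksource0/"
          "current_clocksource") as f:
    print("clocksource:", f.read().strip())

before = offset()
input("Suspend and resume now, then press Enter... ")
print("offset changed by %.3f s" % (offset() - before))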
Any more suggestions would be really appreciated!
As this is a different crash, maybe it is memory corruption.
Some information about which VM template is crashing may help,
and whether there are any VMs that never crash.
What type of machine are you using?