Bug#1059899: systemd-resolved: systemd-resolved takes up all memory on certain PTR queries and is then oom-killed

2024-01-03 Thread Luca Boccassi
Control: severity -1 normal
Control: tags -1 moreinfo

On Wed, 03 Jan 2024 12:02:40 +0100 Corin Langosch 
wrote:
> Package: systemd
> Version: 247.3-7+deb11u4
> Severity: critical
> File: systemd-resolved
> Justification: breaks the whole system

Your logs show that it restarts just fine after the oom kill, so it is
most definitely not "grave". Also, you did not attach your local
resolved.conf.

Also, oldstable is old. Try with backports or with upgrading to stable.

-- 
Kind regards,
Luca Boccassi



Bug#1059899:

2024-01-03 Thread Corin Langosch
I just disabled LLMNR by putting LLMNR=no in resolved.conf to see if it helps, 
but it still crashes as before.


Jan 03 12:16:23 system1 systemd-resolved[863]: Got DNS stub UDP query packet 
for id 10122
Jan 03 12:16:23 system1 systemd-resolved[863]: Looking up RR for 
54.49.125.74.in-addr.arpa IN A.
Jan 03 12:16:23 system1 systemd-resolved[863]: Switching to system DNS server 
192.168.0..
Jan 03 12:16:23 system1 systemd-resolved[863]: Sent message type=signal 
sender=n/a destination=n/a path=/org/freedesktop/resolve1 
interface=org.freedesktop.DBus.Properties member=PropertiesChanged
cookie=4 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
Jan 03 12:16:23 system1 systemd-resolved[863]: Cache miss for 
54.49.125.74.in-addr.arpa IN A
Jan 03 12:16:23 system1 systemd-resolved[863]: Transaction 21270 for 
<54.49.125.74.in-addr.arpa IN A> scope dns on */*.
Jan 03 12:16:23 system1 systemd-resolved[863]: Using feature level UDP+EDNS0 
for transaction 21270.
Jan 03 12:16:23 system1 systemd-resolved[863]: Using DNS server 192.168.0. for 
transaction 21270.
Jan 03 12:16:23 system1 systemd-resolved[863]: Sending query packet with id 
21270 of size 54.
Jan 03 12:16:23 system1 systemd-resolved[863]: Processing query...
Jan 03 12:16:23 system1 systemd-resolved[863]: Processing incoming packet on 
transaction 21270 (rcode=NXDOMAIN).
Jan 03 12:16:23 system1 systemd-resolved[863]: Verified we get a response at 
feature level UDP+EDNS0 from DNS server 192.168.0..
Jan 03 12:16:23 system1 systemd-resolved[863]: Added NXDOMAIN cache entry for 
54.49.125.74.in-addr.arpa IN ANY 1s
Jan 03 12:16:23 system1 systemd-resolved[863]: Transaction 21270 for 
<54.49.125.74.in-addr.arpa IN A> on scope dns on */* now complete with 
 from network (unsigned).
Jan 03 12:16:25 system1 kernel: systemd-resolve invoked oom-killer: 
gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
Jan 03 12:16:26 system1 kernel: CPU: 1 PID: 863 Comm: systemd-resolve Not 
tainted 5.10.0-27-amd64 #1 Debian 5.10.205-2
Jan 03 12:16:26 system1 kernel: Hardware name: QEMU Standard PC (Q35 + ICH9, 
2009), BIOS 1.13.0-1ubuntu1.1 04/01/2014
Jan 03 12:16:26 system1 kernel: Call Trace:
Jan 03 12:16:26 system1 kernel:  dump_stack+0x6b/0x83
Jan 03 12:16:26 system1 kernel:  dump_header+0x4a/0x1f4
Jan 03 12:16:26 system1 kernel:  oom_kill_process.cold+0xb/0x10
Jan 03 12:16:26 system1 kernel:  out_of_memory+0x1bd/0x4e0
Jan 03 12:16:26 system1 kernel:  __alloc_pages_slowpath.constprop.0+0xbcc/0xc90
Jan 03 12:16:26 system1 kernel:  __alloc_pages_nodemask+0x2de/0x310
Jan 03 12:16:26 system1 kernel:  pagecache_get_page+0x175/0x390
Jan 03 12:16:26 system1 kernel:  filemap_fault+0x6a2/0x900
Jan 03 12:16:26 system1 kernel:  ? xas_load+0x5/0x80
Jan 03 12:16:26 system1 kernel:  ext4_filemap_fault+0x2d/0x50 [ext4]
Jan 03 12:16:26 system1 kernel:  __do_fault+0x37/0x170
Jan 03 12:16:26 system1 kernel:  handle_mm_fault+0x124d/0x1c00
Jan 03 12:16:26 system1 kernel:  do_user_addr_fault+0x1b8/0x400
Jan 03 12:16:26 system1 kernel:  exc_page_fault+0x78/0x160
Jan 03 12:16:26 system1 kernel:  ? asm_exc_page_fault+0x8/0x30
Jan 03 12:16:26 system1 kernel:  asm_exc_page_fault+0x1e/0x30
Jan 03 12:16:26 system1 kernel: RIP: 0033:0x7fafdd0896c0
Jan 03 12:16:26 system1 kernel: Code: Unable to access opcode bytes at RIP 
0x7fafdd089696.
Jan 03 12:16:26 system1 kernel: RSP: 002b:7ffce16e3588 EFLAGS: 00010246
Jan 03 12:16:26 system1 kernel: RAX: 55d2218a9250 RBX: 7ffce16e35d8 
RCX: 0010
Jan 03 12:16:26 system1 kernel: RDX: 0200 RSI:  
RDI: 55d2218a9250
Jan 03 12:16:26 system1 kernel: RBP:  R08: 55d2218a9250 
R09: 7fafdd0f7be0
Jan 03 12:16:26 system1 kernel: R10: 006e R11: 0210 
R12: 55d2218a9250
Jan 03 12:16:26 system1 kernel: R13:  R14: 0028 
R15: 55d2218a921c
Jan 03 12:16:26 system1 kernel: Mem-Info:
Jan 03 12:16:26 system1 kernel: active_anon:98 inactive_anon:417531 
isolated_anon:0
 active_file:22 inactive_file:7 isolated_file:0
 unevictable:772 dirty:0 writeback:0
 slab_reclaimable:2917 slab_unreclaimable:58826
 mapped:15 shmem:923 pagetables:1262 bounce:0
 free:13625 free_pcp:2 free_cma:0
Jan 03 12:16:26 system1 kernel: Node 0 active_anon:392kB 
inactive_anon:1670124kB active_file:88kB inactive_file:28kB unevictable:3088kB 
isolated(anon):0kB isolated(file):0kB mapped:60kB dirty:0kB
writeback:0kB shmem:3692kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 
415744kB writeback_tmp:0kB kernel_stack:2256kB all_unreclaimable? yes
Jan 03 12:16:26 system1 kernel: Node 0 DMA free:8016kB min:360kB low:448kB 
high:536kB reserved_highatomic:0KB active_anon:0kB inactive_anon:7620kB 
active_file:0kB inactive_file:0kB unevictable:0kB
writepending:0kB present:15992kB managed:15908kB 

Bug#1059899: systemd-resolved: systemd-resolved takes up all memory on certain PTR queries and is then oom-killed

2024-01-03 Thread Corin Langosch
Package: systemd
Version: 247.3-7+deb11u4
Severity: critical
File: systemd-resolved
Justification: breaks the whole system
X-Debbugs-Cc: cor...@gmx.de

Dear Maintainer,

systemd-resolved takes up all memory on certain PTR queries within only a few 
milliseconds and is then oom-killed. This sometimes renders the whole system 
unusable.

I can reliably reproduce the issue using this simple command:

~# host 54.49.125.74.in-addr.arpa
;; connection timed out; no servers could be reached

I already tried to debug the issue a little, please find the details below. If 
you need more information please let me know.

~# systemd-resolve --version
systemd 247 (247.3-7+deb11u4)
+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP 
+GCRYPT +GNUTLS +ACL +XZ +LZ4 +ZSTD +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN 
+PCRE2 default-hierarchy=unified

~# cat /etc/systemd/resolved.conf 
[Resolve]
DNS=192.168.0.1
Domains=intern.xxx xxx.com

~# tcpdump -i any -n port 53
11:35:31.012308 loIn  IP 127.0.0.1.58906 > 127.0.0.53.53: 28856+ A? 
54.49.125.74.in-addr.arpa. (43)
11:35:31.014406 vlan4 Out IP x.x.x.x.50491 > 192.168.0.1.53: 29042+ [1au] A? 
54.49.125.74.in-addr.arpa. (54)
11:35:31.033018 vlan4 In  IP 192.168.0.1.53 > x.x.x.x.50491: 29042 NXDomain 
0/1/1 (114)

Jan 03 11:35:31 system1 systemd-resolved[40859]: SELinux enabled state cached 
to: disabled
Jan 03 11:35:31 system1 systemd-resolved[40859]: Positive Trust Anchors:
Jan 03 11:35:31 system1 systemd-resolved[40859]: . IN DS 20326 8 2 
e06d44b80b8f1d39a95c0b0d7c65d08458e880409bbc683457104237c7f8ec8d
Jan 03 11:35:31 system1 systemd-resolved[40859]: Negative trust anchors: 
10.in-addr.arpa 16.172.in-addr.arpa 17.172.in-addr.arpa 18.172.in-addr.arpa 
19.172.in-addr.arpa 20.172.in-addr.arpa 21.172.in-addr.arpa 22.172.in-addr.arpa 
23.172.in-addr.arpa 24.172.in-addr.arpa 25.172.in-addr.arpa 26.172.in-addr.arpa 
27.172.in-addr.arpa 28.172.in-addr.arpa 29.172.in-addr.arpa 30.172.in-addr.arpa 
31.172.in-addr.arpa 168.192.in-addr.arpa d.f.ip6.arpa corp home internal 
intranet lan local private test
Jan 03 11:35:31 system1 systemd-resolved[40859]: Using system hostname 
'system1'.
Jan 03 11:35:31 system1 systemd-resolved[40859]: New scope on link *, protocol 
dns, family *
Jan 03 11:35:31 system1 systemd-resolved[40859]: Found new link 9/vlan1000
Jan 03 11:35:31 system1 systemd-resolved[40859]: Found new link 8/vlan102
Jan 03 11:35:31 system1 systemd-resolved[40859]: Found new link 7/vlan101
Jan 03 11:35:31 system1 systemd-resolved[40859]: Found new link 6/vlan100
Jan 03 11:35:31 system1 systemd-resolved[40859]: Found new link 5/vlan98
Jan 03 11:35:31 system1 systemd-resolved[40859]: Found new link 4/vlan4
Jan 03 11:35:31 system1 systemd-resolved[40859]: Found new link 3/vlan3
Jan 03 11:35:31 system1 systemd-resolved[40859]: Found new link 2/vlan2
Jan 03 11:35:31 system1 systemd-resolved[40859]: Found new link 1/lo
Jan 03 11:35:31 system1 systemd-resolved[40859]: New scope on link vlan1000, 
protocol llmnr, family AF_INET6
Jan 03 11:35:31 system1 systemd-resolved[40859]: Transaction 3857 for  scope llmnr on vlan1000/INET6.
Jan 03 11:35:31 system1 systemd-resolved[40859]: Delaying llmnr transaction for 
51270us.
Jan 03 11:35:31 system1 systemd-resolved[40859]: New scope on link vlan102, 
protocol llmnr, family AF_INET6
Jan 03 11:35:31 system1 systemd-resolved[40859]: Transaction 15599 for  scope llmnr on vlan102/INET6.
Jan 03 11:35:31 system1 systemd-resolved[40859]: Delaying llmnr transaction for 
51008us.
Jan 03 11:35:31 system1 systemd-resolved[40859]: New scope on link vlan101, 
protocol llmnr, family AF_INET6
Jan 03 11:35:31 system1 systemd-resolved[40859]: Transaction 61000 for  scope llmnr on vlan101/INET6.
Jan 03 11:35:31 system1 systemd-resolved[40859]: Delaying llmnr transaction for 
81403us.
Jan 03 11:35:31 system1 systemd-resolved[40859]: New scope on link vlan100, 
protocol llmnr, family AF_INET6
Jan 03 11:35:31 system1 systemd-resolved[40859]: Transaction 62244 for  scope llmnr on vlan100/INET6.
Jan 03 11:35:31 system1 systemd-resolved[40859]: Delaying llmnr transaction for 
72225us.
Jan 03 11:35:31 system1 systemd-resolved[40859]: New scope on link vlan98, 
protocol llmnr, family AF_INET6
Jan 03 11:35:31 system1 systemd-resolved[40859]: Transaction 43244 for  scope llmnr on vlan98/INET6.
Jan 03 11:35:31 system1 systemd-resolved[40859]: Delaying llmnr transaction for 
68513us.
Jan 03 11:35:31 system1 systemd-resolved[40859]: New scope on link vlan4, 
protocol llmnr, family AF_INET6
Jan 03 11:35:31 system1 systemd-resolved[40859]: Transaction 65377 for  scope llmnr on vlan4/INET6.
Jan 03 11:35:31 system1 systemd-resolved[40859]: Delaying llmnr transaction for 
79327us.
Jan 03 11:35:31 system1 systemd-resolved[40859]: New scope on link vlan3, 
protocol llmnr, family AF_INET6
Jan 03 11:35:31 system1 systemd-resolved[40859]: Transaction 41217 for  scope llmnr on vlan3/INET6.
Jan 03 11:35:31 system1 systemd-resolved[40859]: Delaying llmnr