Bug#1059899: systemd-resolved: systemd-resolved takes up all memory on certain PTR queries and is then oom-killed
Control: severity -1 normal Control: tags -1 moreinfo On Wed, 03 Jan 2024 12:02:40 +0100 Corin Langosch wrote: > Package: systemd > Version: 247.3-7+deb11u4 > Severity: critical > File: systemd-resolved > Justification: breaks the whole system Your logs show that it restarts just fine after the oom kill, so it is most definitely not "grave". Also, you did not attach your local resolved.conf. Also, oldstable is old. Try with backports or with upgrading to stable. -- Kind regards, Luca Boccassi
Bug#1059899:
I just disabled LLMNR by putting LLMNR=no in resolved.conf to see if it helps, but it still crashes as before. Jan 03 12:16:23 system1 systemd-resolved[863]: Got DNS stub UDP query packet for id 10122 Jan 03 12:16:23 system1 systemd-resolved[863]: Looking up RR for 54.49.125.74.in-addr.arpa IN A. Jan 03 12:16:23 system1 systemd-resolved[863]: Switching to system DNS server 192.168.0.. Jan 03 12:16:23 system1 systemd-resolved[863]: Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/resolve1 interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=4 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a Jan 03 12:16:23 system1 systemd-resolved[863]: Cache miss for 54.49.125.74.in-addr.arpa IN A Jan 03 12:16:23 system1 systemd-resolved[863]: Transaction 21270 for <54.49.125.74.in-addr.arpa IN A> scope dns on */*. Jan 03 12:16:23 system1 systemd-resolved[863]: Using feature level UDP+EDNS0 for transaction 21270. Jan 03 12:16:23 system1 systemd-resolved[863]: Using DNS server 192.168.0. for transaction 21270. Jan 03 12:16:23 system1 systemd-resolved[863]: Sending query packet with id 21270 of size 54. Jan 03 12:16:23 system1 systemd-resolved[863]: Processing query... Jan 03 12:16:23 system1 systemd-resolved[863]: Processing incoming packet on transaction 21270 (rcode=NXDOMAIN). Jan 03 12:16:23 system1 systemd-resolved[863]: Verified we get a response at feature level UDP+EDNS0 from DNS server 192.168.0.. Jan 03 12:16:23 system1 systemd-resolved[863]: Added NXDOMAIN cache entry for 54.49.125.74.in-addr.arpa IN ANY 1s Jan 03 12:16:23 system1 systemd-resolved[863]: Transaction 21270 for <54.49.125.74.in-addr.arpa IN A> on scope dns on */* now complete with from network (unsigned). Jan 03 12:16:25 system1 kernel: systemd-resolve invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0 Jan 03 12:16:26 system1 kernel: CPU: 1 PID: 863 Comm: systemd-resolve Not tainted 5.10.0-27-amd64 #1 Debian 5.10.205-2 Jan 03 12:16:26 system1 kernel: Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-1ubuntu1.1 04/01/2014 Jan 03 12:16:26 system1 kernel: Call Trace: Jan 03 12:16:26 system1 kernel: dump_stack+0x6b/0x83 Jan 03 12:16:26 system1 kernel: dump_header+0x4a/0x1f4 Jan 03 12:16:26 system1 kernel: oom_kill_process.cold+0xb/0x10 Jan 03 12:16:26 system1 kernel: out_of_memory+0x1bd/0x4e0 Jan 03 12:16:26 system1 kernel: __alloc_pages_slowpath.constprop.0+0xbcc/0xc90 Jan 03 12:16:26 system1 kernel: __alloc_pages_nodemask+0x2de/0x310 Jan 03 12:16:26 system1 kernel: pagecache_get_page+0x175/0x390 Jan 03 12:16:26 system1 kernel: filemap_fault+0x6a2/0x900 Jan 03 12:16:26 system1 kernel: ? xas_load+0x5/0x80 Jan 03 12:16:26 system1 kernel: ext4_filemap_fault+0x2d/0x50 [ext4] Jan 03 12:16:26 system1 kernel: __do_fault+0x37/0x170 Jan 03 12:16:26 system1 kernel: handle_mm_fault+0x124d/0x1c00 Jan 03 12:16:26 system1 kernel: do_user_addr_fault+0x1b8/0x400 Jan 03 12:16:26 system1 kernel: exc_page_fault+0x78/0x160 Jan 03 12:16:26 system1 kernel: ? asm_exc_page_fault+0x8/0x30 Jan 03 12:16:26 system1 kernel: asm_exc_page_fault+0x1e/0x30 Jan 03 12:16:26 system1 kernel: RIP: 0033:0x7fafdd0896c0 Jan 03 12:16:26 system1 kernel: Code: Unable to access opcode bytes at RIP 0x7fafdd089696. Jan 03 12:16:26 system1 kernel: RSP: 002b:7ffce16e3588 EFLAGS: 00010246 Jan 03 12:16:26 system1 kernel: RAX: 55d2218a9250 RBX: 7ffce16e35d8 RCX: 0010 Jan 03 12:16:26 system1 kernel: RDX: 0200 RSI: RDI: 55d2218a9250 Jan 03 12:16:26 system1 kernel: RBP: R08: 55d2218a9250 R09: 7fafdd0f7be0 Jan 03 12:16:26 system1 kernel: R10: 006e R11: 0210 R12: 55d2218a9250 Jan 03 12:16:26 system1 kernel: R13: R14: 0028 R15: 55d2218a921c Jan 03 12:16:26 system1 kernel: Mem-Info: Jan 03 12:16:26 system1 kernel: active_anon:98 inactive_anon:417531 isolated_anon:0 active_file:22 inactive_file:7 isolated_file:0 unevictable:772 dirty:0 writeback:0 slab_reclaimable:2917 slab_unreclaimable:58826 mapped:15 shmem:923 pagetables:1262 bounce:0 free:13625 free_pcp:2 free_cma:0 Jan 03 12:16:26 system1 kernel: Node 0 active_anon:392kB inactive_anon:1670124kB active_file:88kB inactive_file:28kB unevictable:3088kB isolated(anon):0kB isolated(file):0kB mapped:60kB dirty:0kB writeback:0kB shmem:3692kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 415744kB writeback_tmp:0kB kernel_stack:2256kB all_unreclaimable? yes Jan 03 12:16:26 system1 kernel: Node 0 DMA free:8016kB min:360kB low:448kB high:536kB reserved_highatomic:0KB active_anon:0kB inactive_anon:7620kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15992kB managed:15908kB
Bug#1059899: systemd-resolved: systemd-resolved takes up all memory on certain PTR queries and is then oom-killed
Package: systemd Version: 247.3-7+deb11u4 Severity: critical File: systemd-resolved Justification: breaks the whole system X-Debbugs-Cc: cor...@gmx.de Dear Maintainer, systemd-resolved takes up all memory on certain PTR queries within only a few milliseconds and is then oom-killed. This sometimes renders the whole system unusable. I can reliably reproduce the issue using this simple command: ~# host 54.49.125.74.in-addr.arpa ;; connection timed out; no servers could be reached I already tried to debug the issue a little, please find the details below. If you need more information please let me know. ~# systemd-resolve --version systemd 247 (247.3-7+deb11u4) +PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +ZSTD +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=unified ~# cat /etc/systemd/resolved.conf [Resolve] DNS=192.168.0.1 Domains=intern.xxx xxx.com ~# tcpdump -i any -n port 53 11:35:31.012308 loIn IP 127.0.0.1.58906 > 127.0.0.53.53: 28856+ A? 54.49.125.74.in-addr.arpa. (43) 11:35:31.014406 vlan4 Out IP x.x.x.x.50491 > 192.168.0.1.53: 29042+ [1au] A? 54.49.125.74.in-addr.arpa. (54) 11:35:31.033018 vlan4 In IP 192.168.0.1.53 > x.x.x.x.50491: 29042 NXDomain 0/1/1 (114) Jan 03 11:35:31 system1 systemd-resolved[40859]: SELinux enabled state cached to: disabled Jan 03 11:35:31 system1 systemd-resolved[40859]: Positive Trust Anchors: Jan 03 11:35:31 system1 systemd-resolved[40859]: . IN DS 20326 8 2 e06d44b80b8f1d39a95c0b0d7c65d08458e880409bbc683457104237c7f8ec8d Jan 03 11:35:31 system1 systemd-resolved[40859]: Negative trust anchors: 10.in-addr.arpa 16.172.in-addr.arpa 17.172.in-addr.arpa 18.172.in-addr.arpa 19.172.in-addr.arpa 20.172.in-addr.arpa 21.172.in-addr.arpa 22.172.in-addr.arpa 23.172.in-addr.arpa 24.172.in-addr.arpa 25.172.in-addr.arpa 26.172.in-addr.arpa 27.172.in-addr.arpa 28.172.in-addr.arpa 29.172.in-addr.arpa 30.172.in-addr.arpa 31.172.in-addr.arpa 168.192.in-addr.arpa d.f.ip6.arpa corp home internal intranet lan local private test Jan 03 11:35:31 system1 systemd-resolved[40859]: Using system hostname 'system1'. Jan 03 11:35:31 system1 systemd-resolved[40859]: New scope on link *, protocol dns, family * Jan 03 11:35:31 system1 systemd-resolved[40859]: Found new link 9/vlan1000 Jan 03 11:35:31 system1 systemd-resolved[40859]: Found new link 8/vlan102 Jan 03 11:35:31 system1 systemd-resolved[40859]: Found new link 7/vlan101 Jan 03 11:35:31 system1 systemd-resolved[40859]: Found new link 6/vlan100 Jan 03 11:35:31 system1 systemd-resolved[40859]: Found new link 5/vlan98 Jan 03 11:35:31 system1 systemd-resolved[40859]: Found new link 4/vlan4 Jan 03 11:35:31 system1 systemd-resolved[40859]: Found new link 3/vlan3 Jan 03 11:35:31 system1 systemd-resolved[40859]: Found new link 2/vlan2 Jan 03 11:35:31 system1 systemd-resolved[40859]: Found new link 1/lo Jan 03 11:35:31 system1 systemd-resolved[40859]: New scope on link vlan1000, protocol llmnr, family AF_INET6 Jan 03 11:35:31 system1 systemd-resolved[40859]: Transaction 3857 for scope llmnr on vlan1000/INET6. Jan 03 11:35:31 system1 systemd-resolved[40859]: Delaying llmnr transaction for 51270us. Jan 03 11:35:31 system1 systemd-resolved[40859]: New scope on link vlan102, protocol llmnr, family AF_INET6 Jan 03 11:35:31 system1 systemd-resolved[40859]: Transaction 15599 for scope llmnr on vlan102/INET6. Jan 03 11:35:31 system1 systemd-resolved[40859]: Delaying llmnr transaction for 51008us. Jan 03 11:35:31 system1 systemd-resolved[40859]: New scope on link vlan101, protocol llmnr, family AF_INET6 Jan 03 11:35:31 system1 systemd-resolved[40859]: Transaction 61000 for scope llmnr on vlan101/INET6. Jan 03 11:35:31 system1 systemd-resolved[40859]: Delaying llmnr transaction for 81403us. Jan 03 11:35:31 system1 systemd-resolved[40859]: New scope on link vlan100, protocol llmnr, family AF_INET6 Jan 03 11:35:31 system1 systemd-resolved[40859]: Transaction 62244 for scope llmnr on vlan100/INET6. Jan 03 11:35:31 system1 systemd-resolved[40859]: Delaying llmnr transaction for 72225us. Jan 03 11:35:31 system1 systemd-resolved[40859]: New scope on link vlan98, protocol llmnr, family AF_INET6 Jan 03 11:35:31 system1 systemd-resolved[40859]: Transaction 43244 for scope llmnr on vlan98/INET6. Jan 03 11:35:31 system1 systemd-resolved[40859]: Delaying llmnr transaction for 68513us. Jan 03 11:35:31 system1 systemd-resolved[40859]: New scope on link vlan4, protocol llmnr, family AF_INET6 Jan 03 11:35:31 system1 systemd-resolved[40859]: Transaction 65377 for scope llmnr on vlan4/INET6. Jan 03 11:35:31 system1 systemd-resolved[40859]: Delaying llmnr transaction for 79327us. Jan 03 11:35:31 system1 systemd-resolved[40859]: New scope on link vlan3, protocol llmnr, family AF_INET6 Jan 03 11:35:31 system1 systemd-resolved[40859]: Transaction 41217 for scope llmnr on vlan3/INET6. Jan 03 11:35:31 system1 systemd-resolved[40859]: Delaying llmnr