Hi there, I am having trouble with a new build system. It runs normally and is stable until I put it under extreme load, e.g. by loading all 12 cores with the stress tool.
The system then suddenly loses its network connection and becomes unresponsive. Only a reset helps. I am not sure what is going on, but it is reproducible: put load on the system and it fails. It seems that something is getting out of step. I found the log entries below. I have tried quite a bit, including upgrading to bookworm to see whether the newer kernel helps. If anyone knows how to analyze this issue, it would be very helpful.

Kind regards
Christian

2023-05-20T20:12:17.054224+02:00 diskstation kernel: [ 1303.236428] ------------[ cut here ]------------
2023-05-20T20:12:17.054234+02:00 diskstation kernel: [ 1303.236430] NETDEV WATCHDOG: enp3s0 (r8169): transmit queue 0 timed out
2023-05-20T20:12:17.054235+02:00 diskstation kernel: [ 1303.236437] WARNING: CPU: 5 PID: 2411 at net/sched/sch_generic.c:525 dev_watchdog+0x207/0x210
2023-05-20T20:12:17.054236+02:00 diskstation kernel: [ 1303.236442] Modules linked in: eq3_char_loop(OE) rpi_rf_mod_led(OE) ledtrig_timer ledtrig_default_on xt_MASQUERADE nf_conntrack_netlink xfrm_user xfrm_algo xt_addrtype br_netfilter bridge stp llc overlay ip6t_rt nft_chain_nat nf_nat xt_set xt_tcpmss xt_tcpudp xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_compat nf_tables ip_set_hash_ip ip_set binfmt_misc nfnetlink nls_ascii nls_cp437 vfat fat amdgpu iwlmvm btusb intel_rapl_msr btrtl intel_rapl_common btbcm btintel edac_mce_amd btmtk mac80211 snd_hda_codec_realtek bluetooth snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi gpu_sched kvm_amd drm_buddy libarc4 snd_hda_intel drm_display_helper snd_intel_dspcfg snd_intel_sdw_acpi iwlwifi kvm cec snd_hda_codec jitterentropy_rng irqbypass rc_core snd_hda_core cfg80211 snd_hwdep drm_ttm_helper snd_pcm ttm drbg wmi_bmof rapl ccp snd_timer ansi_cprng drm_kms_helper sp5100_tco snd pcspkr ecdh_generic rng_core i2c_algo_bit watchdog soundcore k10temp rfkill hb_rf_usb_2(OE) ecc
2023-05-20T20:12:17.054240+02:00 diskstation kernel: [ 1303.236494] generic_raw_uart(OE) acpi_cpufreq button joydev evdev sg nct6775 nct6775_core drm hwmon_vid fuse loop efi_pstore configfs efivarfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 btrfs blake2b_generic xor raid6_pq zstd_compress libcrc32c crc32c_generic dm_crypt dm_mod hid_generic usbhid hid sd_mod crc32_pclmul crc32c_intel ahci ghash_clmulni_intel sha512_ssse3 libahci xhci_pci sha512_generic xhci_hcd r8169 nvme realtek libata aesni_intel nvme_core t10_pi crypto_simd mdio_devres usbcore scsi_mod crc64_rocksoft_generic cryptd libphy crc64_rocksoft crc_t10dif i2c_piix4 crct10dif_generic crct10dif_pclmul crc64 crct10dif_common usb_common scsi_common video wmi gpio_amdpt gpio_generic
2023-05-20T20:12:17.054241+02:00 diskstation kernel: [ 1303.236534] CPU: 5 PID: 2411 Comm: stress Tainted: G OE 6.1.0-9-amd64 #1 Debian 6.1.27-1
2023-05-20T20:12:17.054241+02:00 diskstation kernel: [ 1303.236536] Hardware name: To Be Filled By O.E.M. B550M-ITX/ac/B550M-ITX/ac, BIOS L2.62 01/31/2023
2023-05-20T20:12:17.054242+02:00 diskstation kernel: [ 1303.236537] RIP: 0010:dev_watchdog+0x207/0x210
2023-05-20T20:12:17.054242+02:00 diskstation kernel: [ 1303.236540] Code: 00 e9 40 ff ff ff 48 89 df c6 05 ff 5f 3d 01 01 e8 be 79 f9 ff 44 89 e9 48 89 de 48 c7 c7 c8 16 9b a8 48 89 c2 e8 09 d2 86 ff <0f> 0b e9 22 ff ff ff 66 90 0f 1f 44 00 00 55 53 48 89 fb 48 8b 6f
2023-05-20T20:12:17.054243+02:00 diskstation kernel: [ 1303.236541] RSP: 0000:ffffa831c345fdc8 EFLAGS: 00010286
2023-05-20T20:12:17.054243+02:00 diskstation kernel: [ 1303.236543] RAX: 0000000000000000 RBX: ffff91a3c1410000 RCX: 0000000000000000
2023-05-20T20:12:17.054243+02:00 diskstation kernel: [ 1303.236544] RDX: 0000000000000103 RSI: ffffffffa893fa66 RDI: 00000000ffffffff
2023-05-20T20:12:17.054244+02:00 diskstation kernel: [ 1303.236545] RBP: ffff91a3c1410488 R08: 0000000000000000 R09: ffffa831c345fc38
2023-05-20T20:12:17.054244+02:00 diskstation kernel: [ 1303.236546] R10: 0000000000000003 R11: ffff91aafe27afe8 R12: ffff91a3c14103dc
2023-05-20T20:12:17.054245+02:00 diskstation kernel: [ 1303.236547] R13: 0000000000000000 R14: ffffffffa7e2e7a0 R15: ffff91a3c1410488
2023-05-20T20:12:17.054245+02:00 diskstation kernel: [ 1303.236548] FS: 00007f169849d740(0000) GS:ffff91aade340000(0000) knlGS:0000000000000000
2023-05-20T20:12:17.054246+02:00 diskstation kernel: [ 1303.236550] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
2023-05-20T20:12:17.054246+02:00 diskstation kernel: [ 1303.236551] CR2: 000055d05c3f4000 CR3: 0000000103cf2000 CR4: 0000000000750ee0
2023-05-20T20:12:17.054246+02:00 diskstation kernel: [ 1303.236552] PKRU: 55555554
2023-05-20T20:12:17.054247+02:00 diskstation kernel: [ 1303.236553] Call Trace:
2023-05-20T20:12:17.054247+02:00 diskstation kernel: [ 1303.236554] <TASK>
2023-05-20T20:12:17.054248+02:00 diskstation kernel: [ 1303.236557] ? pfifo_fast_reset+0x140/0x140
2023-05-20T20:12:17.054248+02:00 diskstation kernel: [ 1303.236559] call_timer_fn+0x27/0x130
2023-05-20T20:12:17.054248+02:00 diskstation kernel: [ 1303.236562] __run_timers+0x21c/0x2a0
2023-05-20T20:12:17.054249+02:00 diskstation kernel: [ 1303.236565] run_timer_softirq+0x2b/0x50
2023-05-20T20:12:17.054249+02:00 diskstation kernel: [ 1303.236567] __do_softirq+0xf0/0x2fe
2023-05-20T20:12:17.054249+02:00 diskstation kernel: [ 1303.236570] __irq_exit_rcu+0xc7/0x130
2023-05-20T20:12:17.054250+02:00 diskstation kernel: [ 1303.236573] sysvec_apic_timer_interrupt+0x52/0xc0
2023-05-20T20:12:17.054250+02:00 diskstation kernel: [ 1303.236576] asm_sysvec_apic_timer_interrupt+0x16/0x20
2023-05-20T20:12:17.054251+02:00 diskstation kernel: [ 1303.236578] RIP: 0033:0x7f16984e085c
2023-05-20T20:12:17.054251+02:00 diskstation kernel: [ 1303.236579] Code: 48 89 44 24 08 31 c0 f0 0f b1 15 fb 3e 19 00 75 3d 48 8d 74 24 04 48 8d 3d f1 1f 19 00 e8 1c 04 00 00 31 c0 87 05 e0 3e 19 00 <83> f8 01 7f 2f 48 63 44 24 04 48 8b 54 24 08 64 48 2b 14 25 28 00
2023-05-20T20:12:17.054252+02:00 diskstation kernel: [ 1303.236581] RSP: 002b:00007fffb2c4cca0 EFLAGS: 00000246
2023-05-20T20:12:17.054252+02:00 diskstation kernel: [ 1303.236582] RAX: 0000000000000001 RBX: 0000000000000000 RCX: 00007f169867221c
2023-05-20T20:12:17.054253+02:00 diskstation kernel: [ 1303.236583] RDX: 00007f1698672228 RSI: 00007fffb2c4cca4 RDI: 00007f1698672840
2023-05-20T20:12:17.054253+02:00 diskstation kernel: [ 1303.236584] RBP: 00000000000080e8 R08: 00007f1698672228 R09: 00007f1698672260
2023-05-20T20:12:17.054253+02:00 diskstation kernel: [ 1303.236585] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000000
2023-05-20T20:12:17.054254+02:00 diskstation kernel: [ 1303.236586] R13: 0000565167761004 R14: 0000565167761a78 R15: 000000000000000b
2023-05-20T20:12:17.054254+02:00 diskstation kernel: [ 1303.236588] </TASK>
2023-05-20T20:12:17.054255+02:00 diskstation kernel: [ 1303.236589] ---[ end trace 0000000000000000 ]---
2023-05-20T20:12:17.086199+02:00 diskstation kernel: [ 1303.270878] r8169 0000:03:00.0 enp3s0: rtl_rxtx_empty_cond == 0 (loop: 42, delay: 100).
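
To check whether the same transmit timeout shows up in every failed run, a saved kernel log could be scanned for the NETDEV WATCHDOG lines. The following is only a minimal sketch in Python: the function name `find_tx_timeouts` and the regular expression are illustrative assumptions derived from the log format above, not part of any existing tool.

```python
import re

# Assumed line format, taken from the log above:
#   [ 1303.236430] NETDEV WATCHDOG: enp3s0 (r8169): transmit queue 0 timed out
WATCHDOG_RE = re.compile(
    r"\[\s*(?P<uptime>\d+\.\d+)\]\s+NETDEV WATCHDOG: "
    r"(?P<iface>\S+) \((?P<driver>[^)]+)\): "
    r"transmit queue (?P<queue>\d+) timed out"
)


def find_tx_timeouts(log_text):
    """Return one dict per NETDEV WATCHDOG transmit-timeout entry in log_text.

    finditer scans the whole text, so this also works if the log was
    pasted without line breaks.
    """
    events = []
    for match in WATCHDOG_RE.finditer(log_text):
        events.append({
            "uptime_s": float(match.group("uptime")),  # seconds since boot
            "iface": match.group("iface"),
            "driver": match.group("driver"),
            "queue": int(match.group("queue")),
        })
    return events


if __name__ == "__main__":
    sample = (
        "2023-05-20T20:12:17.054234+02:00 diskstation kernel: "
        "[ 1303.236430] NETDEV WATCHDOG: enp3s0 (r8169): "
        "transmit queue 0 timed out"
    )
    for event in find_tx_timeouts(sample):
        print(event)
```

Running this over the logs of several failed stress runs would show whether it is always the same interface, driver and queue that time out, and how long after boot it happens.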