Hi all,
here is the latest news.
The system lost access to the NVMe again and only recovered after a
power cycle. Pings still worked until that power cycle, so I assume the
NIC and the software above it were still functional.
I rebooted into the 6.5 backported kernel, downloaded the newest BIOS,
noticed the NIC getting lost, and wrote the BIOS image to a USB key.
Then I rebooted into the UEFI / BIOS control tool, flashed the newest
firmware, set all defaults plus conservative power saving settings, and
booted into Debian again.
Kernel is
# uname -a
Linux Zwerg 6.5.0-0.deb12.4-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.5.10-1~bpo12+1 (2023-11-23) x86_64 GNU/Linux
These are the latest events where the NIC got lost:
Jan 27 09:44:53 Zwerg kernel: igc 0000:0a:00.0 eno1: PCIe link lost, device now detached
Jan 27 09:48:05 Zwerg kernel: igc 0000:0a:00.0 (unnamed net_device) (uninitialized): PCIe link lost, device now detached
Jan 27 09:52:16 Zwerg kernel: igc 0000:0a:00.0 (unnamed net_device) (uninitialized): PCIe link lost, device now detached
Feb 01 04:19:17 Zwerg kernel: igc 0000:0a:00.0 eno1: PCIe link lost, device now detached
Feb 01 14:43:03 Zwerg kernel: igc 0000:0a:00.0 (unnamed net_device) (uninitialized): PCIe link lost, device now detached
Feb 08 18:33:38 Zwerg kernel: igc 0000:0a:00.0 eno1: PCIe link lost, device now detached
Feb 08 19:00:32 Zwerg kernel: igc 0000:0b:00.0 eno1: PCIe link lost, device now detached
Feb 08 19:02:38 Zwerg kernel: igc 0000:0b:00.0 (unnamed net_device) (uninitialized): PCIe link lost, device now detached
I think it's safe to say that the kernel version in use has no effect
on these events.
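To compare such events across boots and kernel versions, the "PCIe link
lost" lines can be counted per PCI address. The following is only a
rough sketch: the function name count_link_lost is made up for
illustration, and the example invocation in the comment assumes that
journald keeps a persistent journal, so that earlier boots are still
available.

```shell
# Count "PCIe link lost" events per PCI address.
# Reads kernel log text on stdin, one log entry per line.
count_link_lost() {
    grep -F 'PCIe link lost' |
        grep -oE '[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]' |
        sort | uniq -c
}

# Example invocation (kernel messages from all recorded boots):
#   journalctl --no-pager _TRANSPORT=kernel | count_link_lost
```

Each output line holds a count followed by the PCI address it belongs
to, which makes it easy to see whether the events moved from one
address to another.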
Naturally, the NVMe connectivity losses are not logged, but I believe
it would be worth trying to capture them. Perhaps sending the system
logs to USB storage would work. However, I think it is important to
understand whether this ticket's topic is a matter of the igc module,
or rather of the power or PCIe management functionality (of which I
know even less).
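The USB logging idea could look roughly like the sketch below. It is
untested on the affected machine. The function name log_to_file and the
mount point /mnt/usb are placeholders I made up, and it assumes the USB
key stays reachable when the NVMe disappears.

```shell
# Append every line from stdin to the given file and flush it to disk
# right away, so the last messages before a hang survive a power cycle.
log_to_file() {
    dest="$1"
    while IFS= read -r line; do
        printf '%s\n' "$line" >> "$dest"
        sync
    done
}

# Example invocation (/mnt/usb is a hypothetical mount point):
#   journalctl -k -f -o short-iso | log_to_file /mnt/usb/kernel.log
```

Calling sync after every line is slow, but kernel messages are rare
enough here that it should not matter, and it keeps the window for lost
messages small.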
The big question: What can I do to help pinpoint this problem further?
Thanks,
Arno
--
Arno Lehmann
IT-Service Lehmann
Sandstr. 6, 49080 Osnabrück