Processed: Re: Bug#1036644: linux-image-6.1.0-9-amd64: System crashes. Netconsole reports CPUs not responding to MCE broadcast

2023-05-23 Thread Debian Bug Tracking System
Processing control commands:

> found -1 6.1.25-1
Bug #1036644 [src:linux] linux-image-6.1.0-9-amd64: System crashes. Netconsole 
reports CPUs not responding to MCE broadcast
Marked as found in versions linux/6.1.25-1.
> retitle -1 Kernel panic - not syncing: Timeout: Not all CPUs entered 
> broadcast exception handler
Bug #1036644 [src:linux] linux-image-6.1.0-9-amd64: System crashes. Netconsole 
reports CPUs not responding to MCE broadcast
Changed Bug title to 'Kernel panic - not syncing: Timeout: Not all CPUs entered 
broadcast exception handler' from 'linux-image-6.1.0-9-amd64: System crashes. 
Netconsole reports CPUs not responding to MCE broadcast'.

-- 
1036644: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1036644
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems



Bug#1036644: linux-image-6.1.0-9-amd64: System crashes. Netconsole reports CPUs not responding to MCE broadcast

2023-05-23 Thread Diederik de Haas
Control: found -1 6.1.25-1
Control: retitle -1 Kernel panic - not syncing: Timeout: Not all CPUs entered 
broadcast exception handler

On Tuesday, 23 May 2023 18:49:00 CEST Olivier Berger wrote:
> It used to work fine with 6.1.0-7 but has had problems with the 2 later
> updates of the testing kernel.

The stack traces should be useful for someone who understands those (which
isn't me), but I did notice several other items:

- [  465.284645] GPT: Use GNU Parted to correct GPT errors
Did that happen after you plugged in a USB drive?
I would follow that advice, but it would be useful to get that USB drive
'out of the equation'.
Does the issue also occur when that USB drive isn't used?
The kernel seems to assign both sda and sdb before settling on sda(1)?
Not sure what to make of that, but it doesn't look good

- [  535.857315] EXT4-fs (dm-0): recovery complete
I can understand a FS recovery when you're dealing with a freeze/crash,
but I find the timing a 'bit' unusual. After 9.5 minutes, I doubt it's the
primary/boot drive (and we had the USB drive before that), so where
is that coming from?

- [  543.576681] systemd-journald[428]: Sent WATCHDOG=1 notification
I'm not really sure what that means, but afaik a watchdog is used to
(automatically) reboot the machine if the system hangs.
So seeing that message numerous times is worrisome. And it looks like it
doesn't do its actual job?

- BIOS T70 Ver. 01.13.01 03/30/2023
Can you check whether there is a newer BIOS version available?
I believe 'NMI' is BIOS related, so it may have an effect.
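
One possible reading of the GPT lines in the reporter's log: the backup header
is recorded at LBA 2590719, while the last LBA of the stick is 7897087 (the sd
line reports 7897088 blocks of 512 bytes). This pattern is commonly seen when a
disk image smaller than the device was written to it raw; whether that is what
happened here is an assumption. A minimal Python sketch of the arithmetic (the
helper name is made up for illustration):

```python
import re

SECTOR_SIZE = 512  # bytes per logical block, as reported by the "[sda]" line


def parse_gpt_mismatch(line: str) -> tuple[int, int]:
    """Return (recorded_lba, expected_lba) from a 'GPT:a != b' kernel message."""
    match = re.search(r"GPT:\s*(\d+)\s*!=\s*(\d+)", line)
    if match is None:
        raise ValueError(f"not a GPT mismatch line: {line!r}")
    return int(match.group(1)), int(match.group(2))


# Log line copied from the report.
recorded, expected = parse_gpt_mismatch("[  465.284627] GPT:2590719 != 7897087")

# LBAs are zero-based, so the number of sectors is the last LBA plus one.
image_bytes = (recorded + 1) * SECTOR_SIZE
disk_bytes = (expected + 1) * SECTOR_SIZE

print(f"backup header recorded at LBA {recorded}, last LBA of disk is {expected}")
print(f"table describes {image_bytes / 1e9:.2f} GB on a {disk_bytes / 1e9:.2f} GB device")
```

Under that reading the partition table describes roughly a third of the stick,
which would explain the warning without implying a failing device.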



Bug#1036644: linux-image-6.1.0-9-amd64: System crashes. Netconsole reports CPUs not responding to MCE broadcast

2023-05-23 Thread Olivier Berger
Hi.

To give a slightly more useful hint: the latest version that works fine is
linux-image-6.1.0-7-amd64, at package version 6.1.20-2.

Sorry about the lack of clarity in the initial report.

On Tue, May 23, 2023 at 06:49:00PM +0200, Olivier Berger wrote:
> 
> I'm experiencing crashes (the computer resets or shuts down completely) without
> many details available on why. It used to work fine with 6.1.0-7 but has had
> problems with the 2 later updates of the testing kernel.
> 
> I've managed to get a log of the kernel panic with netconsole (otherwise I
> wouldn't get any hints whatsoever in the on-disk logs after restarting), below.
> 
> I guess this is nasty as being close to the freeze. I've had the issue for a 
> few days now, but only managed to test a netconsole remote log today.
> 
> It seems to me that the crashes mainly happen when I'm away from the laptop for
> several minutes, so they may be related to some kind of power saving feature...
> 
> Hope this provides enough details to help.
> 

-- 
Olivier BERGER
https://www-public.imtbs-tsp.eu/~berger_o/ - OpenPGP 2048R/0xF9EAE3A65819D7E8
Ingenieur Recherche - Dept INF
Institut Mines-Telecom, Telecom SudParis, Evry (France)



Bug#1036644: linux-image-6.1.0-9-amd64: System crashes. Netconsole reports CPUs not responding to MCE broadcast

2023-05-23 Thread Olivier Berger
Package: src:linux
Version: 6.1.27-1
Severity: normal

Hi.

I'm experiencing crashes (the computer resets or shuts down completely) without
many details available on why. It used to work fine with 6.1.0-7 but has had
problems with the 2 later updates of the testing kernel.

I've managed to get a log of the kernel panic with netconsole (otherwise I
wouldn't get any hints whatsoever in the on-disk logs after restarting), below.
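
For anyone wanting to reproduce this kind of capture: netconsole sends kernel
messages as plain UDP datagrams, so the receiving machine only needs a UDP
listener. A minimal Python sketch of such a receiver follows. The port number
is an assumption (6666 is netconsole's documented default target port; the
ports are blank in the log), and the function names are illustrative only.

```python
import socket

# Assumption: netconsole's default target port. The actual port used in this
# report is not visible in the log.
ASSUMED_PORT = 6666


def format_record(payload: bytes, sender: str) -> str:
    """Turn one received datagram into a printable line prefixed by the sender."""
    text = payload.decode("utf-8", errors="replace").rstrip("\n")
    return f"{sender} {text}"


def listen(port: int = ASSUMED_PORT) -> None:
    """Print every datagram arriving on the given UDP port until interrupted."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind(("0.0.0.0", port))
        while True:
            payload, (host, _sender_port) = sock.recvfrom(65535)
            # Flush immediately: the last lines before a crash matter most.
            print(format_record(payload, host), flush=True)
```

Calling `listen()` on the receiving machine and redirecting its output to a
file keeps the messages even when the crashing machine never writes them to
disk.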

I guess this is nasty as being close to the freeze. I've had the issue for a 
few days now, but only managed to test a netconsole remote log today.

It seems to me that the crashes mainly happen when I'm away from the laptop for
several minutes, so they may be related to some kind of power saving feature...

Hope this provides enough details to help.

[  394.735702] netpoll: netconsole: local port 
[  394.735711] netpoll: netconsole: local IPv4 address 192.168.0.23
[  394.735715] netpoll: netconsole: interface 'enp2s0'
[  394.735717] netpoll: netconsole: remote port 
[  394.735719] netpoll: netconsole: remote IPv4 address 192.168.0.47
[  394.735722] netpoll: netconsole: remote ethernet address 38:2c:4a:b1:63:94
[  394.735819] printk: console [netcon0] enabled
[  394.735825] netconsole: network logging started
[  463.655009] usb 3-6: new high-speed USB device number 8 using xhci_hcd
[  463.659448] systemd-journald[428]: Sent WATCHDOG=1 notification.
[  463.943099] usb 3-6: New USB device found, idVendor=1307, idProduct=0190, bcdDevice= 1.00
[  463.943133] usb 3-6: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[  463.943144] usb 3-6: Product: USB Mass Storage Device
[  463.943153] usb 3-6: Manufacturer: USBest Technology
[  463.943160] usb 3-6: SerialNumber: 00027F
[  463.974560] systemd-journald[428]: Successfully sent stream file descriptor to service manager.
[  463.974717] systemd-journald[428]: Successfully sent stream file descriptor to service manager.
[  463.987184] SCSI subsystem initialized
[  463.990687] usb-storage 3-6:1.0: USB Mass Storage device detected
[  463.990771] scsi host0: usb-storage 3-6:1.0
[  463.990859] usbcore: registered new interface driver usb-storage
[  463.992482] usbcore: registered new interface driver uas
[  464.995952] scsi 0:0:0:0: Direct-Access Ut190USB2FlashStorage 0.00 PQ: 0 ANSI: 2
[  464.996613] scsi 0:0:0:1: Direct-Access Ut190SD0StorageDevice 0.00 PQ: 0 ANSI: 2
[  465.008300] scsi 0:0:0:0: Attached scsi generic sg0 type 0
[  465.008343] scsi 0:0:0:1: Attached scsi generic sg1 type 0
[  465.014353] sd 0:0:0:0: [sda] 7897088 512-byte logical blocks: (4.04 GB/3.77 GiB)
[  465.014619] sd 0:0:0:1: [sdb] Media removed, stopped polling
[  465.014756] sd 0:0:0:0: [sda] Write Protect is off
[  465.014764] sd 0:0:0:0: [sda] Mode Sense: 00 00 00 00
[  465.014804] sd 0:0:0:1: [sdb] Attached SCSI removable disk
[  465.014951] sd 0:0:0:0: [sda] Asking for cache data failed
[  465.014957] sd 0:0:0:0: [sda] Assuming drive cache: write through
[  465.284600] GPT:Primary header thinks Alt. header is not at the end of the disk.
[  465.284627] GPT:2590719 != 7897087
[  465.284634] GPT:Alternate GPT header not at the end of the disk.
[  465.284640] GPT:2590719 != 7897087
[  465.284645] GPT: Use GNU Parted to correct GPT errors.
[  465.284659]  sda: sda1
[  465.285144] sd 0:0:0:0: [sda] Attached SCSI removable disk
[  474.111368] systemd-journald[428]: Successfully sent stream file descriptor to service manager.
[  497.264500] sda: detected capacity change from 7897088 to 0
[  502.045711] usb 3-6: USB disconnect, device number 8
[  519.695345] systemd-journald[428]: Successfully sent stream file descriptor to service manager.
[  535.857315] EXT4-fs (dm-0): recovery complete
[  535.858056] EXT4-fs (dm-0): mounted filesystem with ordered data mode. Quota mode: none.
[  543.576681] systemd-journald[428]: Sent WATCHDOG=1 notification.
[  551.263395] systemd-journald[428]: Successfully sent stream file descriptor to service manager.
[  634.375963] systemd-journald[428]: Sent WATCHDOG=1 notification.
[  725.578095] systemd-journald[428]: Sent WATCHDOG=1 notification.
[  845.577721] systemd-journald[428]: Sent WATCHDOG=1 notification.
[  871.117193] systemd-journald[428]: Successfully sent stream file descriptor to service manager.
[  905.577391] systemd-journald[428]: Sent WATCHDOG=1 notification.
[  905.620289] systemd-journald[428]: Successfully sent stream file descriptor to service manager.
[  905.623541] systemd-journald[428]: Successfully sent stream file descriptor to service manager.
[  995.577111] systemd-journald[428]: Sent WATCHDOG=1 notification.
[ 1085.576193] systemd-journald[428]: Sent WATCHDOG=1 notification.
[ 1205.575316] systemd-journald[428]: Sent WATCHDOG=1 notification.
[ 1265.574866] systemd-journald[428]: Sent WATCHDOG=1 notification.
[ 1305.267119] mce: CPUs not responding to MCE broadcast (may include false positives): 0-1,3-5,7
[ 1305.267121] mce: CPUs not responding to MCE broadcast (may include 
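
The CPU list in the first MCE message above ("0-1,3-5,7") uses the kernel's
usual range-list notation. A small Python sketch that expands it is shown here;
the total of 8 logical CPUs is an assumption (the log only shows that a CPU 7
exists), and the function name is illustrative.

```python
def expand_cpulist(cpulist: str) -> list[int]:
    """Expand a kernel-style CPU list such as '0-1,3-5,7' into CPU numbers."""
    cpus: list[int] = []
    for part in cpulist.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


# List copied from the "CPUs not responding to MCE broadcast" line.
not_responding = expand_cpulist("0-1,3-5,7")

# Assumption: the machine has 8 logical CPUs, numbered 0 to 7.
ASSUMED_CPU_COUNT = 8
responding = sorted(set(range(ASSUMED_CPU_COUNT)) - set(not_responding))

print("not responding:", not_responding)
print("responding (under the 8-CPU assumption):", responding)
```

Under that assumption only CPUs 2 and 6 entered the handler in time, keeping in
mind that the kernel itself warns the list may include false positives.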

Re: Information About The Linux Kernel Maintenance In Debian

2023-05-23 Thread maximilian attems
Dear Federico,

> can't easily find the following information:
> 
> - Criteria to select a kernel version for a Debian release. It looks to me you
>   are following LTS releases, but as you know kernel LTS is a moving target in
>   terms of duration. So, how do you choose?

This depends on the Debian release cycle, which the Debian release
team sets and announces on the debian-release mailing list.
Once the dates of the release cycle are known, the Debian kernel
team tries to pick a recent enough LTS release, balanced against
that release having had conservative exposure to enough hardware.
 
> - How much does a Debian kernel diverge from the kernel.org release over time?

The stable kernel does not in general add hardware backports.
The amount in Debian depends on who has a vested interest. If we
get enough bug reports, or someone makes it easy in GitLab
to merge newer hardware support, that happens too. The driver
has to be released in mainline to qualify.
 
> - I see you explain how to build and run any kernel from kernel.org, but I do
>   not see any discouragement from doing so. Is this because you do not see any
>   known incompatibilities?

For security maintenance we encourage using the Debian kernel.
Of course, if you have the in-house capability to follow whatever
LTS release you choose, there will be no trouble (unless you set HZ to
some funny value or disable features glibc assumes).

We did optimize certain architectures for size, but due to the
time constraints involved this was dropped.

Hope this helps; do not hesitate to follow up.


Thank you for your interest (:
maximilian



Bug#1036633: firmware-iwlwifi: Wireless AC 7265 lacks D3cold support

2023-05-23 Thread Giovanni
Package: firmware-iwlwifi
Version: 20230310-1~exp2
Severity: normal
Tags: upstream
X-Debbugs-Cc: u...@junocomputers.com

Dear Maintainer,

If D3Cold is enabled in the BIOS, the tablet boots without Wi-Fi. The only two
ways to get Wi-Fi working are either disabling D3Cold in the BIOS (not ideal) or
adding pcie_port_pm=off to the kernel command line in GRUB.
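
On Debian the parameter is usually added to GRUB_CMDLINE_LINUX_DEFAULT in
/etc/default/grub, followed by running update-grub. A minimal Python sketch for
checking whether the workaround is active on a booted system follows; the
sample command line and the function names are made up for illustration.

```python
import shlex
from typing import Dict, Optional


def parse_cmdline(cmdline: str) -> Dict[str, Optional[str]]:
    """Split a kernel command line into {parameter: value or None for flags}."""
    params: Dict[str, Optional[str]] = {}
    for token in shlex.split(cmdline):
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def port_pm_disabled(cmdline: str) -> bool:
    """True if PCIe port power management was turned off at boot."""
    return parse_cmdline(cmdline).get("pcie_port_pm") == "off"


# Hypothetical command line; on a real system read /proc/cmdline instead.
sample = "BOOT_IMAGE=/boot/vmlinuz-6.3.0-0-amd64 root=/dev/sda1 ro quiet pcie_port_pm=off"
print(port_pm_disabled(sample))
```

On the affected tablet the same check can be run against the real command line
with `port_pm_disabled(open("/proc/cmdline").read())`.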

[   12.681303] iwlwifi :01:00.0: Detected Intel(R) Dual Band Wireless AC 7265, REV=0x210



-- System Information:
Debian Release: 12.0
  APT prefers testing-security
  APT policy: (500, 'testing-security'), (500, 'unstable'), (500, 'testing'), (1, 'experimental')
Architecture: amd64 (x86_64)

Kernel: Linux 6.3.0-0-amd64 (SMP w/4 CPU threads; PREEMPT)
Locale: LANG=C, LC_CTYPE=C.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

firmware-iwlwifi depends on no packages.

firmware-iwlwifi recommends no packages.

Versions of packages firmware-iwlwifi suggests:
ii  initramfs-tools  0.142

-- no debconf information



Information About The Linux Kernel Maintenance In Debian

2023-05-23 Thread Federico Vaga

Dear kernel maintainers,

I'm a CERN employee currently evaluating Debian as a possible solution for the
systems we use to control particle accelerators. I would like to know more about
how the Debian community handles the Linux kernel integration. In particular, I
can't easily find the following information:

- Criteria to select a kernel version for a Debian release. It looks to me you
  are following LTS releases, but as you know kernel LTS is a moving target in
  terms of duration. So, how do you choose?

- How much does a Debian kernel diverge from the kernel.org release over time?

- I see you explain how to build and run any kernel from kernel.org, but I do
  not see any discouragement from doing so. Is this because you do not see any
  known incompatibilities?

Thanks :)

--
Federico Vaga