Bug#989714: kdump-tools is broken out of the box

2024-02-01 Thread Dominique MARTINET
Hi,

reviving this...

(Rich, sorry for the duplicate mail: my first reply went to your
message, which didn't have 989...@bugs.debian.org anywhere in it, so it
wasn't tagged properly. Hopefully this one gets through.)

Rich Ercolani wrote on Fri, Jun 11, 2021 at 03:43:17AM -0400:
> I installed kdump-tools to take a crash dump, rebooted, verified the
> crashkernel was configured, triggered the problem I wanted to examine
> and a dump... the machine became entirely unresponsive over ssh or
> local console (kind of expected) but didn't print any sign it was
> doing anything like booting the crashkernel (bad).
>
> I left it for 15 minutes, and nothing changed, so I hard rebooted it,
> and tried again, same result.
>
> So I tried installing kdump-tools and then using
> echo 'c' | sudo tee /proc/sysrq-trigger on bullseye, same outcome.
> Same on jessie/x86_64 (with manual configuration of crashkernel= in
> the grub config).

I also got bitten by this.
What's quite horrible is that when it happened on the real machine I
wanted to debug, there was no sign it was doing anything: the HDMI
screen setup probably didn't have time to happen in the crash kernel
for it to print anything, so even connecting a screen wouldn't have
helped.

I also misread the 384M-:128M syntax as 384M@128M (the second number
being a location in the latter case), so I tried increasing the first
value, which obviously had no effect... I then tried in a VM, where the
serial console works and the problem was clear enough, but the default
experience was just horrible, especially since the system never came
back.
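
To make the difference between the two forms concrete, here is a
minimal Python sketch of how each one reads, going by the kernel's
documented crashkernel= syntax. The function name describe_crashkernel
is made up for illustration, and the sketch only handles a single entry
without a trailing @offset.

```python
def describe_crashkernel(value):
    """Explain a single crashkernel= value in words (illustrative helper)."""
    if ":" in value:
        # Range form "start-[end]:size": the part before ':' is a RAM range,
        # the part after it is the amount to reserve.
        range_part, size = value.split(":", 1)
        start, _, end = range_part.partition("-")
        text = f"reserve {size} when RAM is at least {start}"
        if end:
            text += f" and below {end}"
        return text
    # Plain form "size[@offset]": the first number is the amount to reserve.
    size, _, offset = value.partition("@")
    where = f"at offset {offset}" if offset else "at a kernel-chosen offset"
    return f"reserve {size} {where}"


print(describe_crashkernel("384M-:128M"))
print(describe_crashkernel("384M@128M"))
```

Read this way, raising the first number of 384M-:128M only raises the
RAM threshold; the reserved size is the number after the colon.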

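We probably ought to add 'panic=30' (or some other arbitrary timeout)
to KDUMP_CMDLINE_APPEND's default value, so that a crash kernel which
itself fails at least reboots the machine instead of hanging forever.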
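
As a rough sketch of that change, assuming we only want to add the
option when no panic= setting is present yet, the logic could look like
the Python below. The helper name with_panic_timeout and the example
command line are hypothetical; the example string is a placeholder, not
the package's actual default.

```python
def with_panic_timeout(cmdline, seconds=30):
    """Return cmdline with panic=<seconds> appended unless panic= is already set."""
    options = cmdline.split()
    if any(opt.startswith("panic=") for opt in options):
        # Respect an explicit choice made by the administrator.
        return cmdline
    return " ".join(options + [f"panic={seconds}"])


# Placeholder value, standing in for whatever KDUMP_CMDLINE_APPEND holds.
print(with_panic_timeout("reset_devices nr_cpus=1"))
```

The kernel's panic=N option makes it reboot N seconds after a panic, so
a crash kernel that dies early would bring the machine back by itself.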


> So I looted part of the crashkernel= setting from the Ubuntu system
> (crashkernel=512M-:192M was theirs, I used 384M-:192M) - no change.
> Tried 384M-:256M, and it worked. So I tried theirs verbatim, and it
> also worked every time.
> 
> So maybe we need different defaults on at least x86_64 systems?

I haven't tried with less memory, but I'd say we can use the range
syntax to provide bigger values, at least when the system has more than
a few GB of RAM.
I can spend a bit of time trying various values in a VM, but something
like crashkernel=512M-4G:192M,4G-64G:256M,64G-:384M is probably
sensible?
(The lowest value comes from Ubuntu's settings; I would need to test how
much is needed for a system with 384MB, but I'd be reluctant to take
half of its RAM for the crashkernel.)
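
To see what such a value would give on different machines, here is a
small Python sketch of the range selection as the kernel documentation
describes it (start inclusive, end exclusive, first match wins). The
names parse_size and reserved_bytes are mine, and this only models the
documented behaviour; it is not the kernel's parser.

```python
import re

_UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30, "T": 1 << 40}


def parse_size(text):
    """Convert a size such as '192M' or '4G' to bytes."""
    match = re.fullmatch(r"(\d+)([KMGT]?)", text.strip().upper())
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def reserved_bytes(spec, ram_bytes):
    """Return the reservation a range-form crashkernel= value selects."""
    spec = spec.split("@", 1)[0]  # ignore an optional trailing @offset
    for entry in spec.split(","):
        range_part, size = entry.split(":")
        start, end = range_part.split("-")
        lower = parse_size(start)
        upper = parse_size(end) if end else float("inf")
        if lower <= ram_bytes < upper:
            return parse_size(size)
    return 0  # RAM falls outside every range: nothing is reserved


proposal = "512M-4G:192M,4G-64G:256M,64G-:384M"
for ram in ("384M", "2G", "8G", "128G"):
    mib = reserved_bytes(proposal, parse_size(ram)) >> 20
    print(f"{ram} of RAM reserves {mib}M")
```

If this model is right, a 384MB machine would get no reservation at all
with the value above, since it falls below the first range.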

> (I specify x86_64 because using 512M-:192M breaks crashkernel more on
> my i386 testbeds.)

(Can't help with i386, though.)

Thanks,
--
Dominique Martinet



Bug#989714: kdump-tools is broken out of the box

2021-06-11 Thread Rich Ercolani
Package: kdump-tools
Version: 1:1.6.5-1
Severity: important

Dear Maintainer,

(This part also applies to jessie/x86_64 and bullseye/x86_64, in
addition to buster/x86_64.)

I installed kdump-tools to take a crash dump, rebooted, verified the
crashkernel was configured, triggered the problem I wanted to examine
and a dump... the machine became entirely unresponsive over ssh or
local console (kind of expected) but didn't print any sign it was doing
anything like booting the crashkernel (bad).

I left it for 15 minutes, and nothing changed, so I hard rebooted it,
and tried again, same result.

So I tried installing kdump-tools and then using
echo 'c' | sudo tee /proc/sysrq-trigger on bullseye, same outcome.
Same on jessie/x86_64 (with manual configuration of crashkernel= in the
grub config).
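
For anyone repeating the test, one way to check the state before
triggering a crash is to read the kernel's sysfs files
kexec_crash_loaded and kexec_crash_size. The Python below is only a
sketch; the function name kdump_status and its sysfs_kernel parameter
(there so the function can be pointed at a test directory) are made up
for illustration.

```python
from pathlib import Path


def kdump_status(sysfs_kernel="/sys/kernel"):
    """Return (crash kernel loaded?, reserved bytes) as reported by sysfs."""
    base = Path(sysfs_kernel)
    # kexec_crash_loaded contains "1" once a crash kernel has been loaded.
    loaded = (base / "kexec_crash_loaded").read_text().strip() == "1"
    # kexec_crash_size contains the size of the reserved region in bytes.
    reserved = int((base / "kexec_crash_size").read_text().strip())
    return loaded, reserved
```

On a running system, kdump_status() with no arguments reads the real
files. A loaded crash kernel with a non-zero reservation does not prove
the reservation is big enough, which appears to be the failure here.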

So I booted Ubuntu 20.04/x86_64 and tried this experiment, to make sure
my expectations weren't off-base - nope, works as expected.

So I looted part of the crashkernel= setting from the Ubuntu system
(crashkernel=512M-:192M was theirs, I used 384M-:192M) - no change.
Tried 384M-:256M, and it worked. So I tried theirs verbatim, and it
also worked every time.

So maybe we need different defaults on at least x86_64 systems?

(I specify x86_64 because using 512M-:192M breaks crashkernel more on my
i386 testbeds.)

- Rich

-- System Information:
Debian Release: 10.9
  APT prefers stable-updates
  APT policy: (1000, 'stable-updates'), (1000, 'stable'),
              (900, 'testing-debug'), (900, 'testing'),
              (800, 'unstable-debug'), (800, 'unstable'),
              (500, 'stable-debug'), (500, 'proposed-updates-debug'),
              (1, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 4.19.0-16-amd64 (SMP w/16 CPU cores)
Kernel taint flags: TAINT_PROPRIETARY_MODULE, TAINT_OOT_MODULE, TAINT_UNSIGNED_MODULE
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages kdump-tools depends on:
ii  bsdmainutils           11.1.2+b1
ii  debconf [debconf-2.0]  1.5.71
ii  file                   1:5.35-4+deb10u2
ii  kexec-tools            1:2.0.18-1
ii  linux-base             4.6
ii  lsb-base               10.2019051400
ii  makedumpfile           1:1.6.5-1
ii  ucf                    3.0038+nmu1

Versions of packages kdump-tools recommends:
ii  initramfs-tools-core  0.133+deb10u1

kdump-tools suggests no packages.

-- Configuration Files:
/etc/default/kdump-tools changed [not included]

-- debconf information:
* kdump-tools/use_kdump: true