Bug#1003611: Acknowledgement (systemd: Upgrade from 249.7 to 250.2 seems to have crashed the systemd root process, leaving system unstable)

2022-02-20 Thread Michael Biebl

On Wed, 12 Jan 2022 19:54:15 +0100 Michael Biebl wrote:
> On 12.01.22 at 19:34, Christian Weeks wrote:
> > I have attached the journal from the 10 minutes prior. I was trying to mount a
> > CD in an external CD rom drive at the time. It seems that something went wrong
> > and killed systemd. I apparently didn't notice for another 10 days. /facepalm
> >
> > I have the core file, if you want me to analyze it somehow, just tell me what to
> > do.
>
> To get a useful backtrace, you'll probably need a chroot, where you
> install systemd 249.7-1 + dbgsym packages.
>
> Then run gdb /lib/systemd/systemd
>
> Then run "bt full" to get a backtrace.


Did you have a chance to produce such a backtrace?

Without it, there is practically no chance that this can be further 
investigated.


Michael




Bug#1003611: Acknowledgement (systemd: Upgrade from 249.7 to 250.2 seems to have crashed the systemd root process, leaving system unstable)

2022-01-12 Thread Michael Biebl

On 12.01.22 at 19:34, Christian Weeks wrote:
> I have attached the journal from the 10 minutes prior. I was trying to mount a
> CD in an external CD rom drive at the time. It seems that something went wrong
> and killed systemd. I apparently didn't notice for another 10 days. /facepalm
>
> I have the core file, if you want me to analyze it somehow, just tell me what to
> do.


To get a useful backtrace, you'll probably need a chroot, where you 
install systemd 249.7-1 + dbgsym packages.

Then run gdb /lib/systemd/systemd 

Then run "bt full" to get a backtrace.
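
The two steps above can also be done in one non-interactive gdb run. This is only a minimal sketch: it assumes the core file is the /core mentioned elsewhere in this thread, that it is run inside a chroot where systemd 249.7-1 and the matching dbgsym packages are installed, and the output name backtrace.txt is made up for illustration.

```shell
#!/bin/sh
# Assumptions: both paths are taken from this thread; backtrace.txt is a hypothetical name.
BINARY=/lib/systemd/systemd
CORE=/core
OUT=backtrace.txt

if command -v gdb >/dev/null 2>&1 && [ -r "$CORE" ]; then
    # -batch makes gdb exit after running its commands;
    # -ex "bt full" prints the backtrace with local variables.
    gdb -batch -ex "bt full" "$BINARY" "$CORE" > "$OUT" 2>&1
    echo "backtrace written to $OUT"
else
    echo "gdb or $CORE not available, nothing done" >&2
fi
```

The resulting text file is small enough to attach to the bug report as is.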







Bug#1003611: Acknowledgement (systemd: Upgrade from 249.7 to 250.2 seems to have crashed the systemd root process, leaving system unstable)

2022-01-12 Thread Christian Weeks
I have attached the journal from the 10 minutes prior. I was trying to mount a
CD in an external CD rom drive at the time. It seems that something went wrong
and killed systemd. I apparently didn't notice for another 10 days. /facepalm

I have the core file, if you want me to analyze it somehow, just tell me what to
do.

Christian

On Wed, 2022-01-12 at 19:21 +0100, Michael Biebl wrote:
> 
> On 12.01.22 at 18:20, Christian Weeks wrote:
> > I don't see anything in the journal, I've had a fairly long look. I do not have
> > the coredump utility installed.
> > As I have mentioned, rebooting fixed whatever caused the problem during the
> > upgrade, so I have no idea how I can help you further.
> > 
> > In looking at my running system since, I notice that systemd isn't defaulting to
> > running, so I guess the problem was actually that dbus was failing to activate
> > systemd properly, during the upgrade.
> 
> Once PID 1 has been frozen, you can't reactivate it. dbus tried to talk 
> to it but eventually timed out. dbus is not supposed to "start" PID1.
> 
> The only option to recover from such a scenario is to reboot 
> (forcefully) as you did.
> 
> > I have found a core file, but it's dated from 10 days ago, not today, which is
> > weird. There was no activity on the computer at the time, I believe, and this
> > went unnoticed, perhaps for 10 days?!
> 
> I'd say this observation is correct.
> 
> If you still have the journal, maybe attach the preceding 10 mins 
> before the crash, i.e. all log messages from
> Jan  2 10:00:00 until  Jan  2 10:14:48

Jan 02 10:05:01 cheesypuffs CRON[1425173]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
Jan 02 10:05:01 cheesypuffs CRON[1425174]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
Jan 02 10:05:01 cheesypuffs CRON[1425173]: pam_unix(cron:session): session closed for user root
Jan 02 10:06:59 cheesypuffs NetworkManager[958]:   [1641136019.2397] device (wlp41s0): set-hw-addr: set MAC address to 36:F4:36:8B:48:E4 (scanning)
Jan 02 10:06:59 cheesypuffs NetworkManager[958]:   [1641136019.2890] device (wlp41s0): supplicant interface state: inactive -> disconnected
Jan 02 10:06:59 cheesypuffs NetworkManager[958]:   [1641136019.2890] device (p2p-dev-wlp41s0): supplicant management interface state: inactive -> disconnected
Jan 02 10:06:59 cheesypuffs NetworkManager[958]:   [1641136019.2941] device (wlp41s0): supplicant interface state: disconnected -> inactive
Jan 02 10:06:59 cheesypuffs NetworkManager[958]:   [1641136019.2941] device (p2p-dev-wlp41s0): supplicant management interface state: disconnected -> inactive
Jan 02 10:07:41 cheesypuffs kernel: usb 1-2: USB disconnect, device number 3
Jan 02 10:07:41 cheesypuffs kernel: usb 1-2.4: USB disconnect, device number 9
Jan 02 10:07:41 cheesypuffs kernel: usb 1-2.4.4: USB disconnect, device number 12
Jan 02 10:07:41 cheesypuffs kernel: usb 1-2.4.4.2: USB disconnect, device number 13
Jan 02 10:07:42 cheesypuffs kernel: usb 2-2: USB disconnect, device number 3
Jan 02 10:07:42 cheesypuffs kernel: usb 2-2.4: USB disconnect, device number 4
Jan 02 10:07:42 cheesypuffs kernel: usb 2-2.4.4: USB disconnect, device number 5
Jan 02 10:07:42 cheesypuffs kernel: usb 2-2: new SuperSpeed USB device number 6 using xhci_hcd
Jan 02 10:07:42 cheesypuffs kernel: usb 2-2: New USB device found, idVendor=2109, idProduct=0817, bcdDevice= 3.64
Jan 02 10:07:42 cheesypuffs kernel: usb 2-2: New USB device strings: Mfr=1, Product=2, SerialNumber=0
Jan 02 10:07:42 cheesypuffs kernel: usb 2-2: Product: USB3.0 Hub
Jan 02 10:07:42 cheesypuffs kernel: usb 2-2: Manufacturer: VIA Labs, Inc.
Jan 02 10:07:42 cheesypuffs kernel: hub 2-2:1.0: USB hub found
Jan 02 10:07:42 cheesypuffs kernel: hub 2-2:1.0: 4 ports detected
Jan 02 10:07:42 cheesypuffs upowerd[1429]: treating change event as add on /sys/devices/pci:00/:00:01.2/:20:00.0/:21:08.0/:2a:00.1/usb2/2-2
Jan 02 10:07:43 cheesypuffs kernel: usb 1-2: new high-speed USB device number 14 using xhci_hcd
Jan 02 10:07:43 cheesypuffs kernel: usb 1-2: New USB device found, idVendor=2109, idProduct=2817, bcdDevice= 3.64
Jan 02 10:07:43 cheesypuffs kernel: usb 1-2: New USB device strings: Mfr=1, Product=2, SerialNumber=0
Jan 02 10:07:43 cheesypuffs kernel: usb 1-2: Product: USB2.0 Hub
Jan 02 10:07:43 cheesypuffs kernel: usb 1-2: Manufacturer: VIA Labs, Inc.
Jan 02 10:07:43 cheesypuffs kernel: hub 1-2:1.0: USB hub found
Jan 02 10:07:43 cheesypuffs kernel: hub 1-2:1.0: 4 ports detected
Jan 02 10:07:43 cheesypuffs upowerd[1429]: treating change event as add on /sys/devices/pci:00/:00:01.2/:20:00.0/:21:08.0/:2a:00.1/usb1/1-2
Jan 02 10:07:43 cheesypuffs kernel: usb 2-2.4: new SuperSpeed USB device number 7 using xhci_hcd
Jan 02 10:07:43 cheesypuffs kernel: usb 2-2.4: New USB device found, idVendor=2109, idProduct=0817, bcdDevice=90.23
Jan 02 10:07:43 cheesypuffs kernel: usb 2-2.4: New USB device strings

Bug#1003611: Acknowledgement (systemd: Upgrade from 249.7 to 250.2 seems to have crashed the systemd root process, leaving system unstable)

2022-01-12 Thread Michael Biebl


On 12.01.22 at 18:20, Christian Weeks wrote:
> I don't see anything in the journal, I've had a fairly long look. I do not have
> the coredump utility installed.
> As I have mentioned, rebooting fixed whatever caused the problem during the
> upgrade, so I have no idea how I can help you further.
>
> In looking at my running system since, I notice that systemd isn't defaulting to
> running, so I guess the problem was actually that dbus was failing to activate
> systemd properly, during the upgrade.


Once PID 1 has been frozen, you can't reactivate it. dbus tried to talk 
to it but eventually timed out. dbus is not supposed to "start" PID1.


The only option to recover from such a scenario is to reboot 
(forcefully) as you did.



> I have found a core file, but it's dated from 10 days ago, not today, which is
> weird. There was no activity on the computer at the time, I believe, and this
> went unnoticed, perhaps for 10 days?!


I'd say this observation is correct.

If you still have the journal, maybe attach the preceding 10 mins 
before the crash, i.e. all log messages from

Jan  2 10:00:00 until  Jan  2 10:14:48
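
One way to pull exactly that window out of the journal is sketched below. It assumes the persistent journal still covers that day, takes the year 2022 from the dates of this thread, and uses a made-up output file name.

```shell
#!/bin/sh
# Assumptions: the year is inferred from the thread; the output name is hypothetical.
SINCE="2022-01-02 10:00:00"
UNTIL="2022-01-02 10:14:48"
OUT=journal-before-crash.txt

if command -v journalctl >/dev/null 2>&1; then
    # --since/--until bound the time range; --no-pager writes plain output.
    journalctl --since "$SINCE" --until "$UNTIL" --no-pager > "$OUT"
    echo "journal excerpt written to $OUT"
else
    echo "journalctl not found, nothing done" >&2
fi
```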




Bug#1003611: Acknowledgement (systemd: Upgrade from 249.7 to 250.2 seems to have crashed the systemd root process, leaving system unstable)

2022-01-12 Thread Michael Biebl


Control: found -1 249.7-1
Control: severity -1 important

On 12.01.22 at 18:20, Christian Weeks wrote:
> I don't see anything in the journal, I've had a fairly long look. I do not have
> the coredump utility installed.
> As I have mentioned, rebooting fixed whatever caused the problem during the
> upgrade, so I have no idea how I can help you further.
>
> In looking at my running system since, I notice that systemd isn't defaulting to
> running, so I guess the problem was actually that dbus was failing to activate
> systemd properly, during the upgrade.
>
> I have found a core file, but it's dated from 10 days ago, not today, which is
> weird. There was no activity on the computer at the time, I believe, and this
> went unnoticed, perhaps for 10 days?!


As said, systemd freezes execution when it crashes.
If PID 1 actually crashed, the kernel would panic and you'd notice :-)


> Jan  2 10:14:48 cheesypuffs kernel: [336844.954825] systemd[1]: segfault at 18 ip 55bd29c926ea sp 7ffdcd0c56c8 error 4 in systemd[55bd29c38000+d9000]
>
> ls -l /core
> -rw--- 1 root root 22482944 Jan  2 10:14 /core
>
> I can share this core file with you if you wish (how?), though perhaps it's not
> so related to the upgrade anymore?


Given that the timestamps match, it is almost certain that the core file 
belongs to the segfault.
It also shows that PID 1 crashing is not actually affecting the running 
services. As long as you don't interact with systemd (e.g. via 
systemctl), your system should continue to run fine.
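
As a side note, the kernel segfault line quoted above can be decoded a little by hand. In that line, "at 18" is the faulting address, "ip" is the instruction pointer, and the bracket gives the start and size of the systemd mapping. On x86, "error 4" denotes a user-mode read of a page that is not present, which together with the tiny address 0x18 is at least consistent with a NULL pointer dereference. The sketch below only does the arithmetic; the variable names are made up.

```shell
#!/bin/sh
# Values copied from the kernel message in this thread; names are hypothetical.
IP=0x55bd29c926ea      # instruction pointer at the time of the fault
BASE=0x55bd29c38000    # start of the systemd mapping
SIZE=0xd9000           # size of that mapping

# Offset of the faulting instruction relative to the start of the mapping.
OFFSET=$(printf '0x%x' $(($IP - $BASE)))
echo "offset into mapping: $OFFSET"

# Sanity check: the instruction pointer should lie inside the mapping.
if [ $(($IP - $BASE)) -lt $(($SIZE)) ]; then
    echo "ip lies inside the systemd mapping"
fi
```

This gives an offset of 0x5a6ea. It is relative to the start of the mapping, not necessarily to the start of the file, so a full gdb backtrace from the core file remains the reliable way to locate the crash.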

I'm thus downgrading the severity.
As it is not actually related to the upgrade, I'm marking it as found in 
249.7-1.


You can attach the core file to the bug report (gzipped) or mail it to 
me directly. I will try to see if a backtrace reveals something.
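
Compressing the core file before attaching it could look like the following sketch. The path /core is the one from this thread; the output name core.gz is made up.

```shell
#!/bin/sh
# Assumptions: /core is the file from this thread; core.gz is a hypothetical name.
CORE=/core
OUT=core.gz

if [ -r "$CORE" ]; then
    # -c writes to stdout so the original core file is left untouched.
    gzip -9 -c "$CORE" > "$OUT"
    ls -l "$OUT"
else
    echo "$CORE not readable, nothing compressed" >&2
fi
```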


Michael




Bug#1003611: Acknowledgement (systemd: Upgrade from 249.7 to 250.2 seems to have crashed the systemd root process, leaving system unstable)

2022-01-12 Thread Christian Weeks
I don't see anything in the journal, I've had a fairly long look. I do not have
the coredump utility installed.
As I have mentioned, rebooting fixed whatever caused the problem during the
upgrade, so I have no idea how I can help you further.

In looking at my running system since, I notice that systemd isn't defaulting to
running, so I guess the problem was actually that dbus was failing to activate
systemd properly, during the upgrade.

I have found a core file, but it's dated from 10 days ago, not today, which is
weird. There was no activity on the computer at the time, I believe, and this
went unnoticed, perhaps for 10 days?!

Jan  2 10:14:48 cheesypuffs kernel: [336844.954825] systemd[1]: segfault at 18 ip 55bd29c926ea sp 7ffdcd0c56c8 error 4 in systemd[55bd29c38000+d9000]


ls -l /core
-rw--- 1 root root 22482944 Jan  2 10:14 /core

I can share this core file with you if you wish (how?), though perhaps it's not
so related to the upgrade anymore?

Christian
On Wed, 2022-01-12 at 18:01 +0100, Michael Biebl wrote:
> Control: tags -1 + moreinfo
> 
> Hello,
> 
> systemd freezes execution when it crashes (you should see a 
> corresponding log message in the journal).
> 
> For this bug report to be actionable, we will need at the very least a 
> backtrace of the crash.
> In case you had systemd-coredump installed, coredumpctl should list the crash.
> Or maybe you still have a core file in /.
> 
> 
> Regards,
> Michael
> 



Bug#1003611: Acknowledgement (systemd: Upgrade from 249.7 to 250.2 seems to have crashed the systemd root process, leaving system unstable)

2022-01-12 Thread Michael Biebl

Control: tags -1 + moreinfo

Hello,

systemd freezes execution when it crashes (you should see a 
corresponding log message in the journal).


For this bug report to be actionable, we will need at the very least a 
backtrace of the crash.

In case you had systemd-coredump installed, coredumpctl should list the crash.
Or maybe you still have a core file in /.
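
Both checks can be sketched as follows. This assumes that matching on the command name "systemd" is specific enough on the affected machine; it may also list crashes of other processes with that name.

```shell
#!/bin/sh
# Assumption: COMM is used as the coredumpctl match; adjust if it is too broad.
COMM=systemd

if command -v coredumpctl >/dev/null 2>&1; then
    # List recorded core dumps for the given command name, then show details.
    coredumpctl list "$COMM" --no-pager
    coredumpctl info "$COMM" --no-pager
else
    # Without systemd-coredump, look for a plain core file in the root directory.
    ls -l /core 2>/dev/null || echo "no core file in /"
fi
```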


Regards,
Michael





Bug#1003611: Acknowledgement (systemd: Upgrade from 249.7 to 250.2 seems to have crashed the systemd root process, leaving system unstable)

2022-01-12 Thread Christian Weeks
The reboot fixed the issue - I now have a working computer again, though getting
to a reboot was a bit painful.

> systemctl reboot
Failed to reboot system via logind: Connection timed out
Failed to start reboot.target: Connection timed out
See system logs and 'systemctl status reboot.target' for details.
It is possible to perform action directly, see discussion of --force --force in
man:systemctl(1).
> systemctl reboot --force
Failed to execute operation: Failed to activate service
'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
It is possible to perform action directly, see discussion of --force --force in
man:systemctl(1).
> systemctl reboot --force --force 



On Wed, 2022-01-12 at 15:36 +0000, Debian Bug Tracking System wrote:
> Thank you for filing a new Bug report with Debian.
> 
> You can follow progress on this Bug here: 1003611:
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1003611.
> 
> This is an automatically generated reply to let you know your message
> has been received.
> 
> Your message is being forwarded to the package maintainers and other
> interested parties for their attention; they will reply in due course.
> 
> Your message has been sent to the package maintainer(s):
>  Debian systemd Maintainers 
> 
> If you wish to submit further information on this problem, please
> send it to 1003...@bugs.debian.org.
> 
> Please do not send mail to ow...@bugs.debian.org unless you wish
> to report a problem with the Bug-tracking system.
>