Bug#808151: systemd: failed to start remount root and kernel file system
Dear Michael Biebl, you wrote: > Here is a more complete log excerpt for v228 (full log attached) > > > Dez 20 01:27:42 debian systemd[1]: -.mount: Changed dead -> mounted ... > > So the best guess is that the timing in v228 changed a little (some code > paths became faster). This would confirm Frank's findings that enabling > verbose output (which slows down boot a bit) made it less likely to hit. > > Martin has been working/debugging the tentative stuff in the past, so > maybe he has some insights here. > > We should probably also involve upstream at some point. OK, thanks for the help and (for me at least) final conclusion. For me personally the problem has been solved: for the time being I'm happy with 227, and I'm sure that the problem will soon be fixed. Thanks again for helping along! Cheers, -- Frank B. Brokken Center for Information Technology, University of Groningen (+31) 50 363 9281 Public PGP key: http://pgp.surfnet.nl Key Fingerprint: DF32 13DE B156 7732 E65E 3B4D 7DB2 A8BE EAE4 D8AA signature.asc Description: PGP signature ___ Pkg-systemd-maintainers mailing list Pkg-systemd-maintainers@lists.alioth.debian.org http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/pkg-systemd-maintainers
Bug#808151: systemd: failed to start remount root and kernel file system
Dear Michael Biebl, you wrote: > > This information is available at https://www.icce.rug.nl/systemd in the > > files > > initramfs.debug and alb. > > Hm, unfortunately the journal dump is incomplete again. I have no idea why Remarkable. I made it available the way I got it, so that's apparently what there is. > > booting procedure. You're sure it can't be some timing problem? > > Well, what kind of timing problem do you have in mind? Don't know: I didn't design systemd. But if it's doing things in parallel then maybe on newer, faster, computers things might have completed, like remounting /usr rw before it's actually used. A race condition might then explain why the problem doesn't always show itself, and why chances of failure are reduced when more time is spent writing debug/verbose messages. > So far, the only thing I can say for sure looking at the initramfs log, > is that /usr has been mounted successfully in the initramfs. > > "Something" apparently causes /usr to be unmounted later on. Which part > and why that is, is not clear yet. > > Do you have any (custom) init scripts in /etc/rcS.d/ which fiddle around > with mount settings, run telinit or stuff like that? Nope. > I'm running out of ideas, tbh. Well, that's already a *lot* more than I could offer myself :-) But fortunately (for me, but hard to fix, I realize), the problem doesn't emerge all the time. If rebooting every now and then gets me a running system, then so be it. The FailureAction=reboot-force entry in systemd-remount-fs.service already has proven to be my friend :-) > If you suspect the remount service to be the cause for this, let's > output the mounts before and after > For that run > $ systemctl edit systemd-remount-fs.service When I issue that command I get the reply Warning: systemd-remount-fs.service changed on disk. Run 'systemctl daemon-reload' to reload units. I guess the warning is obvious as I edited the file /lib/systemd/system/local-fs.target.wants/systemd-remount-fs.service to prevent the reboot action. So I did as advised and reran the command, but got an empty file in my editor, while the following message was shown after ending the editor: Editing "/etc/systemd/system/systemd-remount-fs.service.d/override.conf" canceled: temporary file is empty. > Then add > [Service] > ExecStartPre=/bin/sh -c 'echo "before rootfs remount"; findmnt' > ExecStartPost=/bin/sh -c 'echo "after rootfs remount"; findmnt' > > Reboot and attach the journal log again. Instead of running the systemctl command I edited the file /lib/systemd/system/local-fs.target.wants/systemd-remount-fs.service and added the lines you suggested. My next e-mail is about the contents of journal log. Thereafter I'll try to downgrade to the previous version to see what happens then. -- Frank B. Brokken Center for Information Technology, University of Groningen (+31) 50 363 9281 Public PGP key: http://pgp.surfnet.nl Key Fingerprint: DF32 13DE B156 7732 E65E 3B4D 7DB2 A8BE EAE4 D8AA signature.asc Description: PGP signature ___ Pkg-systemd-maintainers mailing list Pkg-systemd-maintainers@lists.alioth.debian.org http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/pkg-systemd-maintainers
Bug#808151: systemd: failed to start remount root and kernel file system
Hi Michael, I downgraded to the following versions of the following packages: libpam-systemd_227-2_i386.deb libudev1_227-2_i386.deb libsystemd-dev_227-2_i386.deb systemd-sysv_227-2_i386.deb libsystemd0_227-2_i386.deb systemd_227-2_i386.deb libudev-dev_227-2_i386.deb udev_227-2_i386.deb Thereafter I rebooted several times without encountering any problems. Also with reduced output (grub's option 'quiet') no problems were encountered. Cheers, -- Frank B. Brokken Center for Information Technology, University of Groningen (+31) 50 363 9281 Public PGP key: http://pgp.surfnet.nl Key Fingerprint: DF32 13DE B156 7732 E65E 3B4D 7DB2 A8BE EAE4 D8AA signature.asc Description: PGP signature ___ Pkg-systemd-maintainers mailing list Pkg-systemd-maintainers@lists.alioth.debian.org http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/pkg-systemd-maintainers
Bug#808151: systemd: failed to start remount root and kernel file system
Dear Michael Biebl, you wrote: > Am 18.12.2015 um 15:59 schrieb Frank B. Brokken: > > Is there a way to determine that? What I do to upgrade the system is run > > 'aptitude update' and then 'aptitude upgrade'. Is there a log somewhere that > > tells me what packages and versions were updated at what moments in time? > > /var/log/dpkg.log is a low-level log. > > and then there is one for aptitude at /var/log/aptitude Thanks: I made the logs available at https://www.icce.rug.nl/systemd > ... > Btw, you mentioned that this happened after an upgrade. Which previous > version did you run which worked fine? Which other packages were > upgraded along with it? Tue, Dec 1 2015 14:07:23 +0100: the aptitude log shows an upgrade from systemd 227-2 to 228-2 dpkg log: 2015-12-01 14:08:42 upgrade systemd:i386 227-2 228-2 dpkg log: 2015-12-03 08:30:01 upgrade systemd:i386 228-2 215-17+deb8u2 Thu, Dec 3 2015 08:31:37 +0100 the aptitude log shows an upgrade from systemd 215-17+deb8u2 -> 228-2 dpkg log: 2015-12-03 08:31:40 upgrade systemd:i386 215-17+deb8u2 228-2 and then, recently, by me trying to downgrade: dpkg log: 2015-12-17 12:59:12 upgrade systemd:i386 228-2 228-2 dpkg log: 2015-12-18 16:15:37 upgrade systemd:i386 228-2 215-17+deb8u2 dpkg log: 2015-12-18 16:17:11 upgrade systemd:i386 215-17+deb8u2 228-2 Before Dec 1 no updates were recorded for systemd or udev, and until the upgrades early December everything ran fine. > If you downgrade systemd/udev, does the problem go away? As I feared, downgrading is difficult because of the many reverse dependencies. I looked at ftp://ftp.de.debian.org/debian/pool/main/s/systemd/ which is the mirror I usually use for earlier .deb files, but the one before 228-2 is 215-17, and 227-2 is only available as source archives and not AFAICS as .deb packages. I'll add the debug entry next (cf. your mail from Date: Fri, 18 Dec 2015 03:15:15 +0100) and let you know the results. -- Frank B. Brokken Center for Information Technology, University of Groningen (+31) 50 363 9281 Public PGP key: http://pgp.surfnet.nl Key Fingerprint: DF32 13DE B156 7732 E65E 3B4D 7DB2 A8BE EAE4 D8AA signature.asc Description: PGP signature ___ Pkg-systemd-maintainers mailing list Pkg-systemd-maintainers@lists.alioth.debian.org http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/pkg-systemd-maintainers
Bug#808151: systemd: failed to start remount root and kernel file system
Dear Michael Biebl, you wrote: > Well, /usr is mounted by the initramfs these days. So it should already > be available when systemd is started. If that fails, this is a bug which > needs to be addressed by initramfs-tools (or one of the hook scripts). > It wasn't clear so far that /usr hasn't been mounted at all. > > Is /usr on LVM, RAID, etc? No, nothing like that. And for what it's worth: the problem only appeared after I upgraded systemd last week. The laptop has nothing special in its setup, and has been working perfectly for years, until last week when systemd was renwed. I think in my bugreport I mentioned the problem that /usr wasn't mounted. In your next reply you wrote: > I'm a bit confused by those logs. They show that sda5 have been mounted. > > Dec 17 15:44:29 localhost.localdomain kernel: EXT4-fs (sda5): mounting > ext3 file system using the ext4 subsystem > Dec 17 15:44:29 localhost.localdomain kernel: EXT4-fs (sda5): mounted > filesystem with ordered data mode. Opts: (null) > > I figure /dev/sda5 is your /usr partition? Just to be sure, please > attach ls -la /dev/disk/by-uuid/ I seem to remember that message, in particular the Opts: (null) remark, and I think at that point /usr was mounted by me fron the systemd shell. Also, couldn't it be that initramfs *did* do the mount, but that remounting it rw, als reported in the error message is the problem? Also, to me it appears remarkable that by removing the 'quiet' from the kernel parameters, so that things go a bit slower because of the extra messages that are displayed the frequency of failing boot procedures is greatly diminished. I'm considering trying to add 'verbose' to grub's parameters to see if that produces more output and maybe further reduces the frequency, but I haven't had the time to do that yet. Something on the TODO list :-) Anyway, here's the ls -la output: total 0 drwxr-xr-x 2 root root 200 Dec 18 13:05 ./ drwxr-xr-x 5 root root 100 Dec 18 13:02 ../ lrwxrwxrwx 1 root root 10 Dec 18 13:02 04b82e8b-f871-4abb-978a-44ae44c5d1f7 -> ../../sda1 lrwxrwxrwx 1 root root 10 Dec 18 13:02 595bcdbf-6436-45a7-99d2-297a3dd85930 -> ../../sda6 lrwxrwxrwx 1 root root 10 Dec 18 13:02 693c71eb-d411-4ee0-a1b3-c577df02e01b -> ../../sda9 lrwxrwxrwx 1 root root 10 Dec 18 13:02 6bcb2a05-33c9-402b-8093-e6a35ffd7aa1 -> ../../sda8 lrwxrwxrwx 1 root root 11 Dec 18 13:05 82e52787-6072-4af9-a5e6-2d88c365e62b -> ../../loop0 lrwxrwxrwx 1 root root 10 Dec 18 13:02 c5591eff-0a6c-4310-bb11-7d5535f7da7b -> ../../sda7 lrwxrwxrwx 1 root root 10 Dec 18 13:05 e289e4ad-be1d-42a8-9b38-f4dad9473520 -> ../../dm-0 lrwxrwxrwx 1 root root 10 Dec 18 13:02 ea8202e7-4564-424c-af70-a6a640fafb65 -> ../../sda5 ~ I'll do the 'debug' addition later this weekend, like you requested. Finally, you asked: > Do you have any custom udev rules in /etc/udev/rules.d? I don't think so, looking at the time stamps nothing has been changed there for years: total 10 drwxr-xr-x 2 root root 3072 Dec 6 2014 ./ drwxr-xr-x 4 root root 1024 Dec 3 08:34 ../ -rw-r--r-- 1 root root 115 Dec 6 2014 70-automount.rules -rw-r--r-- 1 root root 3841 Dec 6 2014 70-persistent-cd.rules -rw-r--r-- 1 root root 895 Feb 26 2013 70-persistent-net.rules And I definitely didn't recently change there any files, so again: the problem appeared out of the blue since last weeks upgrade. I hope the above gives you at least some additional info. As I wrote: I'll do the 'debug' addition tomorrow. Cheers, -- Frank B. Brokken Center for Information Technology, University of Groningen (+31) 50 363 9281 Public PGP key: http://pgp.surfnet.nl Key Fingerprint: DF32 13DE B156 7732 E65E 3B4D 7DB2 A8BE EAE4 D8AA ___ Pkg-systemd-maintainers mailing list Pkg-systemd-maintainers@lists.alioth.debian.org http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/pkg-systemd-maintainers
Bug#808151: systemd: failed to start remount root and kernel file system
Dear Michael Biebl, you wrote: > The verbose debug log from the initramfs and systemd can maybe tell us > more. But looking at https://www.icce.rug.nl/systemd/journalctl, the > sda5 mount happens at line 773, the first errors start at line 785 and > the remount is at line 802. > So it looks like /usr is not mounted at the time > systemd-remount-fs.service is run. Right. That's consistent with the impression I got from the error messages. *Why* that is true, however, eludes me. > > What's also curious is, that the log doesn't seem to be complete. > Usually systemd's first log line is something like > > > Dez 18 07:03:47 pluto systemd[1]: systemd 228 running in system mode. (+PAM > > +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP > > +GCRYPT +GNUTLS +ACL +X > > Dez 18 07:03:47 pluto systemd[1]: Detected architecture x86-64. > > Those early boot messages seem to be missing completely. Well, I didn't edit anything. The information I generated is passed to you the way it was made available by the various programs/commands. > Btw, you mentioned that this happened after an upgrade. Which previous > version did you run which worked fine? Which other packages were > upgraded along with it? Is there a way to determine that? What I do to upgrade the system is run 'aptitude update' and then 'aptitude upgrade'. Is there a log somewhere that tells me what packages and versions were updated at what moments in time? > If you downgrade systemd/udev, does the problem go away? I thought about doing that, but was afraid for an avalanche of forced downgrades of packages that might now depend on the most recent udev and systemd versions. But I'll give it a try asap and let you know the results. -- Frank B. Brokken Center for Information Technology, University of Groningen (+31) 50 363 9281 Public PGP key: http://pgp.surfnet.nl Key Fingerprint: DF32 13DE B156 7732 E65E 3B4D 7DB2 A8BE EAE4 D8AA ___ Pkg-systemd-maintainers mailing list Pkg-systemd-maintainers@lists.alioth.debian.org http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/pkg-systemd-maintainers
Bug#808151: systemd: failed to start remount root and kernel file system
Dear Michael Biebl, you wrote: > What happens afterwards? Are you dropped into the rescue shell? Afterwards (i.e., after the initial failure message) the system tries to continue booting, but shows lots of failure messages, eventually grinding to a halt. No reboot (e.g. ctrl-alt-del) is possible and there's no rescue shell. > If not, try to enable the debug shell by adding "systemd.debug-shell" to > the kernel command line. This will give you a root shell on tty9. Unfortunately, it doesn't, since the system halts (I first removed the automatic reboot on failure). However, during the process I observed that setting systemd.debug-shell and removing the default 'quiet' specification from grub's /etc/default/grub (so, now it specifies: GRUB_CMDLINE_LINUX_DEFAULT="systemd.debug-shell" greatly reduces the chances of a failing boot. Not completely, but greatly. Still, when rebooting fails there's just the plain halt, w/o a debug shell. Since removing the quiet also produces a lot more output on the screen, might my problem not simply be some timing problem? -- Frank B. Brokken Center for Information Technology, University of Groningen (+31) 50 363 9281 Public PGP key: http://pgp.surfnet.nl Key Fingerprint: DF32 13DE B156 7732 E65E 3B4D 7DB2 A8BE EAE4 D8AA signature.asc Description: PGP signature ___ Pkg-systemd-maintainers mailing list Pkg-systemd-maintainers@lists.alioth.debian.org http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/pkg-systemd-maintainers
Bug#808151: systemd: failed to start remount root and kernel file system
Dear Michael Biebl, you wrote: > Am 17.12.2015 um 13:46 schrieb Frank B. Brokken: > > halt. No reboot (e.g. ctrl-alt-del) is possible and there's no rescue > > shell> > What exactly do you mean with halt? The systems completely locks up so > you can't use the keyboard and switch to tty9? No, that's not what happens. I mean that doing a reboot using ctrl-alt-del isn't possible. Switching VTs is no problem, but except for VT1 nothing was being shown. But maybe I overlooked things when I sent you the previous reply: see below. > That would sound like a kernel problem. > > might my problem not simply be some timing problem? > > Can you make a screenshot or a video from the boot process with "quiet" > removed from the kernel command line. I did. Not only that, I also tried to reboot again and this time I was able to run the commands you asked before from tty9: systemctl status systemd-analyze dump journalctl -alb This time the debug shell prompt was available at tty9, although booting failed. And in line with my previous findings, systemd-analyze and journalctl weren't available, as they live in /usr/bin, and /usr hadn't been mounted. But after mounting /usr from tty9 and then using the mount command systemd-analyze and journalctl were available, so I also have the output from those commands for you. The output, and the mp4 movie I made during the booting process are a bit too large for the e-mail, but they are available for download/inspection at https://www.icce.rug.nl/systemd/ Cheers, -- Frank B. Brokken Center for Information Technology, University of Groningen (+31) 50 363 9281 Public PGP key: http://pgp.surfnet.nl Key Fingerprint: DF32 13DE B156 7732 E65E 3B4D 7DB2 A8BE EAE4 D8AA signature.asc Description: PGP signature ___ Pkg-systemd-maintainers mailing list Pkg-systemd-maintainers@lists.alioth.debian.org http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/pkg-systemd-maintainers