Your message dated Fri, 29 Jan 2016 20:29:28 +0100
with message-id <[email protected]>
and subject line Re: Bug#761257: systemd: disrupts hugepages support
has caused the Debian Bug report #761257,
regarding Work around kernel bug with mounting hugetlbfs twice?
to be marked as done.
This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.
(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact [email protected]
immediately.)
--
761257: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=761257
Debian Bug Tracking System
Contact [email protected] with problems
--- Begin Message ---
Package: systemd
Version: 208-8
Severity: normal
Dear Maintainer,
We are developing Intel DPDK applications on several Debian-powered
servers. Those applications make use of 1GB huge pages, allocated
through kernel parameters in /etc/default/grub, and an entry in
/etc/fstab to mount them in /mnt/huge_1GB.
After some upgrades, our applications stopped working on most servers
due to hugepages being unavailable. They still appeared in
/proc/meminfo but were neither free nor reserved.
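A quick way to see this state is to pull the huge page counters out of
/proc/meminfo. The helper below is only a sketch: the function name
hugepage_counters is hypothetical, and the optional file argument is an
assumption added so the function can be tried against a saved copy.

```shell
# Print the huge page counters from a meminfo-style file.
# Defaults to /proc/meminfo when no file is given.
# hugepage_counters is a hypothetical name used for illustration only.
hugepage_counters() {
    grep -E '^(HugePages_(Total|Free|Rsvd|Surp)|Hugepagesize):' "${1:-/proc/meminfo}"
}
```

Running hugepage_counters with no argument on an affected server shows
the counters referred to above.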
After hours of debugging (we had also updated Intel DPDK, so we thought
that was the culprit), we noticed that the difference between the failing
servers and the ones still working was that systemd was running as init
on the failing ones and not on the working ones.
We tried uninstalling systemd-sysv (installing sysvinit-core and
systemd-shim), rebooting, and then it worked as before.
After some investigation, it looks like systemd, when run as init, mounts
the hugepages in /dev/hugepages (IMHO, an unexpected place for a mount
point) before they are mounted again on /mnt/huge_1GB as per fstab. It
looks like hugepages won't work when mounted twice.
A likely culprit is /lib/systemd/system/dev-hugepages.mount.
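To see whether hugetlbfs is mounted more than once, the mount table can
be filtered by filesystem type. This is a minimal sketch: the function
name hugetlbfs_mounts is hypothetical, and the optional file argument is
an assumption that lets it run against a saved copy of /proc/mounts.

```shell
# List every hugetlbfs mount point together with its mount options,
# reading a mounts-style file (default: /proc/mounts).
# Fields in that file: device, mount point, type, options, dump, pass.
# hugetlbfs_mounts is a hypothetical name used for illustration only.
hugetlbfs_mounts() {
    awk '$3 == "hugetlbfs" { print $2, $4 }' "${1:-/proc/mounts}"
}
```

Two lines of output, one for /dev/hugepages and one for /mnt/huge_1GB,
would correspond to the situation described above.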
The workaround looks trivial (remove fstab entry and link /dev/hugepages
to /mnt/huge_1GB), but I still have the feeling this should not have
happened in the first place, hence this bug report.
I would expect either (in order of preference):
1) systemd *not* messing with the existing hugepages setup;
2) being warned when installing systemd-sysv that systemd handles
hugepages differently (especially when I have hugepages entries in my
fstab).
-- Package-specific info:
-- System Information:
Debian Release: jessie/sid
APT prefers testing-updates
APT policy: (500, 'testing-updates'), (500, 'testing'), (500, 'stable')
Architecture: amd64 (x86_64)
Kernel: Linux 3.14-2-amd64 (SMP w/12 CPU cores)
Locale: LANG=en_AU.UTF-8, LC_CTYPE=en_AU.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Versions of packages systemd depends on:
ii acl 2.2.52-1.1
ii adduser 3.113+nmu3
ii initscripts 2.88dsf-53.4
ii libacl1 2.2.52-1.1
ii libaudit1 1:2.3.7-1
ii libblkid1 2.20.1-5.8
ii libc6 2.19-10
ii libcap2 1:2.24-4
ii libcap2-bin 1:2.24-4
ii libcryptsetup4 2:1.6.6-1
ii libdbus-1-3 1.8.6-2
ii libgcrypt11 1.5.4-3
ii libkmod2 18-1
ii liblzma5 5.1.1alpha+20120614-2
ii libpam0g 1.1.8-3.1
ii libselinux1 2.3-2
ii libsystemd-daemon0 208-8
ii libsystemd-journal0 208-8
ii libsystemd-login0 208-8
ii libudev1 208-8
ii libwrap0 7.6.q-25
ii sysv-rc 2.88dsf-53.4
ii udev 208-8
ii util-linux 2.20.1-5.8
Versions of packages systemd recommends:
ii libpam-systemd 208-8
Versions of packages systemd suggests:
pn systemd-ui <none>
-- no debconf information
0 overridden configuration files found.
==>
/var/lib/systemd/deb-systemd-helper-enabled/dbus-org.freedesktop.nm-dispatcher.service
<==
==>
/var/lib/systemd/deb-systemd-helper-enabled/bluetooth.target.wants/bluetooth.service
<==
==> /var/lib/systemd/deb-systemd-helper-enabled/dbus-org.bluez.service <==
==> /var/lib/systemd/deb-systemd-helper-enabled/acpid.service.dsh-also <==
/etc/systemd/system/multi-user.target.wants/acpid.service
==> /var/lib/systemd/deb-systemd-helper-enabled/avahi-daemon.socket.dsh-also <==
/etc/systemd/system/sockets.target.wants/avahi-daemon.socket
==>
/var/lib/systemd/deb-systemd-helper-enabled/multi-user.target.wants/atd.service
<==
==>
/var/lib/systemd/deb-systemd-helper-enabled/multi-user.target.wants/anacron.service
<==
==>
/var/lib/systemd/deb-systemd-helper-enabled/multi-user.target.wants/cron.service
<==
==>
/var/lib/systemd/deb-systemd-helper-enabled/multi-user.target.wants/avahi-daemon.service
<==
==>
/var/lib/systemd/deb-systemd-helper-enabled/multi-user.target.wants/rsyslog.service
<==
==>
/var/lib/systemd/deb-systemd-helper-enabled/multi-user.target.wants/ssh.service
<==
==>
/var/lib/systemd/deb-systemd-helper-enabled/multi-user.target.wants/binfmt-support.service
<==
==>
/var/lib/systemd/deb-systemd-helper-enabled/multi-user.target.wants/lm-sensors.service
<==
==>
/var/lib/systemd/deb-systemd-helper-enabled/multi-user.target.wants/pppd-dns.service
<==
==>
/var/lib/systemd/deb-systemd-helper-enabled/multi-user.target.wants/NetworkManager.service
<==
==>
/var/lib/systemd/deb-systemd-helper-enabled/multi-user.target.wants/ModemManager.service
<==
==> /var/lib/systemd/deb-systemd-helper-enabled/lvm2-lvmetad.socket.dsh-also <==
/etc/systemd/system/sockets.target.wants/lvm2-lvmetad.socket
/etc/systemd/system/sysinit.target.wants/lvm2-lvmetad.socket
==> /var/lib/systemd/deb-systemd-helper-enabled/lm-sensors.service.dsh-also <==
/etc/systemd/system/multi-user.target.wants/lm-sensors.service
==>
/var/lib/systemd/deb-systemd-helper-enabled/sysinit.target.wants/lvm2-lvmetad.socket
<==
==> /var/lib/systemd/deb-systemd-helper-enabled/binfmt-support.service.dsh-also
<==
/etc/systemd/system/multi-user.target.wants/binfmt-support.service
==> /var/lib/systemd/deb-systemd-helper-enabled/pppd-dns.service.dsh-also <==
/etc/systemd/system/multi-user.target.wants/pppd-dns.service
==> /var/lib/systemd/deb-systemd-helper-enabled/cgproxy.service.dsh-also <==
/etc/systemd/system/multi-user.target.wants/cgproxy.service
==>
/var/lib/systemd/deb-systemd-helper-enabled/lvm2-activation.service.dsh-also <==
/etc/systemd/system/local-fs.target.wants/lvm2-activation.service
==>
/var/lib/systemd/deb-systemd-helper-enabled/sockets.target.wants/lvm2-lvmetad.socket
<==
==>
/var/lib/systemd/deb-systemd-helper-enabled/sockets.target.wants/avahi-daemon.socket
<==
==>
/var/lib/systemd/deb-systemd-helper-enabled/sockets.target.wants/acpid.socket
<==
==>
/var/lib/systemd/deb-systemd-helper-enabled/accounts-daemon.service.dsh-also <==
/etc/systemd/system/graphical.target.wants/accounts-daemon.service
==> /var/lib/systemd/deb-systemd-helper-enabled/sshd.service <==
==> /var/lib/systemd/deb-systemd-helper-enabled/NetworkManager.service.dsh-also
<==
/etc/systemd/system/multi-user.target.wants/NetworkManager.service
/etc/systemd/system/dbus-org.freedesktop.NetworkManager.service
/etc/systemd/system/dbus-org.freedesktop.nm-dispatcher.service
==>
/var/lib/systemd/deb-systemd-helper-enabled/dbus-org.freedesktop.NetworkManager.service
<==
==> /var/lib/systemd/deb-systemd-helper-enabled/bluetooth.service.dsh-also <==
/etc/systemd/system/bluetooth.target.wants/bluetooth.service
/etc/systemd/system/dbus-org.bluez.service
==> /var/lib/systemd/deb-systemd-helper-enabled/anacron.service.dsh-also <==
/etc/systemd/system/multi-user.target.wants/anacron.service
==> /var/lib/systemd/deb-systemd-helper-enabled/ssh.service.dsh-also <==
/etc/systemd/system/multi-user.target.wants/ssh.service
/etc/systemd/system/sshd.service
==>
/var/lib/systemd/deb-systemd-helper-enabled/NetworkManager-wait-online.service.dsh-also
<==
/etc/systemd/system/multi-user.target.wants/NetworkManager-wait-online.service
==> /var/lib/systemd/deb-systemd-helper-enabled/syslog.service <==
==>
/var/lib/systemd/deb-systemd-helper-enabled/local-fs.target.wants/lvm2-activation-early.service
<==
==>
/var/lib/systemd/deb-systemd-helper-enabled/local-fs.target.wants/lvm2-activation.service
<==
==> /var/lib/systemd/deb-systemd-helper-enabled/atd.service.dsh-also <==
/etc/systemd/system/multi-user.target.wants/atd.service
==> /var/lib/systemd/deb-systemd-helper-enabled/ModemManager.service.dsh-also
<==
/etc/systemd/system/multi-user.target.wants/ModemManager.service
/etc/systemd/system/dbus-org.freedesktop.ModemManager1.service
==> /var/lib/systemd/deb-systemd-helper-enabled/rsyslog.service.dsh-also <==
/etc/systemd/system/multi-user.target.wants/rsyslog.service
/etc/systemd/system/syslog.service
==>
/var/lib/systemd/deb-systemd-helper-enabled/graphical.target.wants/accounts-daemon.service
<==
==> /var/lib/systemd/deb-systemd-helper-enabled/cron.service.dsh-also <==
/etc/systemd/system/multi-user.target.wants/cron.service
==> /var/lib/systemd/deb-systemd-helper-enabled/acpid.socket.dsh-also <==
/etc/systemd/system/sockets.target.wants/acpid.socket
==> /var/lib/systemd/deb-systemd-helper-enabled/avahi-daemon.service.dsh-also
<==
/etc/systemd/system/multi-user.target.wants/avahi-daemon.service
/etc/systemd/system/sockets.target.wants/avahi-daemon.socket
/etc/systemd/system/dbus-org.freedesktop.Avahi.service
==>
/var/lib/systemd/deb-systemd-helper-enabled/NetworkManager-dispatcher.service.dsh-also
<==
/etc/systemd/system/dbus-org.freedesktop.nm-dispatcher.service
==> /var/lib/systemd/deb-systemd-helper-enabled/ssh.socket.dsh-also <==
/etc/systemd/system/sockets.target.wants/ssh.socket
==> /var/lib/systemd/deb-systemd-helper-enabled/cgmanager.service.dsh-also <==
/etc/systemd/system/multi-user.target.wants/cgmanager.service
==>
/var/lib/systemd/deb-systemd-helper-enabled/dbus-org.freedesktop.Avahi.service
<==
==>
/var/lib/systemd/deb-systemd-helper-enabled/lvm2-activation-early.service.dsh-also
<==
/etc/systemd/system/local-fs.target.wants/lvm2-activation-early.service
==>
/var/lib/systemd/deb-systemd-helper-enabled/dbus-org.freedesktop.ModemManager1.service
<==
# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# <file system> <mount point> <type> <options> <dump> <pass>
/dev/mapper/group-system / ext4 errors=remount-ro 0 1
# /boot was on /dev/sdb1 during installation
UUID=fbe616bc-6539-400a-adeb-a8f61efd804d /boot ext4 defaults 0 2
# /boot/efi was on /dev/sda1 during installation
UUID=8BAA-2600 /boot/efi vfat utf8 0 0
/dev/mapper/frodo-home /home ext4 defaults 0 2
/dev/mapper/frodo-swap none swap sw 0 0
/dev/sr0 /media/cdrom0 udf,iso9660 user,noauto 0 0
nodev /mnt/huge_1GB hugetlbfs pagesize=1GB 0 0
--- End Message ---
--- Begin Message ---
Hi Cyril
On Sat, 13 Sep 2014 14:31:45 +0200 Cyril Soldani
<[email protected]> wrote:
> On Fri, 12 Sep 2014 19:25:21 +0200
> [email protected] (Marco d'Itri) wrote:
> > > 1) systemd *not* messing with the existing hugepages setup;
> >
> > This will not happen: it would be much too complex, and anyway the new
> > "standard" location is /dev/hugepages/ .
>
> It looks questionable to me, but you certainly know better (as I
> don't know anything about it) and I won't argue any further.
>
> > > 2) being warned when installing systemd-sysv that systemd handles
> > > hugepages differently (especially when I have hugepages entries
> > > in my fstab).
> >
> > But I think that we can add a preinst check, can you provide a simple
> > shell test case that checks for this condition?
>
> I must first mention that the problem is less severe than initially
> thought. As Ben Hutchings helped me discover (see #761299), you can
> still use hugepages even when mounted several times. Our problem was
> that the two mount points had different permissions, and our
> applications were using the wrong one. It is thus likely to affect
> fewer users than initially thought.
>
> If you are still willing to add a warning (which could still be nice,
> IMO), a test like this might be sufficient:
After more consideration, I decided to close this bug report:
We released jessie almost a year ago without adding such a check to the
maintainer scripts. So far no other user has reported the same issue,
so I assume this setup is very rarely used.
And as Ben pointed out, having hugetlbfs mounted a second time should
actually be no problem.
Adding such a preinst check after the fact will only add complexity to
our maintainer scripts for very little gain, I fear.
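For illustration only, a check of this kind could be as small as the
sketch below. It is not the test proposed earlier in the thread, and it
is not part of any maintainer script; the function name
fstab_has_hugetlbfs and the optional file argument are hypothetical.

```shell
# Exit 0 when the given fstab-style file (default: /etc/fstab) contains
# at least one non-comment entry whose type field is hugetlbfs,
# and exit 1 otherwise.
# fstab_has_hugetlbfs is a hypothetical name used for illustration only.
fstab_has_hugetlbfs() {
    awk '$1 !~ /^#/ && $3 == "hugetlbfs" { found = 1 }
         END { exit !found }' "${1:-/etc/fstab}"
}
```

A caller would use it as: if fstab_has_hugetlbfs; then echo "hugetlbfs
entry found in fstab"; fi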
If you don't want to use the default dev-hugepages.mount unit, you can
either mask it (via systemctl mask dev-hugepages.mount) and continue to
use the fstab entry, or copy that file to /etc/systemd/system/ and
edit it to your liking.
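The two options can be sketched as follows. The unit path is the one
given in the original report (/lib/systemd/system/dev-hugepages.mount).
The helper name override_hugepages_unit and its optional root-prefix
argument are hypothetical, added only so the copy step can be tried in a
scratch directory first.

```shell
# Option 1: disable the unit and keep using the fstab entry (run as root):
#   systemctl mask dev-hugepages.mount
#
# Option 2: copy the packaged unit to /etc/systemd/system/ for local edits.
# override_hugepages_unit is a hypothetical helper. Its optional first
# argument is a root prefix (empty by default, i.e. the real filesystem).
override_hugepages_unit() {
    root="${1:-}"
    src="$root/lib/systemd/system/dev-hugepages.mount"
    dst="$root/etc/systemd/system/dev-hugepages.mount"
    # Refuse to continue if the packaged unit is not where we expect it.
    [ -f "$src" ] || { echo "missing $src" >&2; return 1; }
    mkdir -p "$root/etc/systemd/system" && cp "$src" "$dst"
}
# After editing the copy on a live system, reload the unit files:
#   systemctl daemon-reload
```

A unit file in /etc/systemd/system/ takes precedence over the packaged
one of the same name in /lib/systemd/system/.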
Regards,
Michael
--
Why is it that all of the instruments seeking intelligent life in the
universe are pointed away from Earth?
--- End Message ---