Re: [systemd-devel] RFC: temporarily deactivating udev rules during coldplug

2019-05-28 Thread Lennart Poettering
On Di, 28.05.19 12:04, Martin Wilck (mwi...@suse.de) wrote:

> We are facing problems during udev coldplug on certain very big systems
> (read: > 1000 CPUs, several TiB RAM). Basically, the systems get
> totally unresponsive immediately after coldplug starts, and remain so
> for minutes, causing uevents to time out. Attempts to track it down
> have shown that access to files on tmpfs (e.g. /run/udev/db) may take
> a very long time. Limiting the maximum number of udev workers helps,
> but doesn't seem to solve all problems we are seeing.
>
> Among the things we observed was lots of activity running certain udev
> rules which are executed for every device. One such example is the
> "vpdupdate" rule on Linux-PowerPC systems:
>
> https://sourceforge.net/p/linux-diag/libvpd/ci/master/tree/90-vpdupdate.rules

I am sorry, but this rule is bad, it hurts just looking at it. I don't
think we need to optimize our code for rules tht are that
broad. Please work with the package in question to optimize things,
and use finer grained and less ridiculous rules... (also: what for
even? to maintain a timestamp???)

> Another one is a SUSE specific rule that is run on CPU- or memory-
> events
> (https://github.com/openSUSE/kdump/blob/master/70-kdump.rules.in).
> It is triggered very often on large systems that may have 1s of
> memory devices.

This one isn't much better.

Please fix the rules to not do crazy stuff like forking off process in
gazillions of cases...

if you insist on forking of a process for every event under the sun,
then yes, things will be slow, but what can I say, you broke it, you
get to keep the pieces...

> These are rules that are worthwhile and necessary in a fully running
> system to respond to actual hotplug events, but that need not be run
> during coldplug, in particular not 1000s of times in a very short time
> span.

Sorry, but these rules are just awful, please make them finer grained,
without running shell scripts every time. I mean, your complaint is
basically: shell scripting isn't scalable... but dah, of course it
isn't, and the fix is not to do that then...

For example, in the kdump case, just pull in a singleton service
asynchronously, via SYSTEMD_WANTS for example. And if you want a
timestamp of the last device, then udev keeps that anyway for you, in
the USEC_IITIALIZED, per device.

> The idea is implemented with a simple shell script and two unit
> files.

Sorry, but we are not adding new shell scripts that work around awful
shell scripts to systemd. Please fix the actual problems, and work
with the maintainers of the packages causing those problems to fix
them, don't glue a workaround atop an ugly hack.

Sorry, but this is not an OK approach at all!

Lennart

--
Lennart Poettering, Berlin
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel

[systemd-devel] RFC: temporarily deactivating udev rules during coldplug

2019-05-28 Thread Martin Wilck
We are facing problems during udev coldplug on certain very big systems
(read: > 1000 CPUs, several TiB RAM). Basically, the systems get
totally unresponsive immediately after coldplug starts, and remain so
for minutes, causing uevents to time out. Attempts to track it down
have shown that access to files on tmpfs (e.g. /run/udev/db) may take
a very long time. Limiting the maximum number of udev workers helps,
but doesn't seem to solve all problems we are seeing.

Among the things we observed was lots of activity running certain udev
rules which are executed for every device. One such example is the
"vpdupdate" rule on Linux-PowerPC systems:

https://sourceforge.net/p/linux-diag/libvpd/ci/master/tree/90-vpdupdate.rules

Another one is a SUSE specific rule that is run on CPU- or memory-
events
(https://github.com/openSUSE/kdump/blob/master/70-kdump.rules.in). 
It is triggered very often on large systems that may have 1s of 
memory devices.

These are rules that are worthwhile and necessary in a fully running
system to respond to actual hotplug events, but that need not be run
during coldplug, in particular not 1000s of times in a very short time
span. 

Therefore I'd like to propose a scheme to deactivate certain rules
during coldplug. The idea involves 2 new configuration directories:

 /etc/udev/pre-trigger.d:

   "*.rules" files in this directory are copied to /run/udev/rules.d
   before starting "udev trigger". Normally these would be 0-byte files
   with a name corresponding to an actual rule file from
   /usr/lib/udev/rules.d - by putting them to /run/udev/rules.d,
   the original rules are masked.
   After "udev settle" finishes, either successfully or not, the
   files are removed from /run/udev/rules.d again.

 /etc/udev/post-settle.d:

   "*.post" files in this directory are executed after udev settle
   finishes. The intention is to create a "cumulative action", ideally
   equivalent to having run the masked-out rules during coldplug.
   This may or may not be necessary, depending on the rules being
   masked out. For the vpdupdate rule above, thus comes down to running
   "/bin/touch /run/run.vpdupdate". 

The idea is implemented with a simple shell script and two unit files.
The 2nd unit file is necessary because simply using systemd-udev-
settle's  "ExecStartPost" doesn't work - unmasking must be done even if
"udev settle" fails or times out. "ExecStopPost" doesn't work either,
we don't want to run this when systemd-udev-settle.service is stopped
after having been started successfully.

See details below. Comments welcome.
Also, would this qualify for inclusion in the systemd code base?

Martin


Shell script: /usr/lib/udev/coldplug.sh

#! /bin/sh
PRE_DIR=/etc/udev/pre-trigger.d
POST_DIR=/etc/udev/post-settle.d
RULES_DIR=/run/udev/rules.d

[ -d "$PRE_DIR" ] || exit 0
[ -d "$RULES_DIR" ] || exit 0

case $1 in
mask)
cd "$PRE_DIR"
for fl in *.rules; do
[ -e "$fl" ] || break
cp "$fl" "$RULES_DIR"
done
;;
unmask)
cd "$PRE_DIR"
for fl in *.rules; do
[ -e "$fl" ] || break
rm -f "$RULES_DIR/$fl"
done
;;
post)
[ -d "$POST_DIR" ] || exit 0
cd "$POST_DIR"
for fl in *.post; do
[ -e "$fl" ] || break
[ -x "$fl" ] || continue
./"$fl"
done
;;
*) echo "usage: $0 [mask|unmask|post]" >&2; exit 1;;
esac



Unit file: systemd-udev-pre-coldplug.service

[Unit]
Description=Mask udev rules before coldplug
DefaultDependencies=No
Conflicts=shutdown.target
Before=systemd-udev-trigger.service
Wants=systemd-udev-post-coldplug.service
ConditionDirectoryNotEmpty=/etc/udev/pre-trigger.d
ConditionPathIsDirectory=/run/udev/rules.d
ConditionFileIsExecutable=/usr/lib/udev/coldplug.sh

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=-/usr/lib/udev/coldplug.sh mask
ExecStop=-/usr/lib/udev/coldplug.sh unmask

[Install]
WantedBy=sysinit.target



Unit file: systemd-udev-post-coldplug.service

[Unit]
Description=Reactivate udev rules after coldplug
DefaultDependencies=No
Conflicts=shutdown.target
After=systemd-udev-settle.service
ConditionDirectoryNotEmpty=/etc/udev/pre-trigger.d
ConditionPathIsDirectory=/run/udev/rules.d
ConditionFileIsExecutable=/usr/lib/udev/coldplug.sh

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=-/usr/bin/systemctl stop systemd-udev-pre-coldplug.service
ExecStart=-/usr/lib/udev/coldplug.sh post

[Install]
WantedBy=sysinit.target


___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel