I left my Guix System-based web server running for 26 days, and PID 1
(shepherd) has ballooned to consume about 75% of total RAM.  Because of
this, it can no longer fork, which in turn means the system is almost but
not quite dead in the water.  Daemons that are already running, such as
the actual web server, are fine, but any transient service -- like ssh --
won't start.  I could log in on the console, because getty was already
running, but `reboot` just hangs, and if I log out I expect it won't be
able to start another getty process.

Here is some relevant troubleshooting info:

# uptime
19:08:57  up 26 days 20:17,  1 user,  load average: 0.01, 0.02, 0.00

# free
               total        used        free      shared  buff/cache   available
Mem:         2020468     1768960      103008        6472      307064      251508
Swap:        2094056      168268     1925788

# ps -p 1 lc
F   UID     PID    PPID PRI  NI    VSZ   RSS WCHAN  STAT TTY        TIME COMMAND
4     0       1       0  20   0 1988980 1528612 do_epo Sl ?       175:14 shepher

# grep -v MARK messages
2025-09-14 22:00:48 localhost shepherd[1]: Rotating '/var/log/messages' to '/var/log/messages.1'.
2025-09-14 22:00:48 localhost linux: [1638517.256304] __vm_enough_memory: pid: 1, comm: shepherd, bytes: 8388608 not enough memory for the allocation
2025-09-14 22:00:48 localhost shepherd[1]: Exception caught while calling action of timer 'log-rotation': (system-error "primitive-fork" "~A" ("Cannot allocate memory") (12))
2025-09-22 19:06:33 localhost shepherd[1]: Stopping service root...
2025-09-22 19:06:33 localhost shepherd[1]: Exiting shepherd...
2025-09-22 19:06:33 localhost shepherd[1]: Service guix-ownership is not running.
2025-09-22 19:06:33 localhost shepherd[1]: Service user-homes is not running.
2025-09-22 19:06:33 localhost shepherd[1]: Stopping service swap-7cb6821e-5fbb-48b1-85f8-74b4c41e9b7f...
2025-09-22 19:06:33 localhost linux: [2319321.058327] __vm_enough_memory: pid: 1, comm: shepherd, bytes: 2144313344 not enough memory for the allocation
2025-09-22 19:06:33 localhost shepherd[1]: Ignoring error while stopping swap-7cb6821e-5fbb-48b1-85f8-74b4c41e9b7f: (system-error "swapoff" "~S: ~A" ("/dev/vda2" "Cannot allocate memory") (12))
2025-09-22 19:06:33 localhost shepherd[1]: Service swap-7cb6821e-5fbb-48b1-85f8-74b4c41e9b7f might have failed to stop.
2025-09-22 19:06:33 localhost shepherd[1]: Service swap-7cb6821e-5fbb-48b1-85f8-74b4c41e9b7f is now stopped.
2025-09-22 19:06:34 localhost shepherd[1]: Stopping service ntpd...
2025-09-22 19:06:34 localhost ntpd[134]: ntpd exiting on signal 15 (Terminated)
2025-09-22 19:06:34 localhost shepherd[1]: Service ntpd stopped.
2025-09-22 19:06:34 localhost shepherd[1]: Service ntpd is now stopped.
2025-09-22 19:06:34 localhost shepherd[1]: Stopping service ssh-daemon...
2025-09-22 19:06:34 localhost shepherd[1]: Service ssh-daemon stopped.
2025-09-22 19:06:34 localhost shepherd[1]: Service ssh-daemon is now stopped.
2025-09-22 19:06:34 localhost shepherd[1]: Stopping service certbot-certificate-renewal...
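
For anyone who wants to check the 75% figure, here is a minimal sketch of
the arithmetic in Python.  It divides the RSS column of the `ps` output by
the `Mem:` total from `free`, both in KiB.  The function names are made up
for illustration, and the parsing assumes the procps column layouts shown
above.

```python
# Sample data copied from the `ps -p 1 lc` and `free` output in this report.
PS_OUTPUT = """\
F   UID     PID    PPID PRI  NI    VSZ   RSS WCHAN  STAT TTY        TIME COMMAND
4     0       1       0  20   0 1988980 1528612 do_epo Sl ?       175:14 shepher
"""

FREE_OUTPUT = """\
               total        used        free      shared  buff/cache   available
Mem:         2020468     1768960      103008        6472      307064      251508
Swap:        2094056      168268     1925788
"""


def rss_kib(ps_output):
    """Return the RSS value (KiB) from the first data row of ps output."""
    header, row = ps_output.strip().splitlines()[:2]
    # Locate the RSS column by name instead of hard-coding its position.
    column = header.split().index("RSS")
    return int(row.split()[column])


def mem_total_kib(free_output):
    """Return the total physical memory (KiB) from the Mem: line of free."""
    for line in free_output.splitlines():
        if line.startswith("Mem:"):
            return int(line.split()[1])
    raise ValueError("no 'Mem:' line found in free output")


def rss_percent(ps_output, free_output):
    """Percentage of total RAM resident in the process shown by ps."""
    return 100.0 * rss_kib(ps_output) / mem_total_kib(free_output)


if __name__ == "__main__":
    print(f"PID 1 RSS: {rss_percent(PS_OUTPUT, FREE_OUTPUT):.1f}% of total RAM")
```

Note that this measures resident memory only.  PID 1 may also have pages in
swap, which this calculation does not count.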

--

Closely related issue: For situations just such as this, reboot(8) is
supposed to have an option (conventionally `-f/--force`) which causes it
to issue the reboot system call itself, bypassing init.  But the
Shepherd's version of reboot is missing this option.
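
To illustrate what such an option would do, here is a minimal sketch in
Python of calling the reboot system call directly through glibc's
`reboot()` wrapper.  This is not the Shepherd's code or any existing
tool's implementation; `force_reboot` and its `dry_run` parameter are
hypothetical names.  It assumes Linux with glibc and must run as root
(CAP_SYS_BOOT).  It stops no services and unmounts no file systems, so
treat it as a last resort.

```python
import ctypes
import os

# RB_AUTOBOOT from <sys/reboot.h>, equal to the kernel's
# LINUX_REBOOT_CMD_RESTART: restart the machine immediately.
RB_AUTOBOOT = 0x01234567


def force_reboot(dry_run=True):
    """Reboot without involving init (hypothetical helper).

    With dry_run=True (the default) nothing happens; the command value
    that would be passed to reboot() is returned instead.
    """
    if dry_run:
        return RB_AUTOBOOT

    # Flush dirty buffers to disk first; reboot() itself does not sync.
    os.sync()

    # CDLL(None) gives access to the symbols of the already-loaded libc.
    libc = ctypes.CDLL(None, use_errno=True)
    if libc.reboot(RB_AUTOBOOT) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
```

Calling `force_reboot(dry_run=False)` from a root shell would restart the
machine immediately.  The default is a dry run so that importing or
testing the code cannot reboot anything by accident.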

--

I was already pretty frustrated with Guix System and this memory leak is
the last straw.  This server is shortly going to be reformatted with
another distribution.  However, I will preserve a disk image in case it
is useful to anyone.

zw


