This patch fixes a long-standing zombie-process leak in crond that
appears whenever crond itself is PID 1 (typical in minimal BusyBox
containers).

Reproducer
==========

   mkdir /tmp/test-root-crontabs -p
   echo "* * * * * sh -c 'sleep 1 &'" > "/tmp/test-root-crontabs/root"
   sudo chown root:root -R /tmp/test-root-crontabs

   ## Run busybox crond PID 1
   docker run -d --rm --name busybox \
          -v /tmp/test-root-crontabs/root:/var/spool/cron/crontabs/root:ro \
          busybox:1.37.0-uclibc \
          crond -f -d 0 > /dev/null

   watch -n 1 '
     echo "Expecting more zombies in $((60 - 10#$(date +%S)))s (Ctrl-C to 
quit):"
     ps -ax -o pid,ppid,stat,comm | egrep "\s+Z\s+"
   '
   echo "Cleaning busybox container"
   docker stop busybox


You should see one new "sleep" zombie each minute.

If crond is *not* PID 1 (e.g. started via a wrapper shell), the leak
does not appear because the wrapper—now PID 1—reaps the orphans.


Root cause
==========

crond only calls `waitpid()` for PIDs it tracks.  Background tasks
(`sleep 5 &`) become orphans; the kernel re-parents them to PID 1.
When crond *is* PID 1, it inherits these orphans but never waits for
them, so they persist as zombies.


Fix
===

Add a tiny reaper loop:

     while (waitpid(-1, NULL, WNOHANG) > 0);

Placed at the end of `check_completions()`, it collects any remaining
children.  On systems where crond is **not** PID 1 the loop returns
`-ECHILD` immediately, so behaviour and overhead are unchanged.


Testing
=======

* Reproduced the leak on `busybox:1.37.0-uclibc` and current git master.
* Applied the patch, rebuilt BusyBox statically (with only crond),
  bind-mounted the binary into a fresh container; zombie count stays at
  0 after X min.

  Suggested docker commmand for testing with compiled (static) binary in CWD:

    docker run --rm --name busybox \
        -v /tmp/test-root-crontabs/root:/var/spool/cron/crontabs/root:ro \
        -v $PWD/busybox:/bin/crond \
        busybox:1.37.0-uclibc \
        crond -f -d 0

* Verified crond still exits cleanly and runs scheduled jobs normally.

Binary size impact (gcc 13.3.0, x86-64, `-Os`, static): +17 bytes.

Thanks for your time and for maintaining BusyBox!

Regards,
Valentin Lab
>From 6bd536d80448485ebb5c12cb032286e97ddacb2a Mon Sep 17 00:00:00 2001
From: Valentin Lab <[email protected]>
Date: Sat, 31 May 2025 09:56:09 +0800
Subject: [PATCH] crond: reap orphaned grandchildren to prevent zombie buildup

If a cron job launches a background task, e.g. `sh -c "sleep 5 &"`,
the shell exits immediately and the `sleep` process is re-parented to
PID 1.  When BusyBox `crond` itself happens to be PID 1 (common in a
minimal container), those orphans become direct children of `crond`.
Because `crond` only calls waitpid() for the PIDs it explicitly tracks,
these processes remain forever in Z state and the container slowly
fills with zombies.

Add a small `while (waitpid(-1, NULL, WNOHANG) > 0)` sweep at the end
of check_completions() so any stray children are reaped.  When `crond`
is not PID 1 the loop returns -ECHILD immediately, so behaviour and
overhead on a normal system are unchanged.

Size impact: +12 bytes on x86-64 (gcc 13.3.0 -Os, static)

Signed-off-by: Valentin Lab <[email protected]>
---
 miscutils/crond.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/miscutils/crond.c b/miscutils/crond.c
index b3762d327..43d83b474 100644
--- a/miscutils/crond.c
+++ b/miscutils/crond.c
@@ -1001,6 +1001,10 @@ static int check_completions(void)
 			/* else: r == 0: "process is still running" */
 			file->cf_has_running = 1;
 		}
+
+		/* Reap any other children we don't actively track */
+		while (waitpid(-1, NULL, WNOHANG) > 0);
+
 //FIXME: if !file->cf_has_running && file->deleted: delete it!
 //otherwise deleted entries will stay forever, right?
 		num_still_running += file->cf_has_running;
-- 
2.43.0

_______________________________________________
busybox mailing list
[email protected]
https://lists.busybox.net/mailman/listinfo/busybox

Reply via email to