I found this issue on my esoteric NOMMU uClinux system (32-bit ARM with no
MMU, using an FDPIC compiler and libraries). Nobody's going to replicate that
system, but the good news is that the problem can be demonstrated on a
vanilla x86-64 Linux system (I'm using Ubuntu 24.04.2).

So the big picture is that the simple command line:

while true; do  v=$(cat /dev/null); done;

will demonstrate the problem. It doesn't matter if "cat" is a busybox
built-in or not, and any other program will have the same issue, e.g.
v=$(echo "no way")

Running that on the interactive hush command line, while watching the
process with top in another window, looks like this (with the repeated
copies of the header deleted):

top - 21:02:04 up 1 day,  5:32,  1 user,  load average: 0.09, 0.41, 0.41
Tasks:   1 total,   0 running,   1 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.3 us,  0.2 sy,  0.0 ni, 97.9 id,  0.0 wa,  0.0 hi,  1.7 si,  0.0 st
MiB Mem :   3915.9 total,    314.8 free,   2066.2 used,   1386.0 buff/cache
MiB Swap:   3914.0 total,   3801.7 free,    112.3 used.   1849.7 avail Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 945506 harry     20   0    4408   2216   2088 S   0.0   0.1   0:00.00 hush
 945506 harry     20   0    4408   2344   2216 R   1.0   0.1   0:00.03 hush
 945506 harry     20   0    4536   2344   2216 S   3.7   0.1   0:00.14 hush
 945506 harry     20   0    4664   2472   2216 S   3.7   0.1   0:00.25 hush
 945506 harry     20   0    4792   2728   2216 S   3.3   0.1   0:00.35 hush
 945506 harry     20   0    4920   2728   2216 S   4.0   0.1   0:00.47 hush
 945506 harry     20   0    5048   2856   2216 S   3.3   0.1   0:00.57 hush
 945506 harry     20   0    5304   3112   2216 S   4.0   0.1   0:00.69 hush
 945506 harry     20   0    5432   3240   2216 R   3.3   0.1   0:00.79 hush
 945506 harry     20   0    5560   3240   2216 R   3.3   0.1   0:00.89 hush
 945506 harry     20   0    5688   3368   2216 S   2.7   0.1   0:00.97 hush
 945506 harry     20   0    5688   3368   2216 S   0.0   0.1   0:00.97 hush
 945506 harry     20   0    5688   3368   2216 S   0.0   0.1   0:00.97 hush

Reading the top output: before I typed the script, memory usage was steady
and the process used no CPU. While the loop was running, it used CPU and
memory usage climbed. When I stopped the script with Ctrl-C, the CPU usage
stopped, but the memory wasn't freed. Sure, v still exists, so some tiny
amount of additional memory (too small to see at top's granularity) is
still in use, but the large growth didn't go away. Digging into the proc
filesystem shows that this memory growth is in the heap, as expected.
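
For anyone who wants to repeat the /proc check on the x86-64 box, here is a
minimal sketch. It assumes the usual MMU Linux layout of /proc/PID/maps
(range, perms, offset, dev, inode, path); the helper name heap_kib is made
up for illustration, and a NOMMU kernel formats this file differently.

```shell
# heap_kib: read /proc/PID/maps-style text on stdin and print the total
# size, in KiB, of every mapping whose path column is "[heap]".
# (Hypothetical helper name; the field layout is an assumption, see above.)
heap_kib() {
    total=0
    while read -r range perms offset dev inode path; do
        # Skip everything except the brk heap mapping
        [ "$path" = "[heap]" ] || continue
        start=${range%-*}   # text before the dash: start address (hex)
        end=${range#*-}     # text after the dash: end address (hex)
        total=$(( total + (0x$end - 0x$start) / 1024 ))
    done
    echo "$total"
}
```

Running "heap_kib < /proc/945506/maps" once a second from a second
terminal (945506 being the hush PID from the top output above) should show
the number climbing while the loop runs, if the growth really is in the
brk heap.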

Curiously, if the loop is instead "for x in {1..1000000};" it doesn't seem
to leak. Also, you can execute pretty much anything inside the infinite
loop without problems, except for any construct that captures the stdout of
a program, i.e. v=$(program) or v=`program`. Running a program and examining
its exit status does not leak. Infinite loops matter because my real script
does a lot of things, then sleeps for 2 seconds, and I need it to run
forever. That sleep slows down the runaway memory consumption, but
eventually it's doomed. Adding a sleep to the loop can be helpful when
trying to debug this problem.

Attached are several things: my busybox config file and my mods to hush.c
(the only change necessary to demonstrate the leak is to set BUILD_AS_NOMMU
to 1). My other changes greatly improve the poor LEAK_HUNTING support
previously provided, and there is a new script to process the output. The
old functions and script had a lot of drawbacks, including ignoring
xasprintf() and not really matching allocations with frees. It is extremely
common for a malloc() that is later freed to be followed by a similar-sized
malloc() that returns the exact same pointer, but the old leak hunting code
and script would remove all occurrences of any memory allocated at that
pointer on a single free(), potentially hiding leaks. The old system also
didn't handle realloc() very well: if realloc() doesn't return the original
pointer, then realloc() itself freed the old pointer, with no free() being
logged and no way to know which pointer was passed to realloc(). My version
fixes these problems and also reports (non-NULL) free() calls for pointers
that never appeared in the allocated list. There is still a modest number
of those in general, which shows that some allocations still aren't being
tracked.
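
To illustrate the matching rules described above, here is a minimal sketch.
It is not the attached leaktool.sh, and the log format is invented for the
example: "alloc PTR SIZE LINE", "realloc OLDPTR NEWPTR SIZE LINE" and
"free PTR LINE", with 0x0 standing for NULL. The real LEAK_HUNTING output
may look quite different.

```shell
# leak_report: replay an allocation log in order and print what is still
# live at the end. A free() retires only the allocation currently live at
# that pointer, so a later malloc() that reuses the address is tracked as a
# new allocation. Hypothetical log format, see the note above.
leak_report() {
    awk '
        $1 == "alloc" { live[$2] = "line " $4 ", " $3 " bytes" }
        $1 == "realloc" {
            # The old pointer is retired whether or not the block moved;
            # realloc(NULL, n) behaves like malloc(n).
            if ($2 != "0x0") {
                if ($2 in live) { delete live[$2] }
                else { print "untracked realloc of " $2 " at line " $5 }
            }
            live[$3] = "line " $5 ", " $4 " bytes"
        }
        $1 == "free" {
            if ($2 in live) { delete live[$2] }
            else if ($2 != "0x0") { print "untracked free of " $2 " at line " $3 }
        }
        END { for (p in live) print "leak: " p " from " live[p] }
    ' "$@" | sort
}
```

Fed the five sample lines "alloc 0x1000 32 100", "free 0x1000 110",
"alloc 0x1000 32 120", "realloc 0x1000 0x2000 64 3280" and
"free 0x3000 200", it prints "leak: 0x2000 from line 3280, 64 bytes"
followed by "untracked free of 0x3000 at line 200". The reused address
0x1000 is not reported, because the realloc() retired the second allocation
at that address.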

I haven't figured out how to fix the problem in hush, but I do know that,
for each iteration of the loop, there are at least several realloc() calls
on line 3280 (in o_save_ptr_helper()) and one on line 3041 (in
o_grow_by()). The line numbers are from my patched file. But I can't figure
out when or how that memory should be freed, and there may be additional
leaks too. I'm really hoping that the clues I've provided will allow a hush
maintainer to quickly figure this out, because I've been trying for a while
without success.

Another clue is that the leaking allocation is probably inside the
vfork/exec path, because I also ran busybox's hush under valgrind using
massif and it didn't even see the memory growth that top was showing. But it
was also constantly complaining about the program (cat in the original
example) not being launched within valgrind, so I think those allocations
weren't being seen by valgrind.

Thank you for any help and the fantastic product that is busybox.
harry

Attachment: leaktool.sh
Description: application/shellscript

Attachment: my_busybox_.config
Description: Binary data

Attachment: patch
Description: Binary data
