On Thu, 30 Mar 2006, Claus Assmann wrote:

Is there some "simple" way to find a memory leak in some OS supplied
library? I have a (constantly running) application that grows in a
week from 5MB to 15MB in size (VSZ and RSS as reported by ps). The
application can be compiled with an optional debugging memory
allocator that tracks all (de)allocations to check whether any of
its malloc()/free() calls leak memory; according to that tool the
application behaves fine.  Hence I'm wondering whether there is a
memory leak in some library or the OS, which also could be triggered
by the way my application uses it (see the recent thread about
telldir()/seekdir()).

The approach that I used with smbd to find the telldir()/seekdir()
leaks wasn't "simple", and it did involve some patience and trial
and error, but I'm not sure it was particularly complicated either.

Basically, I recompiled smbd to link with the dmalloc library which
is designed to augment the system malloc()-type calls. The samba
devel team thoughtfully includes developer code to support this -- I
simply needed to recompile the app and turn it on. Then I ran the
program with the test case that caused it to leak memory and watched
the dmalloc output as smbd exited. This confirmed the leak; dmalloc
catches leaks even if the underlying leaking malloc() is buried in a
library call -- or at least it did in this case.

Unfortunately, dmalloc only reported a stack return address to
identify the culprit, so I ended up having to trace through the code
and narrow the issue using dmalloc mark and reporting calls. If this
is your own code, then this should be significantly easier than
tracing samba code -- but I had to go through this step to
understand what part of samba was leaking memory. At this
point, my assumption was that it was something odd that samba was
doing for my particular install. Eventually, once I'd narrowed the
leaking code in samba, I ended up attaching to the process using gdb
and determining where the return address on the stack was pointing.
In my case, that was in the middle of telldir().
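
The mark-and-report narrowing described above can be sketched roughly
as follows. This is not dmalloc's API: trk_malloc(), trk_free(),
trk_mark() and trk_report() are hypothetical names made up for
illustration, and the two suspect_*() routines are stand-ins for the
code being bisected. Unlike dmalloc, which is linked in place of the
system malloc(), a wrapper like this only sees allocations made
through it, so it would not catch a leak inside libc itself.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical bookkeeping: NOT part of dmalloc or libc. */
static unsigned long live_blocks;   /* allocations not yet freed */

/* Wrapper the code under test calls instead of malloc(). */
void *trk_malloc(size_t n)
{
    void *p = malloc(n);
    if (p != NULL)
        live_blocks++;
    return p;
}

/* Wrapper the code under test calls instead of free(). */
void trk_free(void *p)
{
    if (p != NULL) {
        assert(live_blocks > 0);    /* a free without a matching alloc */
        live_blocks--;
        free(p);
    }
}

/* Take a "mark": remember how many blocks are live right now. */
unsigned long trk_mark(void)
{
    return live_blocks;
}

/* Report how many blocks allocated since the mark are still live. */
long trk_report(const char *label, unsigned long mark)
{
    long delta = (long)live_blocks - (long)mark;
    fprintf(stderr, "%s: %ld block(s) still allocated since mark\n",
        label, delta);
    return delta;
}

/* Stand-in for a routine that frees what it allocates. */
void suspect_ok(void)
{
    char *p = trk_malloc(64);
    trk_free(p);
}

/* Stand-in for a routine that forgets to free. */
void suspect_leaky(void)
{
    (void)trk_malloc(64);           /* deliberately never freed */
}
```

Moving the trk_mark()/trk_report() pair closer and closer around the
suspect code narrows down which call leaves blocks behind.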

If you choose to try dmalloc (dmalloc.com), there are some very nice
tutorials on their website for using a debugger to help track memory
issues.

I also used the internal libc malloc() debug options to help confirm
the memory leak, though I wasn't as successful at identifying the
leak with it. It did provide another avenue to confirm that the app was
leaking. There is test code floating around on the telldir() threads
in tech@ that might give you a template for using it, though this
may require a recompile of libc to turn on the MALLOC_STATS option.
It may be simpler to man malloc to see the easiest method for enabling
the memory debugging code buried in libc.
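
Independent of the libc debug code, growth can also be confirmed from
inside the process with getrusage(). The following is only a minimal
sketch under that assumption: peak_rss() and log_peak_rss() are
hypothetical helper names, ru_maxrss is the peak (not current)
resident size, and its units are platform dependent. It shows that
the process grew, not which caller allocated the memory.

```c
#include <sys/time.h>
#include <sys/resource.h>
#include <stdio.h>

/*
 * Return the peak resident set size reported by getrusage(),
 * or -1 on error. Units are platform dependent.
 */
long peak_rss(void)
{
    struct rusage ru;

    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    return ru.ru_maxrss;
}

/* Log the peak RSS with a label and return the value logged. */
long log_peak_rss(const char *label)
{
    long rss = peak_rss();

    fprintf(stderr, "%s: peak RSS %ld\n", label, rss);
    return rss;
}
```

Calling log_peak_rss() periodically, or before and after the suspect
test case, gives a record of growth to compare against what ps shows.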

Maybe someone else on the list can give some insight into the "dump"
results of the malloc() stats to see if there is a way to determine
the caller, maybe in conjunction with gdb?

So, that's the approach that worked for me. There may be much
simpler approaches and/or tools depending on the code you are
working on. I'm far from an expert at this ...

Good hunting.

 - Paul

My application uses pthreads and the DNS
resolver, the latter by contacting it via UDP: sendto(), recvfrom().
Note: the memory leak seems to be unique to OpenBSD (3.8 and earlier),
I can't reproduce it on SunOS 5.9 and others. That's why I'm asking
for hints where to look for the leak: is there some "simple" way
to show the allocated memory in the debugger or via system calls
and to find out which functions made those allocations?
