On Sat, Dec 26, 2020 at 11:31:32AM +0100, Otto Moerbeek wrote: > On Sat, Dec 26, 2020 at 11:07:00AM +0100, Otto Moerbeek wrote: > > > On Fri, Dec 25, 2020 at 02:04:03PM +0100, Otto Moerbeek wrote: > > > > > On Fri, Dec 25, 2020 at 12:59:10PM +0100, Otto Moerbeek wrote: > > > > > > > On Fri, Dec 25, 2020 at 12:35:57PM +0100, Mark Kettenis wrote: > > > > > > > > > > Date: Fri, 25 Dec 2020 11:34:47 +0100 > > > > > > From: Otto Moerbeek <o...@drijf.net> > > > > > > > > > > > > On Thu, Dec 24, 2020 at 01:29:28PM +0100, Uli Schlachter wrote: > > > > > > > > > > > > > Hi, > > > > > > > > > > > > > > due to that other thread, it occurred to me that getaddrinfo() > > > > > > > also has > > > > > > > another bug: It leaks memory. _asr_use_resolver() allocates memory > > > > > > > per-thread (via _asr_resolver()) and saves it via > > > > > > > _THREAD_PRIVATE() in > > > > > > > _asr, but nothing frees that memory. A reproducer follows bellow. > > > > > > > On > > > > > > > Debian, no memory leak is observed (= RES in top stays constant > > > > > > > over time). > > > > > > > > > > > > > > I have no good suggestion for how to fix this leak, but I feel > > > > > > > like this > > > > > > > might also be helpful in fixing the thread unsafety from "that > > > > > > > other > > > > > > > thread". Both bugs originate from storing a pointer to an > > > > > > > allocation via > > > > > > > _THREAD_PRIVATE(), which is something that does not really work > > > > > > > with > > > > > > > that API. > > > > > > > > > > > > > > IMHO this internal API needs to change. At this point, one can > > > > > > > also fix > > > > > > > the other problem by having an API that guarantees that each > > > > > > > thread gets > > > > > > > zeroed per-thread data instead of memcpy()ing from a global > > > > > > > default. > > > > > > > > > > > > > > Other users of _THREAD_PRIVATE() instead seem to only store > > > > > > > buffers, > > > > > > > e.g. strerror_l() or localtime(). These buffers do not need extra > > > > > > > cleanup. > > > > > > > > > > > > > > Reproducer (sorry for the line wrapping; this is basically just > > > > > > > the > > > > > > > previous example, but without calling getaddrinfo() on the main > > > > > > > thread: > > > > > > > lots of threads are started and each thread calls getaddrinfo() > > > > > > > once): > > > > > > > > > > > > > > #include <pthread.h> > > > > > > > #include <sys/types.h> > > > > > > > #include <sys/socket.h> > > > > > > > #include <netdb.h> > > > > > > > #include <stdio.h> > > > > > > > #include <string.h> > > > > > > > > > > > > > > #define NUM_THREADS 50 > > > > > > > > > > > > > > static void do_lookup(const char *host) > > > > > > > { > > > > > > > int s; > > > > > > > struct addrinfo hints; > > > > > > > struct addrinfo *result; > > > > > > > > > > > > > > memset(&hints, 0, sizeof(hints)); > > > > > > > hints.ai_family = AF_UNSPEC; > > > > > > > hints.ai_socktype = SOCK_STREAM; > > > > > > > hints.ai_flags = AI_ADDRCONFIG; > > > > > > > hints.ai_protocol = IPPROTO_TCP; > > > > > > > > > > > > > > s = getaddrinfo(host, NULL, &hints, &result); > > > > > > > if (s != 0) { > > > > > > > fprintf(stderr, "Lookup error for %s: %s\n", host, > > > > > > > gai_strerror(s)); > > > > > > > } else { > > > > > > > freeaddrinfo(result); > > > > > > > } > > > > > > > } > > > > > > > > > > > > > > static void * > > > > > > > do_things(void *arg) > > > > > > > { > > > > > > > (void) arg; > > > > > > > do_lookup("ipv4.google.com"); > > > > > > > return NULL; > > > > > > > } > > > > > > > > > > > > > > int main() > > > > > > > { > > > > > > > pthread_t threads[NUM_THREADS]; > > > > > > > int i; > > > > > > > int s; > > > > > > > > > > > > > > for (;;) { > > > > > > > for (i = 0; i < NUM_THREADS; i++) { > > > > > > > s = pthread_create(&threads[i], NULL, > > > > > > > do_things, NULL); > > > > > > > if (s != 0) > > > > > > > fprintf(stderr, "Error creating > > > > > > > thread"); > > > > > > > } > > > > > > > for (i = 0; i < NUM_THREADS; i++) { > > > > > > > pthread_join(threads[i], NULL); > > > > > > > } > > > > > > > } > > > > > > > return 0; > > > > > > > } > > > > > > > > > > > > > > Cheers, > > > > > > > Uli > > > > > > > -- > > > > > > > This can be a, a little complicated. Listen, my advice is... ask > > > > > > > somebody else for advice, at least someone who's... got more > > > > > > > experience > > > > > > > at... giving advice. > > > > > > > > > > > > > > > > > > > Hoi, > > > > > > > > > > > > the diff (which is certainly wip) below fixes the asr related leaks. > > > > > > There's still a leak on creating/destroying threads, in particular > > > > > > the > > > > > > stacks do not seem to be unmapped. > > > > > > > > > > That is (somewhat) expected. Stacks are recycled if they are > > > > > allocated using "default" parameters; see _rthread_free_stack(). > > > > > > > > yes, agreed, they do seem to be reused. I was looking with valgrind, > > > > and this seem to inrtroduce a bug where the stack reuse is effectively > > > > disabled. So this observation was a fluke. > > > > > > > > Still seeing growing mem usage when I create and destroy threads, > > > > though. Strange thing is that while top show increasing SIZE and RES, > > > > ktrace shows no syscall that do allocations, just repeating > > > > > > > > 67230/276969 a.out RET __tfork 445287/0x6cb67 > > > > 67230/614016 a.out RET futex 0 > > > > 67230/521766 a.out CALL __threxit(0xfe46b76aa30) > > > > 67230/276969 a.out CALL __tfork(0x7f7ffffc5e80,24) > > > > 67230/614016 a.out CALL __threxit(0xfe424f44230) > > > > 67230/276969 a.out STRU struct __tfork { tcb=0xfe424f44400, > > > > tid=0xfe424f44430, stack=0xfe3eed35630 } > > > > 67230/445287 a.out RET __tfork 0 > > > > 67230/276969 a.out RET __tfork 118126/0x1cd6e > > > > 67230/445287 a.out CALL futex(0xfe3fa8d7448,0x2<FUTEX_WAKE>,1,0,0) > > > > 67230/276969 a.out CALL __tfork(0x7f7ffffc5e80,24) > > > > 67230/118126 a.out RET __tfork 0 > > > > 67230/445287 a.out RET futex 0 > > > > 67230/276969 a.out STRU struct __tfork { tcb=0xfe3fa8d7200, > > > > tid=0xfe3fa8d7230, stack=0xfe403a719b0 } > > > > 67230/118126 a.out CALL futex(0xfe424f44448,0x2<FUTEX_WAKE>,1,0,0) > > > > 67230/445287 a.out CALL __threxit(0xfe3fa8d7430) > > > > 67230/276969 a.out RET __tfork 456926/0x6f8de > > > > > > > > > > > > Could this indicate that one of these syscalls introduces a mem leak > > > > in the process? > > > > > > Using procmap I'm seeing single page anon regions being added to the > > > process. > > > A suspect would be uvm_uarea_alloc() in thread_fork. > > > > Added some debug code and uvm_uarea_free() is called on the thread (in > > the repaer) shortly after the thread exists. So that part seems to be > > ok. Puzzled what other mapping could show up in the process when > > creating/destroying threads. > > > > -Otto > > > > Sorry, I misdiagnosed, the process itself is doing the occasioal mmap > itself. It happens every +/- 1000 iterations when runinng the > noop tets program with two threads. Now I need to find out where that > mmap call is coming form. > > -Otto >
The extra mmap calls were coming from ld.so (that's why I did not spot in form gdb). After some head scratching it turns out to be a stupid thing in ld.so/malloc.c Adding this to the already posted diff makes the leak go away and the noop threading test runs in constant memory. I introduced this error myself in ld.so/malloc.c rev 1.24. -Otto Index: malloc.c =================================================================== RCS file: /cvs/src/libexec/ld.so/malloc.c,v retrieving revision 1.30 diff -u -p -U10 -r1.30 malloc.c --- malloc.c 30 Sep 2019 03:35:09 -0000 1.30 +++ malloc.c 26 Dec 2020 12:32:31 -0000 @@ -575,23 +575,20 @@ omalloc_make_chunks(struct dir_info *d, if (pp == MAP_FAILED) return NULL; bp = alloc_chunk_info(d, bits); if (bp == NULL) goto err; /* memory protect the page allocated in the malloc(0) case */ if (bits == 0 && _dl_mprotect(pp, MALLOC_PAGESIZE, PROT_NONE) < 0) goto err; - bp = alloc_chunk_info(d, bits); - if (bp == NULL) - goto err; bp->page = pp; if (insert(d, (void *)((uintptr_t)pp | (bits + 1)), (uintptr_t)bp)) goto err; LIST_INSERT_HEAD(&d->chunk_dir[bits][listnum], bp, entries); return bp; err: unmap(d, pp, MALLOC_PAGESIZE, MALLOC_JUNK); return NULL;