On Sat, Dec 26, 2020 at 01:39:55PM +0100, Otto Moerbeek wrote:
> On Sat, Dec 26, 2020 at 11:31:32AM +0100, Otto Moerbeek wrote:
> 
> > On Sat, Dec 26, 2020 at 11:07:00AM +0100, Otto Moerbeek wrote:
> > 
> > > On Fri, Dec 25, 2020 at 02:04:03PM +0100, Otto Moerbeek wrote:
> > > 
> > > > On Fri, Dec 25, 2020 at 12:59:10PM +0100, Otto Moerbeek wrote:
> > > > 
> > > > > On Fri, Dec 25, 2020 at 12:35:57PM +0100, Mark Kettenis wrote:
> > > > > 
> > > > > > > Date: Fri, 25 Dec 2020 11:34:47 +0100
> > > > > > > From: Otto Moerbeek <o...@drijf.net>
> > > > > > > 
> > > > > > > On Thu, Dec 24, 2020 at 01:29:28PM +0100, Uli Schlachter wrote:
> > > > > > > 
> > > > > > > > Hi,
> > > > > > > > 
> > > > > > > > due to that other thread, it occurred to me that getaddrinfo() 
> > > > > > > > also has
> > > > > > > > another bug: It leaks memory. _asr_use_resolver() allocates 
> > > > > > > > memory
> > > > > > > > per-thread (via _asr_resolver()) and saves it via 
> > > > > > > > _THREAD_PRIVATE() in
> > > > > > > > _asr, but nothing frees that memory. A reproducer follows 
> > > > > > > > bellow. On
> > > > > > > > Debian, no memory leak is observed (= RES in top stays constant 
> > > > > > > > over time).
> > > > > > > > 
> > > > > > > > I have no good suggestion for how to fix this leak, but I feel 
> > > > > > > > like this
> > > > > > > > might also be helpful in fixing the thread unsafety from "that 
> > > > > > > > other
> > > > > > > > thread". Both bugs originate from storing a pointer to an 
> > > > > > > > allocation via
> > > > > > > > _THREAD_PRIVATE(), which is something that does not really work 
> > > > > > > > with
> > > > > > > > that API.
> > > > > > > > 
> > > > > > > > IMHO this internal API needs to change. At this point, one can 
> > > > > > > > also fix
> > > > > > > > the other problem by having an API that guarantees that each 
> > > > > > > > thread gets
> > > > > > > > zeroed per-thread data instead of memcpy()ing from a global 
> > > > > > > > default.
> > > > > > > > 
> > > > > > > > Other users of _THREAD_PRIVATE() instead seem to only store 
> > > > > > > > buffers,
> > > > > > > > e.g. strerror_l() or localtime(). These buffers do not need 
> > > > > > > > extra cleanup.
> > > > > > > > 
> > > > > > > > Reproducer (sorry for the line wrapping; this is basically just 
> > > > > > > > the
> > > > > > > > previous example, but without calling getaddrinfo() on the main 
> > > > > > > > thread:
> > > > > > > > lots of threads are started and each thread calls getaddrinfo() 
> > > > > > > > once):
> > > > > > > > 
> > > > > > > > #include <pthread.h>
> > > > > > > > #include <sys/types.h>
> > > > > > > > #include <sys/socket.h>
> > > > > > > > #include <netdb.h>
> > > > > > > > #include <stdio.h>
> > > > > > > > #include <string.h>
> > > > > > > > 
> > > > > > > > #define NUM_THREADS 50
> > > > > > > > 
> > > > > > > > static void do_lookup(const char *host)
> > > > > > > > {
> > > > > > > >         int s;
> > > > > > > >         struct addrinfo hints;
> > > > > > > >         struct addrinfo *result;
> > > > > > > > 
> > > > > > > >         memset(&hints, 0, sizeof(hints));
> > > > > > > >         hints.ai_family = AF_UNSPEC;
> > > > > > > >         hints.ai_socktype = SOCK_STREAM;
> > > > > > > >         hints.ai_flags = AI_ADDRCONFIG;
> > > > > > > >         hints.ai_protocol = IPPROTO_TCP;
> > > > > > > > 
> > > > > > > >         s = getaddrinfo(host, NULL, &hints, &result);
> > > > > > > >         if (s != 0) {
> > > > > > > >                 fprintf(stderr, "Lookup error for %s: %s\n", 
> > > > > > > > host, gai_strerror(s));
> > > > > > > >         } else {
> > > > > > > >                 freeaddrinfo(result);
> > > > > > > >         }
> > > > > > > > }
> > > > > > > > 
> > > > > > > > static void *
> > > > > > > > do_things(void *arg)
> > > > > > > > {
> > > > > > > >         (void) arg;
> > > > > > > >         do_lookup("ipv4.google.com");
> > > > > > > >         return NULL;
> > > > > > > > }
> > > > > > > > 
> > > > > > > > int main()
> > > > > > > > {
> > > > > > > >         pthread_t threads[NUM_THREADS];
> > > > > > > >         int i;
> > > > > > > >         int s;
> > > > > > > > 
> > > > > > > >         for (;;) {
> > > > > > > >                 for (i = 0; i < NUM_THREADS; i++) {
> > > > > > > >                         s = pthread_create(&threads[i], NULL, 
> > > > > > > > do_things, NULL);
> > > > > > > >                         if (s != 0)
> > > > > > > >                                 fprintf(stderr, "Error creating 
> > > > > > > > thread");
> > > > > > > >                 }
> > > > > > > >                 for (i = 0; i < NUM_THREADS; i++) {
> > > > > > > >                         pthread_join(threads[i], NULL);
> > > > > > > >                 }
> > > > > > > >         }
> > > > > > > >         return 0;
> > > > > > > > }
> > > > > > > > 
> > > > > > > > Cheers,
> > > > > > > > Uli
> > > > > > > > -- 
> > > > > > > > This can be a, a little complicated. Listen, my advice is... ask
> > > > > > > > somebody else for advice, at least someone who's... got more 
> > > > > > > > experience
> > > > > > > > at...  giving advice.
> > > > > > > > 
> > > > > > > 
> > > > > > > Hoi,
> > > > > > > 
> > > > > > > the diff (which is certainly wip) below fixes the asr related 
> > > > > > > leaks.
> > > > > > > There's still a leak on creating/destroying threads, in 
> > > > > > > particular the
> > > > > > > stacks do not seem to be unmapped.
> > > > > > 
> > > > > > That is (somewhat) expected.  Stacks are recycled if they are
> > > > > > allocated using "default" parameters; see _rthread_free_stack().
> > > > > 
> > > > > yes, agreed, they do seem to be reused. I was looking with valgrind,
> > > > > and this seem to inrtroduce a bug where the stack reuse is effectively
> > > > > disabled. So this observation was a fluke.
> > > > > 
> > > > > Still seeing growing mem usage when I create and destroy threads,
> > > > > though. Strange thing is that while top show increasing SIZE and RES,
> > > > > ktrace shows no syscall that do allocations, just repeating
> > > > > 
> > > > >  67230/276969  a.out    RET   __tfork 445287/0x6cb67
> > > > >  67230/614016  a.out    RET   futex 0
> > > > >  67230/521766  a.out    CALL  __threxit(0xfe46b76aa30)
> > > > >  67230/276969  a.out    CALL  __tfork(0x7f7ffffc5e80,24)
> > > > >  67230/614016  a.out    CALL  __threxit(0xfe424f44230)
> > > > >  67230/276969  a.out    STRU  struct __tfork { tcb=0xfe424f44400,
> > > > > tid=0xfe424f44430, stack=0xfe3eed35630 }
> > > > >  67230/445287  a.out    RET   __tfork 0
> > > > >  67230/276969  a.out    RET   __tfork 118126/0x1cd6e
> > > > >  67230/445287  a.out    CALL  
> > > > > futex(0xfe3fa8d7448,0x2<FUTEX_WAKE>,1,0,0)
> > > > >  67230/276969  a.out    CALL  __tfork(0x7f7ffffc5e80,24)
> > > > >  67230/118126  a.out    RET   __tfork 0
> > > > >  67230/445287  a.out    RET   futex 0
> > > > >  67230/276969  a.out    STRU  struct __tfork { tcb=0xfe3fa8d7200,
> > > > > tid=0xfe3fa8d7230, stack=0xfe403a719b0 }
> > > > >  67230/118126  a.out    CALL  
> > > > > futex(0xfe424f44448,0x2<FUTEX_WAKE>,1,0,0)
> > > > >  67230/445287  a.out    CALL  __threxit(0xfe3fa8d7430)
> > > > >  67230/276969  a.out    RET   __tfork 456926/0x6f8de
> > > > > 
> > > > > 
> > > > > Could this indicate that one of these syscalls introduces a mem leak
> > > > > in the process?
> > > > 
> > > > Using procmap I'm seeing single page anon regions being added to the 
> > > > process.
> > > > A suspect would be uvm_uarea_alloc() in thread_fork.
> > > 
> > > Added some debug code and uvm_uarea_free() is called on the thread (in
> > > the repaer) shortly after the thread exists. So that part seems to be
> > > ok.  Puzzled what other mapping could show up in the process when
> > > creating/destroying threads.
> > > 
> > >   -Otto
> > > 
> > 
> > Sorry, I misdiagnosed, the process itself is doing the occasioal mmap
> > itself.  It happens every +/- 1000 iterations when runinng the
> > noop tets program with two threads. Now I need to find out where that
> > mmap call is coming form.
> > 
> >     -Otto
> > 
> 
> The extra mmap calls were coming from ld.so (that's why I did not spot
> in form gdb).  After some head scratching it turns out to be a stupid
> thing in ld.so/malloc.c
> 
> Adding this to the already posted diff makes the leak go away and the
> noop threading test runs in constant memory.
> 
> I introduced this error myself in ld.so/malloc.c rev 1.24.

Thank you for finding this one. This also solves a mystery that has been
driving me up the wall for many weeks now.

ok tb

> 
>       -Otto
> 
> Index: malloc.c
> ===================================================================
> RCS file: /cvs/src/libexec/ld.so/malloc.c,v
> retrieving revision 1.30
> diff -u -p -U10 -r1.30 malloc.c
> --- malloc.c  30 Sep 2019 03:35:09 -0000      1.30
> +++ malloc.c  26 Dec 2020 12:32:31 -0000
> @@ -575,23 +575,20 @@ omalloc_make_chunks(struct dir_info *d, 
>       if (pp == MAP_FAILED)
>               return NULL;
>  
>       bp = alloc_chunk_info(d, bits);
>       if (bp == NULL)
>               goto err;
>       /* memory protect the page allocated in the malloc(0) case */
>       if (bits == 0 && _dl_mprotect(pp, MALLOC_PAGESIZE, PROT_NONE) < 0)
>               goto err;
>  
> -     bp = alloc_chunk_info(d, bits);
> -     if (bp == NULL)
> -             goto err;
>       bp->page = pp;
>  
>       if (insert(d, (void *)((uintptr_t)pp | (bits + 1)), (uintptr_t)bp))
>               goto err;
>       LIST_INSERT_HEAD(&d->chunk_dir[bits][listnum], bp, entries);
>       return bp;
>  
>  err:
>       unmap(d, pp, MALLOC_PAGESIZE, MALLOC_JUNK);
>       return NULL;
> 

Reply via email to