Increasing waittimeout doesn't seem to have any effect on this problem.

I have backtraces of all threads at the point of the hang here:
https://gist.github.com/davidqc/ebee38528b0a40a0b8d028981ad933e6

I think Thread 19 is the culprit:

Thread 19 (Thread 0x7fffaaffd700 (LWP 17652)):
#0  0x00007ffff6322b89 in __libc_waitpid (pid=pid@entry=17651, stat_loc=stat_loc@entry=0x7fffaaffcde4, options=options@entry=0) at ../sysdeps/unix/sysv/linux/waitpid.c:40
#1  0x00007ffff7b5aa4c in Ns_WaitForProcess (pid=17651, exitcodePtr=0x0) at exec.c:178
#2  0x00007ffff1b68615 in ReaperThread (UNUSED_arg=0x44f3) at nsproxylib.c:2935
#3  0x00007ffff74b886d in NsThreadMain (arg=<optimized out>) at thread.c:232
#4  0x00007ffff74b98a9 in ThreadMain (arg=<optimized out>) at pthread.c:830
#5  0x00007ffff5e500a4 in start_thread (arg=0x7fffaaffd700) at pthread_create.c:309
#6  0x00007ffff635162d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
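
Note that frame #0 shows waitpid() being called with options=0, which
blocks until the child exits. So if the proxy process (pid 17651) never
exits, the reaper thread never returns. Purely as an illustration (this is
not NaviServer's code; wait_with_timeout is a hypothetical helper and the
10 ms poll interval is an arbitrary assumption), a bounded wait could look
roughly like this in C:

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>

/*
 * Hypothetical helper, not part of NaviServer: wait up to timeout_ms for
 * pid to exit. Returns 0 if the child exited on its own within the
 * timeout, 1 if it had to be killed, and -1 on error.
 */
static int
wait_with_timeout(pid_t pid, int timeout_ms, int *status)
{
    const struct timespec step = { 0, 10 * 1000 * 1000 };  /* 10 ms */
    int waited_ms = 0;

    assert(timeout_ms >= 0);

    for (;;) {
        /* With WNOHANG, waitpid() returns 0 at once if the child is
         * still running, instead of blocking as it does with options=0. */
        pid_t r = waitpid(pid, status, WNOHANG);

        if (r == pid) {
            return 0;
        }
        if (r == -1 && errno != EINTR) {
            return -1;
        }
        if (waited_ms >= timeout_ms) {
            break;
        }
        nanosleep(&step, NULL);
        waited_ms += 10;
    }

    /* Timed out: SIGKILL cannot be caught or ignored, so the blocking
     * wait below should normally return promptly. */
    if (kill(pid, SIGKILL) == -1) {
        return -1;
    }
    while (waitpid(pid, status, 0) == -1) {
        if (errno != EINTR) {
            return -1;
        }
    }
    return 1;
}
```

A call such as wait_with_timeout(pid, 1000, NULL) would then give up after
about a second, kill the child and reap it, rather than blocking
indefinitely.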


On 10 May 2017 at 13:04, Gustaf Neumann <neum...@wu.ac.at> wrote:

> We don't see such hangs either. Does increasing the waittimeout solve this
> issue?
> ... not as a fix, but to narrow the problem down.
> -gn
>
> On 10.05.17 at 13:38, David Osborne wrote:
>
> The manifestation for us in production is quite insidious, and the root
> cause is difficult to spot. It is certainly not easy to see in the logs.
>
> One place we are experiencing it is in code which generates outgoing
> emails. Sometimes, when the outgoing email is particularly large, nsproxy
> times out while using an external utility to convert the (large) HTML to
> text. This never seems like a big issue at the time.
>
> But it's the NEXT task which calls nsproxy which will hang forever without
> error. The system then slowly grinds to a halt until naviserver is
> restarted.
>
> nscp 2> ns_proxy configure exec
> -env {} -exec /usr/lib/naviserver/bin/nsproxy -init {} -reinit {}
> -maxslaves 8 -maxruns 0 -gettimeout 0 -evaltimeout 0 -sendtimeout 5000
> -recvtimeout 5000 -waittimeout 1000 -idletimeout 300000
>
> As I mentioned, when I watched the hang in gdb, the hung threads seemed to
> be waiting in waitpid() (called via Ns_WaitForProcess).


-- 
David Osborne
Qcode Software Limited
http://www.qcode.co.uk
T: +44 (0)1463 896484
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
naviserver-devel mailing list
naviserver-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/naviserver-devel