I'll look at it at the weekend, unless someone else can fix it before then.

-g
On 10.05.17 at 16:00, David Osborne wrote:
 Increasing waittimeout doesn't seem to have any effect on this problem.

I have backtraces of all threads at the point of the hang here:
https://gist.github.com/davidqc/ebee38528b0a40a0b8d028981ad933e6

I think thread 19 is the culprit:

Thread 19 (Thread 0x7fffaaffd700 (LWP 17652)):
#0  0x00007ffff6322b89 in __libc_waitpid (pid=pid@entry=17651, stat_loc=stat_loc@entry=0x7fffaaffcde4, options=options@entry=0) at ../sysdeps/unix/sysv/linux/waitpid.c:40
#1  0x00007ffff7b5aa4c in Ns_WaitForProcess (pid=17651, exitcodePtr=0x0) at exec.c:178
#2  0x00007ffff1b68615 in ReaperThread (UNUSED_arg=0x44f3) at nsproxylib.c:2935
#3  0x00007ffff74b886d in NsThreadMain (arg=<optimized out>) at thread.c:232
#4  0x00007ffff74b98a9 in ThreadMain (arg=<optimized out>) at pthread.c:830
#5  0x00007ffff5e500a4 in start_thread (arg=0x7fffaaffd700) at pthread_create.c:309
#6  0x00007ffff635162d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
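
A note on frame #0: waitpid() is called there with options=0, which blocks until pid 17651 changes state. If that child never exits, the calling thread never returns. For illustration only, here is a minimal sketch of a bounded wait that polls with WNOHANG instead, assuming POSIX. wait_child_bounded is a hypothetical helper name; it is not a NaviServer function and not a proposed patch.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of NaviServer): wait for a child process
 * for at most timeout_ms milliseconds.
 *
 * Returns  1 if the child was reaped (*status is filled in),
 *          0 if the child is still running when the timeout expires,
 *         -1 on error (for example ECHILD).
 */
static int
wait_child_bounded(pid_t pid, int timeout_ms, int *status)
{
    const struct timespec step = { 0, 10 * 1000 * 1000 }; /* poll every 10 ms */
    int waited_ms = 0;

    assert(status != NULL);

    for (;;) {
        /* WNOHANG makes waitpid() return 0 immediately if the child is alive. */
        pid_t rc = waitpid(pid, status, WNOHANG);

        if (rc == pid) {
            return 1;   /* child changed state and was reaped */
        }
        if (rc == -1 && errno != EINTR) {
            return -1;  /* real error, e.g. not our child */
        }
        if (waited_ms >= timeout_ms) {
            return 0;   /* still running; the caller decides what to do next */
        }
        nanosleep(&step, NULL);
        waited_ms += 10;
    }
}
```

With something like this, a caller that gets 0 back could log the stuck pid and escalate (for example with a signal) rather than block indefinitely. Whether that is the right behaviour for the reaper is a separate question.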

On 10 May 2017 at 13:04, Gustaf Neumann <neum...@wu.ac.at> wrote:

    We don't see such hangs either. Does increasing the waittimeout
    solve this issue?
    ... not as a fix, but to narrow the problem down.
    -gn

    On 10.05.17 at 13:38, David Osborne wrote:
    The manifestation for us in production is quite insidious, and
    the root cause is difficult to spot. It is certainly not easy to
    see in the logs.

    One place we are experiencing it is in code which generates
    outgoing emails. Sometimes, when the outgoing email is
    particularly large, nsproxy times out while using an external
    utility to convert (the large) html to text. This never seems
    like a big issue at the time.

    But it's the NEXT task which calls nsproxy which will hang
    forever without error. The system then just slowly grinds to a
    halt until naviserver is restarted.

    nscp 2> ns_proxy configure exec
    -env {} -exec /usr/lib/naviserver/bin/nsproxy -init {} -reinit {}
    -maxslaves 8 -maxruns 0 -gettimeout 0 -evaltimeout 0 -sendtimeout
    5000 -recvtimeout 5000 -waittimeout 1000 -idletimeout 300000

    As I mentioned, when I watched the hang in gdb, the hung threads
    seemed to be waiting in waitpid() (via Ns_WaitForProcess).


    
------------------------------------------------------------------------------
    Check out the vibrant tech community on one of the world's most
    engaging tech sites, Slashdot.org! http://sdm.link/slashdot
    _______________________________________________
    naviserver-devel mailing list
    naviserver-devel@lists.sourceforge.net
    https://lists.sourceforge.net/lists/listinfo/naviserver-devel




--
David Osborne
Qcode Software Limited
http://www.qcode.co.uk
T: +44 (0)1463 896484





--
Univ.Prof. Dr. Gustaf Neumann
WU Vienna
Institute of Information Systems and New Media
Welthandelsplatz 1, A-1020 Vienna, Austria

