Re: [naviserver-devel] ns_cond is broken on Naviserver

2014-10-18 Thread Gustaf Neumann
On 18.10.14 11:34, Andrew Piskorski wrote:
> I tried to bisect, but getting old versions of Naviserver to actually
> build and run tests is extraordinarily frustrating.  I gave up.
Dear Andrew, I have fixed the problem in the head branch. Your test is
working for me, at least under Mac OS X and Linux.

-g

--
Comprehensive Server Monitoring with Site24x7.
Monitor 10 servers for $9/Month.
Get alerted through email, SMS, voice calls or mobile push notifications.
Take corrective actions from your mobile device.
http://p.sf.net/sfu/Zoho
___
naviserver-devel mailing list
naviserver-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/naviserver-devel


Re: [naviserver-devel] sometimes crashes on shutdown

2014-10-18 Thread Gustaf Neumann
On 18.10.14 11:45, Andrew Piskorski wrote:
> Btw, simply running all the tests with "make test" sometimes crashes
> Naviserver while it is shutting down; see below.
Dear Andrew,

this is a known problem (see e.g. [1]). It is not nice, but not serious,
since it happens very late in the shutdown. My understanding is that it is
caused by varying deletion orders during shutdown, probably influenced by
the traversal order in hash tables. It is hard to debug, and fixing it
would be of little value for practical applications.

-g

[1] https://www.mail-archive.com/naviserver-devel@lists.sourceforge.net/msg02948.html





[naviserver-devel] sometimes crashes on shutdown

2014-10-18 Thread Andrew Piskorski
Btw, simply running all the tests with "make test" sometimes crashes
Naviserver while it is shutting down; see below.

Or just running this one test (which is fast) several times works
pretty well for triggering a crash.  Other tests do it too though, not
just this one:

  make test TCLTESTARGS='-file ns_nsv.test'
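
Since the crash is intermittent, the repeated runs could be scripted. The
following is only a minimal sketch: the helper name repeat_until_fail and
the run count are hypothetical and not part of the Naviserver build. It
reruns a command until it first exits with a non-zero status.

```shell
# Hypothetical helper (not part of Naviserver): run a command up to N
# times and stop at the first run that exits with a non-zero status.
repeat_until_fail() {
    max=$1
    shift
    i=1
    while [ "$i" -le "$max" ]; do
        status=0
        "$@" || status=$?          # capture the exit status of this run
        if [ "$status" -ne 0 ]; then
            echo "run $i failed with exit status $status"
            return "$status"
        fi
        i=$((i + 1))
    done
    echo "no failure in $max runs"
    return 0
}
```

Assuming it is invoked from the top of the source tree, a call might look
like this (20 is an arbitrary choice):

  repeat_until_fail 20 make test TCLTESTARGS='-file ns_nsv.test'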


[18/Oct/2014:05:27:59][18593.2ac57b648240][-main-] Notice: nsmain: NaviServer/4.99.6 stopping
[18/Oct/2014:05:27:59][18593.2ac57b648240][-main-] Notice: [driver:nssock]: stopping
[18/Oct/2014:05:27:59][18593.2ac57b648240][-main-] Notice: server [test]: stopping
[18/Oct/2014:05:27:59][18593.2ac57b648240][-main-] Notice: server [testvhost]: stopping
[18/Oct/2014:05:27:59][18593.2ac57b648240][-main-] Notice: server [testvhost2]: stopping
[18/Oct/2014:05:27:59][18593.2ac5a4200700][-driver:nssock-] Notice: exiting
[18/Oct/2014:05:27:59][18593.2ac588c10700][-conn:test:1] Notice: exiting: shutdown pending
[18/Oct/2014:05:27:59][18593.2ac588e11700][-conn:testvhost:0] Notice: exiting: shutdown pending
[18/Oct/2014:05:27:59][18593.2ac589012700][-conn:testvhost2:0] Notice: exiting: shutdown pending
[18/Oct/2014:05:27:59][18593.2ac588a0f700][-conn:test:0] Notice: exiting: shutdown pending
[18/Oct/2014:05:27:59][18593.2ac58880e700][-conn:test:emergency:0] Notice: exiting: shutdown pending
called Tcl_FindHashEntry on deleted table
Aborted (core dumped)
make: *** [test] Error 134


Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `./nsd/nsd -u root -c -d -t /usr/local/web/ns-fork/nv-tcl85-linux/tests/test.nsc'.
Program terminated with signal 6, Aborted.
(gdb) bt
#0  0x2ac57a22d425 in __GI_raise (sig=)
at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x2ac57a230b8b in __GI_abort () at abort.c:91
#2  0x2ac57acb634e in Tcl_PanicVA () from /usr/lib/libtcl8.5.so.0
#3  0x2ac57acb640c in Tcl_Panic () from /usr/lib/libtcl8.5.so.0
#4  0x2ac57ac8e079 in ?? () from /usr/lib/libtcl8.5.so.0
#5  0x2ac57acd12fa in ?? () from /usr/lib/libtcl8.5.so.0
#6  0x2ac57accff5e in Tcl_GetThreadData () from /usr/lib/libtcl8.5.so.0
#7  0x2ac57acb3cfa in TclFreeObj () from /usr/lib/libtcl8.5.so.0
#8  0x2ac57acb5d45 in ?? () from /usr/lib/libtcl8.5.so.0
#9  0x2ac57ac8db82 in Tcl_DeleteHashTable () from /usr/lib/libtcl8.5.so.0
#10 0x2ac57ac730b8 in ?? () from /usr/lib/libtcl8.5.so.0
#11 0x2ac57acb3d59 in TclFreeObj () from /usr/lib/libtcl8.5.so.0
#12 0x2ac57acb222d in ?? () from /usr/lib/libtcl8.5.so.0
#13 0x2ac57ac3b995 in Tcl_DeleteCommandFromToken ()
   from /usr/lib/libtcl8.5.so.0
#14 0x2ac57acac0dc in TclTeardownNamespace () from /usr/lib/libtcl8.5.so.0
#15 0x2ac57ac3a3dc in ?? () from /usr/lib/libtcl8.5.so.0
#16 0x2ac579f9062e in DeleteInterps (arg=0x20be470) at tclinit.c:1861
#17 0x2ac57aa0c814 in NsCleanupTls (slots=0x31ac220) at tls.c:186
#18 0x2ac57aa0d16a in CleanupTls (arg=) at pthread.c:816
#19 0x2ac57b135c83 in __nptl_deallocate_tsd () at pthread_create.c:156
#20 0x2ac57b135ea8 in start_thread (arg=0x2ac58880e700)
at pthread_create.c:315
#21 0x2ac57a2eb3fd in clone ()
at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#22 0x in ?? ()
(gdb) 

-- 
Andrew Piskorski 



Re: [naviserver-devel] ns_cond is broken on Naviserver

2014-10-18 Thread Andrew Piskorski
I tried to bisect, but getting old versions of Naviserver to actually
build and run tests is extraordinarily frustrating.  I gave up.

I see various places where Naviserver itself uses the Ns_Cond* C
functions, which makes me think they likely work.  Also, the Unix and
Windows C implementations are in two different files.  Since the
problems I see with Tcl ns_cond are the same on Linux and Windows,
perhaps the problem is in nsd/tclthread.c, which is cross-platform.

Most of NsTclCondObjCmd() dates from 2005-10-21, when Stephen Deasey
pretty much completely rewrote it.  Maybe that's a likely place for
bugs?  I wasn't able to build or test the code from around then
though, so it's just an arbitrary guess on my part.

-- 
Andrew Piskorski 



[naviserver-devel] ns_cond is broken on Naviserver

2014-10-18 Thread Andrew Piskorski
I added a test of correct ns_cond behavior here:
  https://bitbucket.org/apiskors/naviserver/commits/2ed58b6a9c0567fe7cbee1080e30274b2aba64be

Which adds these two new files:
  tests/ns_cond.test
  tests/testserver/modules/ns_cond.tcl 

To run the test on Linux, just do:
  make test TCLTESTARGS='-file ns_cond.test' 

As you can see from the log output below, ns_cond works fine on
AOLserver, but not on Naviserver.  On Naviserver, once the worker
thread calls "ns_cond wait" the first time, it just never wakes up.
It's as if the cond signal just never makes it here.  This was run on
Linux; behavior on Windows is the same.


# BUG on Naviserver, the worker never wakes up:
ns_cond.test
[18/Oct/2014:02:34:50][23566.2aadb6049700][-command-] Notice: tst_cond_master: New thread '' started for running tst_cond_worker.
[18/Oct/2014:02:34:50][23566.2aae0c401700][-thread:2aae0c401700-] Notice: tst_cond_worker: 2 work items:  0 1
[18/Oct/2014:02:34:54][23566.2aadb6049700][-command-] Notice: tst_cond_master: 2 work items done:  0 1
[18/Oct/2014:02:34:54][23566.2aadb6049700][-command-] Notice: tst_cond_master: 3 work items NOT done:  2 3 4
 ns_cond-1.1 Master uses ns_cond to wake up a worker thread. FAILED

# Correct behavior on AOLserver 4.0.10:
[18/Oct/2014:02:32:38][16918.18446744071699884032][-conn:outpost-dev::0] Notice: tst_cond_master: New thread '' started for running tst_cond_worker.
[18/Oct/2014:02:32:38][16918.18446744071622727680][-thread-2086823936-] Notice: tst_cond_worker: 2 work items:  0 1
[18/Oct/2014:02:32:39][16918.18446744071622727680][-thread-2086823936-] Notice: tst_cond_worker: Event 'eid0x7fb97ca16600' - got it.
[18/Oct/2014:02:32:39][16918.18446744071622727680][-thread-2086823936-] Notice: tst_cond_worker: 1 work items:  2
[18/Oct/2014:02:32:40][16918.18446744071622727680][-thread-2086823936-] Notice: tst_cond_worker: Event 'eid0x7fb97ca16600' - got it.
[18/Oct/2014:02:32:40][16918.18446744071622727680][-thread-2086823936-] Notice: tst_cond_worker: 1 work items:  3
[18/Oct/2014:02:32:41][16918.18446744071622727680][-thread-2086823936-] Notice: tst_cond_worker: Event 'eid0x7fb97ca16600' - got it.
[18/Oct/2014:02:32:41][16918.18446744071622727680][-thread-2086823936-] Notice: tst_cond_worker: 1 work items:  4
[18/Oct/2014:02:32:42][16918.18446744071699884032][-conn:outpost-dev::0] Notice: tst_cond_master: 5 work items done:  0 1 2 3 4
[18/Oct/2014:02:32:42][16918.18446744071699884032][-conn:outpost-dev::0] Notice: tst_cond_master: 0 work items NOT done:
[18/Oct/2014:02:32:42][16918.18446744071622727680][-thread-2086823936-] Notice: tst_cond_worker: Event 'eid0x7fb97ca16600' - got it.
[18/Oct/2014:02:32:42][16918.18446744071622727680][-thread-2086823936-] Notice: tst_cond_worker: 0 work items:
[18/Oct/2014:02:32:42][16918.18446744071622727680][-thread-2086823936-] Notice: tst_cond_worker: No more work for me today.

-- 
Andrew Piskorski 
