Re: [naviserver-devel] scheduler thread getting stuck

2020-06-16 Thread Andrew Piskorski
On Sun, Jun 14, 2020 at 02:44:35PM +0200, Gustaf Neumann wrote:

> i fixed two more bugs for win64 (see [1]). The most complex
> case was handling thread results (threads returning string values).

Wow, great news Gustaf, thank you very much.

> @Andrew: are there still show-stoppers for you,
> which have to be fixed urgently?

No, everything for my application is working well now!  I've been
running your 2020-06-07 time_t Ns_Time fix for a week, that completely
fixed the problem with the scheduler thread getting stuck.  I just
recently upgraded to your 06-14 fixes as well.

The other Windows regression test failures don't seem to affect me,
but I'll still put some time into some of them if/when I come up with any
better ideas for how to debug them.  And once Ibrahim Tannir gets his
nsproxy fixes ready, I can certainly try them.

-- 
Andrew Piskorski 


___
naviserver-devel mailing list
naviserver-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/naviserver-devel


Re: [naviserver-devel] scheduler thread getting stuck

2020-06-14 Thread Gustaf Neumann

Dear all,

i fixed two more bugs for win64 (see [1]). The most complex
case was handling thread results (threads returning string values).
While pthread_exit() receives a pointer value (64 bit), the native
Windows counterpart _endthreadex() receives just a 32 bit value
(on both win32 and win64). Since the received result is used
for setting the result-obj,  the value truncation caused many
bad things to happen. This could never have worked with
win64 before.
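
To make the truncation concrete, here is a minimal portable sketch. It
does not call pthread_exit() or _endthreadex(); the helper names are
made up for illustration, and a plain 64-bit number stands in for the
pointer-sized thread result. It only models the loss of the upper 32
bits, not how the fix in [1] works.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not NaviServer code: model what happens when a
 * pointer-sized thread result is squeezed through a 32-bit exit code,
 * which is what the native Windows thread-exit call carries. */
static uint32_t to_exit_code32(uint64_t thread_result)
{
    /* Conversion to an unsigned type keeps only the low 32 bits. */
    return (uint32_t)thread_result;
}

/* Return 1 if the value comes back unchanged after the round trip
 * through a 32-bit exit code, 0 if the upper bits were lost. */
static int survives_exit_code32(uint64_t thread_result)
{
    return (uint64_t)to_exit_code32(thread_result) == thread_result;
}

/* Self-check with made-up values; the second "pointer" is an arbitrary
 * 64-bit number chosen so that its upper half is non-zero. */
static void exit_code_examples(void)
{
    assert(survives_exit_code32(UINT64_C(0x0000000012345678)));
    assert(!survives_exit_code32(UINT64_C(0x000007FE12345678)));
    assert(to_exit_code32(UINT64_C(0x000007FE12345678))
           == UINT32_C(0x12345678));
}
```

Any result whose upper half is non-zero comes back as a different value,
which is consistent with the bad result-obj contents described above.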

Now all the tests of ns_thread.test should work correctly.

@Andrew: are there still show-stoppers for you,
which have to be fixed urgently?

-gn

[1] 
https://bitbucket.org/naviserver/naviserver/commits/9c48894ae8e433aa4dfbe5473e9553f796ec24bd


On 08.06.20 17:46, Gustaf Neumann wrote:

> > No change to the other failing tests, nor to the ones that we're
> > currently skipping with the notWin32 constraint.
> > E.g., test ns_thread-2.6 still triggers this:
> >    Assertion failed: tid != NULL, file tclthread.c, line 238
>
> i am not surprised, since i have not changed anything around this.






Re: [naviserver-devel] scheduler thread getting stuck

2020-06-08 Thread Gustaf Neumann

On 08.06.20 19:39, Andrew Piskorski wrote:

> On Windows there are still a few compiler warnings that look a little
> suspicious (below), but I don't see any good way to fix these.

it is not hard to silence these cases (at least one of these appeared
multiple times on stackoverflow), but these are not related to the
errors you have reported.


i hope, the next weekend, i can get a better PC for continuing on this.

-g





Re: [naviserver-devel] scheduler thread getting stuck

2020-06-08 Thread Andrew Piskorski
On Mon, Jun 08, 2020 at 05:46:54PM +0200, Gustaf Neumann wrote:
> >Assertion failed: tid != NULL, file tclthread.c, line 238

> You might check whether "ns_thread handle"
> in a classical setup (e.g. in a ds/shell) throws the same exception.

Good idea.  I started up NaviServer with the same test.nscfg config
file, but using the installed binaries instead of the "nmake -f
Makefile.win32 _test" approach.  Then I typed "ns_thread handle" at
the control port prompt.

That threw the exact same exception as before.  Under WinDbg it also
looks the same, inside Ns_ThreadSelf() wPtr appears to be defined, but
threadPtr and wPtr->self are null.

-- 
Andrew Piskorski 




Re: [naviserver-devel] scheduler thread getting stuck

2020-06-08 Thread Andrew Piskorski
On Windows there are still a few compiler warnings that look a little
suspicious (below), but I don't see any good way to fix these.


cl /W3 /nologo /c /EHsc /MDd /Od /Zi /RTC1 /I "..\include" /I 
"C:\P\OpenSSL-Win64\include"  /I "C:\P\Tcl-64-8.6\include" /D "_WINDOWS" /D 
"TCL_THREADS=1"  /D "FD_SETSIZE=128" /D "_MBCS"  /D _CRT_SECURE_NO_WARNINGS /D 
_CRT_SECURE_NO_DEPRECATE /D "_DEBUG" /c /Foexec.o exec.c
exec.c(154): warning C4312: 'type cast': conversion from 'pid_t' to 'HANDLE' of 
greater size
exec.c(371): warning C4311: 'type cast': pointer truncation from 'HANDLE' to 
'pid_t'

cl /W3 /nologo /c /EHsc /MDd /Od /Zi /RTC1 /I "..\include" /I 
"C:\P\OpenSSL-Win64\include"  /I "C:\P\Tcl-64-8.6\include" /D "_WINDOWS" /D 
"TCL_THREADS=1"  /D "FD_SETSIZE=128" /D "_MBCS"  /D _CRT_SECURE_NO_WARNINGS /D 
_CRT_SECURE_NO_DEPRECATE /D "_DEBUG" /c /Fotls.o tls.c
tls.c(228): warning C4244: 'function': conversion from 'SOCKET' to 'int', 
possible loss of data
tls.c(376): warning C4244: 'function': conversion from 'SOCKET' to 'int', 
possible loss of data

cl /W3 /nologo /c /EHsc /MDd /Od /Zi /RTC1 /I "..\include" /I 
"C:\P\OpenSSL-Win64\include"  /I "C:\P\Tcl-64-8.6\include" /D "_WINDOWS" /D 
"TCL_THREADS=1"  /D "FD_SETSIZE=128" /D "_MBCS"  /D _CRT_SECURE_NO_WARNINGS /D 
_CRT_SECURE_NO_DEPRECATE /D "_DEBUG" /c /Fotclcrypto.o tclcrypto.c
tclcrypto.c(592): warning C4090: 'initializing': different 'const' qualifiers
tclcrypto.c(656): warning C4090: 'initializing': different 'const' qualifiers
tclcrypto.c(711): warning C4090: 'initializing': different 'const' qualifiers
tclcrypto.c(822): warning C4090: 'initializing': different 'const' qualifiers
tclcrypto.c(955): warning C4090: 'initializing': different 'const' qualifiers
tclcrypto.c(1011): warning C4090: 'initializing': different 'const' qualifiers
tclcrypto.c(1068): warning C4090: 'initializing': different 'const' qualifiers
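
For context on the C4244/C4311 warnings above: as far as I know, SOCKET
and HANDLE are pointer-sized on 64-bit Windows, while int and pid_t are
narrower there, so the compiler flags the conversion. One generic way to
make such a narrowing explicit is a range-checked helper. The sketch
below is portable C with a hypothetical name (narrow_to_int); it is not
part of NaviServer or of the Windows API, and it is not a claim about
how these particular warnings should be fixed.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of NaviServer or the Windows API:
 * convert a 64-bit handle-like value to int only when no information
 * is lost. Returns 1 and stores the value on success; returns 0 and
 * leaves *out untouched when the value does not fit. */
static int narrow_to_int(uint64_t wide, int *out)
{
    assert(out != NULL);

    if (wide > (uint64_t)INT_MAX) {
        return 0;               /* would lose data: report failure */
    }
    *out = (int)wide;
    return 1;
}
```

A caller can then decide what to do when the value does not fit, rather
than silently continuing with a truncated descriptor.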

-- 
Andrew Piskorski 




Re: [naviserver-devel] scheduler thread getting stuck

2020-06-08 Thread Gustaf Neumann

On 08.06.20 16:32, Andrew Piskorski wrote:

> On Mon, Jun 08, 2020 at 12:04:59PM +0200, Gustaf Neumann wrote:
>
> > So, i have modified the code to use "time_t" for the "sec" member,
> > ... and many of the warnings disappeared.
>
> That's a big improvement, thank you, Gustaf!  The 22 regression tests
> below used to fail, but now pass!

good news!

> No change to the other failing tests, nor to the ones that we're
> currently skipping with the notWin32 constraint.
> E.g., test ns_thread-2.6 still triggers this:
>    Assertion failed: tid != NULL, file tclthread.c, line 238

i am not surprised, since i have not changed anything around this.
The problem might have to do with the different way the tests are
set up. You might check whether "ns_thread handle"
in a classical setup (e.g. in a ds/shell) throws the same exception.

-gn





Re: [naviserver-devel] scheduler thread getting stuck

2020-06-08 Thread Andrew Piskorski
On Mon, Jun 08, 2020 at 12:04:59PM +0200, Gustaf Neumann wrote:

> So, i have modified the code to use "time_t" for the "sec" member,
> ... and many of the warnings disappeared.

That's a big improvement, thank you, Gustaf!  The 22 regression tests
below used to fail, but now pass!

No change to the other failing tests, nor to the ones that we're
currently skipping with the notWin32 constraint.
E.g., test ns_thread-2.6 still triggers this:
  Assertion failed: tid != NULL, file tclthread.c, line 238


## Gustaf's 2020-06-07 changes fixed these test failures:
 ns_schedule-2.1 schedule proc: interval FAILED
 ns_time-1.2ms ns_time incr timeunit float+ms int FAILED
 ns_time-1.3ms ns_time incr timeunit int+ms int FAILED
 ns_time-1.3µs ns_time incr timeunit int+ms int FAILED
 ns_time-1.4-100ms ns_time incr timeunit 100ms int FAILED
 ns_time-1.4-10ms ns_time incr timeunit 10ms int FAILED
 ns_time-1.4-1ms ns_time incr timeunit 1ms int FAILED
 ns_time-1.4-0.1ms ns_time incr timeunit 0.1ms int FAILED
 ns_time-1.4-0.01ms ns_time incr timeunit 0.01ms int FAILED
 ns_time-1.4-0.001ms ns_time incr timeunit 0.001ms int FAILED
 ns_time-format-1.2 ns_time format positive microsecond FAILED
 ns_time-format-2.1 ns_time format negative second FAILED
 ns_time-format-2.2 ns_time format negative second with fraction FAILED
 ns_time-format-2.4 ns_time format negative microsecond FAILED
 ns_time-format-2.4-0.001ms ns_time format negative microsecond FAILED
 ns_time-diff-1 ns_time diff simple FAILED
 ns_time-diff-2 ns_time diff requires adjust FAILED
 ns_time-diff-3 ns_time diff subtract nothing FAILED
 ns_time-diff-4 ns_time diff add 1ms FAILED
 ns_time-diff-5 ns_time diff turn positive to negative FAILED
 ns_time-diff-6 ns_time diff make negative more negative FAILED
 ns_time-diff-9 ns_time diff turn negative to positive FAILED

-- 
Andrew Piskorski 




Re: [naviserver-devel] scheduler thread getting stuck

2020-06-08 Thread Gustaf Neumann

On 04.06.20 17:26, Gustaf Neumann wrote:

> This sounds indeed related to the original problem.
> The test registers a repeating proc (interval 1s),
> but within the time range of 2.5s, it is executed
> only once.
>
> ...
>
> maybe i get on the weekend some access to a win environment.

i could use a windows machine over the weekend, but unfortunately,
this was very limited (windows 7, very small hd).

However, i was able to set everything up to be able to compile
NaviServer with msvc, but i was not able to run the regression tests
(path too long, etc.). When compiling with x64, there were many
warnings concerning the "sec" member in Ns_Time, which is
defined as long. Due to the memory model in windows 64
bit (LLP64) a long is 32 bit there, ... but a time_t (e.g. the result
of time()) is 64 bit. This value is often supplied to the "sec" member.
So, i have modified the code to use "time_t" for the "sec" member,
... and many of the warnings disappeared.

Most other 64bit OS use LP64 (long is 64 bit), where assigning
time_t to long was not an issue.
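
A small sketch of the difference, under stated assumptions: int32_t
models a "long" under LLP64, int64_t models a 64-bit time_t, and the
function names are hypothetical. This is not the Ns_Time code, and it
does not show which value actually went wrong in NaviServer; it only
shows what a 32-bit field does to a 64-bit seconds count.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model, not the real Ns_Time: return the value that
 * would be read back after storing a 64-bit seconds count in a 32-bit
 * signed field, i.e. in a "long" under LLP64. Assumes two's-complement
 * wraparound. */
static int64_t sec_via_long32(int64_t sec)
{
    uint32_t low = (uint32_t)sec;          /* keep the low 32 bits */

    if (low <= (uint32_t)INT32_MAX) {
        return (int64_t)low;
    }
    return (int64_t)low - INT64_C(4294967296);  /* sign bit set */
}

/* Return 1 if a 32-bit "long" can hold the value without change. */
static int fits_long32(int64_t sec)
{
    return sec_via_long32(sec) == sec;
}

/* Self-check with illustrative values. */
static void long32_examples(void)
{
    assert(fits_long32(INT64_C(1591600000)));   /* timestamp, mid-2020 */
    assert(!fits_long32(INT64_C(2147483648)));  /* 2^31 does not fit */
    assert(sec_via_long32(INT64_C(2147483648)) == INT64_C(-2147483648));
}
```

Under LP64 the "long" is 64 bits wide, so the same assignment keeps the
value intact, which matches the remark above about other 64bit OS.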

This change will not solve all of the issues you are experiencing,
but it might improve the situation for a few.

Background: The problem with LLP64 and using "long" for
sec is not new, many of the operations on Ns_Time were most
likely never working correctly under win64. But they started to
show up as a problem lately, since the newer code relies
more on these functions working correctly (among other things,
in the scheduler).

Hope that these changes helped a little.

all the best

-gn





Re: [naviserver-devel] scheduler thread getting stuck

2020-06-06 Thread Andrew Piskorski
On Thu, Jun 04, 2020 at 04:55:10PM -0400, Andrew Piskorski wrote:
> On Thu, Jun 04, 2020 at 05:26:04PM +0200, Gustaf Neumann wrote:

> > Probably "Ns_ThreadSelf();" does not work under windows (get the
> > id of the current thread). Ns_ThreadSelf() is defined in the OS specific
> > part (winthread.c). The exception is probably coming from
> > test thread-2.3, it looks to me as if the thread (here the
> > thread running the tests) is not properly initiated under windows.

>   Assertion failed: tid != NULL, file tclthread.c, line 238

For debugging, I turned test ns_thread-2.6 back on, and added an
assertion inside Ns_ThreadSelf(), so that it was basically doing this:

void Ns_ThreadSelf(Ns_Thread *threadPtr) {
   WinThread *wPtr = TlsGetValue(tlskey);
   *threadPtr = (Ns_Thread) wPtr->self;
   assert(NULL != *threadPtr);
}

That got it to break into Microsoft's WinDbg debugger there, rather
than later in NsTclThreadObjCmd().  The global "tlskey" seems to be
initialized, and so does the "wPtr" WinThread pointer.  But the
"wPtr->self" looks like it's NULL, there is no Ns_Thread structure
stored there!

I see the wPtr WinThread allocation code in DllMain().  That seems to
be working fine.

The wPtr->self Ns_Thread stuff gets set up in NsCreateThread() and
ThreadMain(), but I don't really understand what that code
is doing.  Is that the place where something is going wrong?

Btw, the tlskey TLS index value looks like it's a 64-bit DWORD
(unsigned integer), not 32-bit.  So I think Nsthreads_LibInit() should
be checking for TLS_OUT_OF_INDEXES, not the 0xFFFFFFFF (decimal
4294967295, the maximum value of a 32-bit DWORD) it's been checking for
since ancient times.  That's probably a small bug, but it's not the
cause of the problems here.


WinDbg output:
--
0:001> .frame 7
07 `0062f150 07fe`ed490aae nsthread!Ns_ThreadSelf+0x89 
[Z:\src\web\ns-fork-pub\naviserver\nsthread\winthread.c @ 848]
0:001> dt wPtr
Local var @ 0x62f170 Type WinThread*
0x`004706d0 
   +0x000 nextPtr  : (null) 
   +0x008 wakeupPtr: (null) 
   +0x010 self : (null) 
   +0x018 event: 0x`015c Void
   +0x020 condwait : 0n0
   +0x028 slots: [100] (null) 
0:001> dt threadPtr
Local var @ 0x62f190 Type Ns_Thread_**
0x`0062f208 
 -> (null) 

0:001> ? tlskey
Evaluate expression: 8791677299988 = 07fe`f8cd6d14
0:001> .formats tlskey
Evaluate expression:
  Hex: 07fe`f8cd6d14
  Decimal: 8791677299988
  Octal:   000177737063266424
  Binary:    0111 1110 1000 11001101 01101101 
00010100
  Chars:   ..m.
  Time:Thu Jan 11 00:12:47.729 1601 (UTC - 4:00)
  Float:   low -3.33323e+034 high 2.86706e-042
  Double:  4.34367e-311
0:001> dt -v tlskey
Got address 07fef8cd6d14 for symbol
nsthread!tlskey
7

0:001> kb
: Call Site
: ucrtbased!issue_debug_notification+0x45 
[minkernel\crts\ucrt\src\appcrt\internal\report_runtime_error.cpp @ 28]
: ucrtbased!__acrt_report_runtime_error+0x13 
[minkernel\crts\ucrt\src\appcrt\internal\report_runtime_error.cpp @ 154]
: ucrtbased!abort+0x1d [minkernel\crts\ucrt\src\appcrt\startup\abort.cpp @ 61]
: ucrtbased!common_assert_to_stderr_direct+0xe5 
[minkernel\crts\ucrt\src\appcrt\startup\assert.cpp @ 161]
: ucrtbased!common_assert_to_stderr+0x27 
[minkernel\crts\ucrt\src\appcrt\startup\assert.cpp @ 179]
: ucrtbased!common_assert+0x68 
[minkernel\crts\ucrt\src\appcrt\startup\assert.cpp @ 420]
: ucrtbased!_wassert+0x2f [minkernel\crts\ucrt\src\appcrt\startup\assert.cpp @ 
444]
: nsthread!Ns_ThreadSelf+0x89 
[Z:\src\web\ns-fork-pub\naviserver\nsthread\winthread.c @ 848]
: libnsd!NsTclThreadObjCmd+0x42e 
[Z:\src\web\ns-fork-pub\naviserver\nsd\tclthread.c @ 238]
: tcl86t!TclNRRunCallbacks+0x63
: tcl86t!Tcl_EvalEx+0x9dd
: tcl86t!Tcl_FSEvalFileEx+0x223
: tcl86t!Tcl_MainEx+0x4be
: libnsd!CmdThread+0x6e [Z:\src\web\ns-fork-pub\naviserver\nsd\nsmain.c @ 1333]
: nsthread!NsThreadMain+0x77 
[Z:\src\web\ns-fork-pub\naviserver\nsthread\thread.c @ 236]
: nsthread!ThreadMain+0x6c 
[Z:\src\web\ns-fork-pub\naviserver\nsthread\winthread.c @ 880]
: ucrtbased!thread_start+0x9c 
[minkernel\crts\ucrt\src\appcrt\startup\thread.cpp @ 97]
: kernel32!BaseThreadInitThunk+0xd
: ntdll!RtlUserThreadStart+0x1d
--

-- 
Andrew Piskorski 




Re: [naviserver-devel] scheduler thread getting stuck

2020-06-05 Thread Andrew Piskorski
On Thu, Jun 04, 2020 at 04:55:10PM -0400, Andrew Piskorski wrote:

> Yes, with your new change, when running ns_thread.test on Windows I
> now always get this:
> 
>   Assertion failed: tid != NULL, file tclthread.c, line 238

A bunch of different tests seem to trigger that assertion failure.
However, it does seem to be the only thing in the tests that causes
crashes, which is good.  In my latest code here, I used the new
"notWin32" tcltest constraint to turn off all the tests that tend to
trigger that assertion:

  https://bitbucket.org/apiskors/naviserver/commits/

That lets me run the rest of the regression tests to completion, with
the summary results below.

Is there someplace I should upload or attach the full test output?
It's about 5k lines and 3 megabytes.


Tests ended at Fri Jun 05 13:32:03 EDT 2020
all.tcl: Total 1569  Passed 1376  Skipped 39  Failed 154
Sourced 70 Test Files.
Files with failing tests: encoding.test http.test http_byteranges.test 
http_chunked.test http_keep.test ns_adp_compress.test ns_base64.test 
ns_driver.test ns_hostbyaddr.test ns_httptime.test ns_info.test ns_log.test 
ns_proxy.test ns_schedule.test ns_time.test ns_urlencode.test tclconnio.test 
tclresp.test
Number of tests skipped for each constraint:
2   binaryMismatch
5   curl
2   knownBug
1   notDarwin
28  notWin32
1   stress

-- 
Andrew Piskorski 




Re: [naviserver-devel] scheduler thread getting stuck

2020-06-05 Thread Andrew Piskorski
On Fri, May 15, 2020 at 10:37:15AM +0200, Gustaf Neumann wrote:
> On 15.05.20 09:21, Andrew Piskorski wrote:

> > Previously on Windows I was running NaviServer code from
> > c. 2019-07 and an ancient Microsoft compiler from 2010; the problem
> > did NOT happen then.

> Can you try with the released version 4.99.17 (2018-11-04)
> with your new Windows environment?

It can be tricky to find an old version of the NaviServer code that
builds correctly on Windows.  I did successfully build these two older
points in the code:

  commit c31d3a0c4ef60b79c542cacbdc66c9cb53428faa
  Author: Gustaf Neumann 
  Date:   2019-06-18 20:43:41 +0200 Tue
  fix prototype of Ns_SockListenCallback in in ns.h (many thanks to Maurizio 
Martignano)

  commit 83e8c50a38a6986f3c0468b69e8ef3abd68f926e
  Author: Gustaf Neumann 
  Date:   2020-01-17 21:03:31 +0100 Fri
  improve spelling

For each of those, first I did a "git checkout VERSION" to the commit
version number above.  Then I copied the latest makefiles and tests on
top of the old code like so:

  cp -p  $NEW_DIR/Makefile.win32 .
  cp -p  $NEW_DIR/include/Makefile.* include/
  cp -pr $NEW_DIR/win32-util .
  cp -pr $NEW_DIR/tests .

With that, those two older codebases both compiled on Windows.
However, when I then ran the latest regression tests, NaviServer
crashed with:

  Run-Time Check Failure #2 - Stack around the variable 'spoolLimit' was 
corrupted.
  (Press Retry to debug the application)

That terminated the tests early, of course.  It looks like about 93
tests passed and 42 failed before the testing NaviServer crashed.
Many of the failed tests did look like the same ones failing on the
latest head code.

The variable spoolLimit only appears in "nsd/tclhttp.c", so presumably
one of the later commits fixed a bug there.  But at that point I gave
up trying to test the older code.

-- 
Andrew Piskorski 




Re: [naviserver-devel] scheduler thread getting stuck

2020-06-04 Thread Andrew Piskorski
On Thu, Jun 04, 2020 at 05:26:04PM +0200, Gustaf Neumann wrote:
> >Assertion failed: (addr != ((void *)0)), file tclobj.c, line 325

> Probably "Ns_ThreadSelf();" does not work under windows (get the
> id of the current thread). Ns_ThreadSelf() is defined in the OS specific
> part (winthread.c). The exception is probably coming from
> test thread-2.3, it looks to me as if the thread (here the
> thread running the tests) is not properly initiated under windows.
> 
> i have added one more assert, to make it easier to pinpoint the
> problem.

Yes, with your new change, when running ns_thread.test on Windows I
now always get this:

  Assertion failed: tid != NULL, file tclthread.c, line 238

-- 
Andrew Piskorski 




Re: [naviserver-devel] scheduler thread getting stuck

2020-06-04 Thread Gustaf Neumann

On 03.06.20 21:13, Andrew Piskorski wrote:

> ns_thread.test
>    [03/Jun/2020:14:25:28][4844.13bc][-tcl-nsthread:7-] Notice: update
> interpreter to epoch 1, trace none, time 0.219973 secs
>    Assertion failed: (addr != ((void *)0)), file tclobj.c, line 325
>    [03/Jun/2020:14:25:32][4844.1dd8][-tcl-nsthread:8-] Notice: update
> interpreter to epoch 1, trace none, time 3.902536 secs
>
> The stack trace looked like this:
>
>    ucrtbased.dll!07feedad41cf()  Unknown
>    libnsd.dll!Ns_TclSetAddrObj(Tcl_Obj * objPtr, const char * type, void *
> addr) Line 325  C
>    libnsd.dll!NsTclThreadObjCmd(void * clientData, Tcl_Interp * interp, int
> objc, Tcl_Obj * const * objv) Line 239 C
>    [External Code]
>    libnsd.dll!CmdThread(void * arg) Line 1333  C
>    nsthread.dll!NsThreadMain(void * arg) Line 236  C
>    nsthread.dll!ThreadMain(void * arg) Line 874  C
>
> So that was inside Ns_TclSetAddrObj(), probably in the
> "NS_NONNULL_ASSERT(addr != NULL);" line.  It was called from
> NsTclThreadObjCmd(), in "case THandleIdx", line 238 in tclthread.c.
> That presumably came from a Tcl "ns_thread handle" call, and there's
> only one of those in the test suite, "test ns_thread-2.6" on line 70
> of "ns_thread.test".  But I don't understand why that would throw a
> null pointer exception!


Probably "Ns_ThreadSelf();" does not work under windows (get the
id of the current thread). Ns_ThreadSelf() is defined in the OS specific
part (winthread.c). The exception is probably coming from
test thread-2.3, it looks to me as if the thread (here the
thread running the tests) is not properly initiated under windows.

i have added one more assert, to make it easier to pinpoint the
problem.


>  ns_listencallback-1.0 register FAILED
>  Contents of test case:

This is again one of these low-level socket commands.

> The ns_schedule-2.1 failure certainly sounds related to my original
> problem of the scheduler thread getting stuck, but there's enough else
> going on here that I don't have any idea where the real source of the
> problem might be.


This sounds indeed related to the original problem.
The test registers a repeating proc (interval 1s),
but within the time range of 2.5s, it is executed
only once.

On 03.06.20 23:41, Andrew Piskorski wrote:

> Weirdly, that stacktrace seems like it must be missing some
> intermediate function calls, because nsproxy's Ns_ModuleInit()
> definitely never calls Ns_IncrTime() DIRECTLY.  So I'm not sure what's
> going on there either.

this is typical when the code is compiled with an optimizer. Try to
deactivate the optimizer, this will improve the feedback.

maybe i get on the weekend some access to a win environment.

-gn





Re: [naviserver-devel] scheduler thread getting stuck

2020-06-04 Thread Ibrahim Tannir




On 03-Jun-20 23:41, Andrew Piskorski wrote:

> Is nsproxy supposed to work correctly on Windows?


I had to make extensive changes in the nsproxy code to make it
work on Windows. The code is Unix-centric and makes some false
assumptions w.r.t. Windows handles vs. Unix file descriptors
and therefore cannot run on Windows - at least not with native
MSDN libraries.


I didn't push my changes back into the repository yet, since the 
changes need to possibly be readjusted and retested for Unix. I 
will ask Zoran to peer-review, adjust and push my changes, 
however this will not be before next week.


Cheers,
Ibrahim






Re: [naviserver-devel] scheduler thread getting stuck

2020-06-03 Thread Andrew Piskorski
Is nsproxy supposed to work correctly on Windows?

Its test framework wants to use test-nsproxy.sh to set LD_LIBRARY_PATH,
which of course can't work on Windows.  But as I work around that,
when I run just the ns_proxy.test tests, I get this error:

  Assertion failed: sec >= 0, file time.c, line 344

Which gives this stacktrace when run under the WinDbg debugger:

: nsthread!Ns_IncrTime+0x6c [naviserver\nsthread\time.c @ 344]
: nsproxy!Ns_ModuleInit+0x7a76
: nsthread!NsThreadMain+0x77 [naviserver\nsthread\thread.c @ 236]
: nsthread!ThreadMain+0x6c [naviserver\nsthread\winthread.c @ 874]
: ucrtbased!thread_start+0x9c 
[minkernel\crts\ucrt\src\appcrt\startup\thread.cpp @ 97]
: kernel32!BaseThreadInitThunk+0xd
: ntdll!RtlUserThreadStart+0x21

Weirdly, that stacktrace seems like it must be missing some
intermediate function calls, because nsproxy's Ns_ModuleInit()
definitely never calls Ns_IncrTime() DIRECTLY.  So I'm not sure what's
going on there either.
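
For readers unfamiliar with this kind of check, below is a rough sketch
of an "increment a seconds/microseconds pair" routine guarded by an
assertion of the same shape as the one that fired. I do not know the
actual body of Ns_IncrTime(); the struct and function names here are
hypothetical, and the normalization shown is just the usual carry from
microseconds into seconds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a seconds/microseconds pair; not Ns_Time. */
typedef struct {
    int64_t sec;
    long    usec;   /* kept in the range 0..999999 */
} SketchTime;

/* Add a non-negative increment and renormalize the microseconds. */
static void sketch_incr_time(SketchTime *t, int64_t sec, long usec)
{
    assert(t != NULL);
    assert(sec >= 0);   /* same shape as the failing check above */
    assert(usec >= 0);

    t->sec  += sec + usec / 1000000;
    t->usec += usec % 1000000;
    if (t->usec >= 1000000) {    /* carry into seconds */
        t->sec  += 1;
        t->usec -= 1000000;
    }
}
```

With a guard like this, a caller that hands in a negative seconds value
(for whatever reason) stops at the assertion instead of producing a
nonsensical time.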

-- 
Andrew Piskorski 




Re: [naviserver-devel] scheduler thread getting stuck

2020-06-03 Thread Andrew Piskorski
On Mon, Jun 01, 2020 at 11:08:48AM -0400, Andrew Piskorski wrote:
> ## Last test that runs on Windows, it locks up forever:
> 
> http_persistent.test
> ns_sockioctl failed: no such file or directory
> while executing
> "ns_socknread $s"
> (procedure "client_readable" line 2)

For now I simply moved the entire http_persistent.test file out of the
way, so the server skips those tests.  With that, the test server got
further, but eventually crashed with what looks like a null pointer
dereference, here:

  [03/Jun/2020:14:25:28][4844.2b10][-conn:test:default:1:229-] Notice: inside 
the filter 3.4
  ns_serverpath.test
  ns_set.test
  ns_sha1.test
  ns_sls.test
  ns_striphtml.test
  ns_thread.test

  [03/Jun/2020:14:25:28][4844.13bc][-tcl-nsthread:7-] Notice: update 
interpreter to epoch 1, trace none, time 0.219973 secs
  Assertion failed: (addr != ((void *)0)), file tclobj.c, line 325
  [03/Jun/2020:14:25:32][4844.1dd8][-tcl-nsthread:8-] Notice: update 
interpreter to epoch 1, trace none, time 3.902536 secs

The stack trace looked like this:

  ucrtbased.dll!07feedad41cf()  Unknown
  libnsd.dll!Ns_TclSetAddrObj(Tcl_Obj * objPtr, const char * type, void * addr) 
Line 325  C
  libnsd.dll!NsTclThreadObjCmd(void * clientData, Tcl_Interp * interp, int 
objc, Tcl_Obj * const * objv) Line 239 C
  [External Code]
  libnsd.dll!CmdThread(void * arg) Line 1333  C
  nsthread.dll!NsThreadMain(void * arg) Line 236  C
  nsthread.dll!ThreadMain(void * arg) Line 874  C

So that was inside Ns_TclSetAddrObj(), probably in the
"NS_NONNULL_ASSERT(addr != NULL);" line.  It was called from
NsTclThreadObjCmd(), in "case THandleIdx", line 238 in tclthread.c.
That presumably came from a Tcl "ns_thread handle" call, and there's
only one of those in the test suite, "test ns_thread-2.6" on line 70
of "ns_thread.test".  But I don't understand why that would throw a
null pointer exception!

Prior to that crash, various other interesting test failures cropped
up, including both "ns_listencallback-1.0" and "ns_schedule-2.1" below.
The ns_schedule-2.1 failure certainly sounds related to my original
problem of the scheduler thread getting stuck, but there's enough else
going on here that I don't have any idea where the real source of the
problem might be.


 ns_listencallback-1.0 register FAILED
 Contents of test case:

set localhost [expr {[ns_info ipv6] ? "::1" : "127.0.0.1"}]
ns_log notice "open sockent on $localhost 7227"
set fds [ns_sockopen $localhost 7227]
lassign $fds rfd wfd
set size 0

if {[gets $rfd line] == -1} {
ns_log error "got no data"
} else {
incr size [string length $line]
puts $wfd "How are you?"
flush $wfd
gets $rfd line
incr size [string length $line]
}
return [list size $size]

 Result was:
size 0
 Result should have been (exact matching):
size 46
 ns_listencallback-1.0 FAILED


 ns_schedule-2.1 schedule proc: interval FAILED
 Contents of test case:

set id [ns_schedule_proc 1s {nsv_lappend . . ns_schedule-2.1}]
ns_sleep 2.5s
ns_unschedule_proc $id
nsv_get . .

 Result was:
ns_schedule-2.1
 Result should have been (glob matching):
ns_schedule-2.1 ns_schedule-2.1*
 ns_schedule-2.1 FAILED
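
For reference, the expected result of ns_schedule-2.1 follows from
simple arithmetic: a job scheduled every 1s and observed for 2.5s
should have fired at about 1s and 2s, i.e. at least twice, whereas the
result above shows a single entry. A tiny sketch of that calculation,
with a hypothetical helper name and the simplifying assumptions that
the first run happens one full interval after scheduling and that runs
take no time:

```c
#include <assert.h>

/* Hypothetical helper: number of times a job with the given interval
 * should have fired within the observation window, assuming the first
 * run happens one full interval after scheduling and each run takes
 * no time. */
static long expected_runs(long interval_ms, long window_ms)
{
    assert(interval_ms > 0);
    assert(window_ms >= 0);
    return window_ms / interval_ms;
}
```

For the test above that is expected_runs(1000, 2500), which gives 2 and
matches the glob pattern "ns_schedule-2.1 ns_schedule-2.1*".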

-- 
Andrew Piskorski 




Re: [naviserver-devel] scheduler thread getting stuck

2020-06-01 Thread Andrew Piskorski
On Mon, Jun 01, 2020 at 11:08:48AM -0400, Andrew Piskorski wrote:

>  encoding-1.1 Send body with ns_return and charset utf-8 FAILED

>  errorInfo: select failed: no such file or directory
> invoked from within
> "nstest::http-0.9 -encoding utf-8 -getbody 1 -getheaders {Content-Type 
> Content-Length} GET "/encoding""

There are 7 different versions of the encoding.* page present.  If I
start up the test server and then ask it for the FULL URL of any one
of those files, like "encoding.utf2iso_adp", it works fine!  But if I
just ask for "encoding" without the extension it fails.

So hitting this URL works fine:
  http://localhost:8000/encoding.utf_adp
But this fails with 404 Not Found:
  http://localhost:8000/encoding

I see that test.nscfg has what look like appropriate "ns/mimetypes"
and "ns/encodings" sections, and of course that same config file works
fine on Linux.  So what could be going wrong on my Windows box to
break the mapping of "/encoding" to "/encoding.utf_adp"?

-- 
Andrew Piskorski 




Re: [naviserver-devel] scheduler thread getting stuck

2020-06-01 Thread Andrew Piskorski
On Fri, May 15, 2020 at 10:37:15AM +0200, Gustaf Neumann wrote:
> does the regression test run ok?

Good question.  Unfortunately, I'd never run the regression tests on
Windows before.  I now have them set up to run, however, I get LOTS of
failures, and I don't know if these are real problems with NaviServer,
or something wrong with my testing setup.  Either way, I would like to
track it down so I can rely on running these same regression tests on
Windows as on Linux.

Are the tests in "naviserver/tests/all.tcl" supposed to work correctly
on Windows too?  Is anyone else successfully running these tests there?

On Windows, I always get immediate failures due to return codes of 1.
The first such failure is "encoding-1.1", output shown below.

More concerning, is that once it gets to "http_persistent.test", the
whole NaviServer process locks up and never gets any farther.  So any
tests that come AFTER that one are not being run at all.  I've left
the test NaviServer running overnight just to be sure, and after that
point there's never any more output until I hit Ctrl-c to shut it down.

So far I've tested the nearly latest NaviServer head code on Windows 7
(no Windows 10 yet), with both the old 2010 and newer 2019 Microsoft
compilers.  The behavior of the regression tests appears identical in
both cases.  I have not yet tested older versions of NaviServer.  On
Linux these tests all run fine, of course.

On Windows, I can invoke "tests/all.tcl" either before or after
installing NaviServer.  Test behavior appears to be the same in both
cases.  Before installing, I start the tests like this:

  nmake -f Makefile.win32 _test

For that to work, you need these small patches to Makefile.win32:

  
https://bitbucket.org/apiskors/naviserver/commits/7d7e245f8451419de3ac9b1d6202e5f26c883fdd


## First test to fail on Windows:

 encoding-1.1 Send body with ns_return and charset utf-8 FAILED
 Contents of test case:

nstest::http-0.9 -encoding utf-8 -getbody 1 -getheaders {Content-Type 
Content-Length} GET "/encoding"

 Test generated error; Return code was: 1
 Return code should have been one of: 0 2
 errorInfo: select failed: no such file or directory
invoked from within
"nstest::http-0.9 -encoding utf-8 -getbody 1 -getheaders {Content-Type 
Content-Length} GET "/encoding""
("uplevel" body line 2)
invoked from within
"uplevel 1 $script"
 errorCode: NONE
 encoding-1.1 FAILED


## Last test that runs on Windows, it locks up forever:

http_persistent.test
ns_sockioctl failed: no such file or directory
while executing
"ns_socknread $s"
(procedure "client_readable" line 2)
invoked from within
"client_readable 1000 $s"
(procedure "tcltest::client_receive" line 2)
invoked from within
"tcltest::client_receive sock05DA6CE0"

-- 
Andrew Piskorski 




Re: [naviserver-devel] scheduler thread getting stuck

2020-05-15 Thread Gustaf Neumann

On 15.05.20 09:21, Andrew Piskorski wrote:

> Recently I started seeing some weird behavior from NaviServer that
> I've never seen before.  From time to time, it looks like the
> scheduler thread is getting stuck and not running anything, often for
> hours at a time.  Then, on rare occasions, it will inexplicably come
> unstuck and go back to normal.

With what exact version does this happen?
does the regression test run ok?

> So far I've ONLY seen this strange behavior on Windows 7, where I
> recently upgraded to newer NaviServer code and a newer Microsoft 2019
> Visual Studio Community Edition compiler.  I suspect the problem
> doesn't happen on Linux at all, but I haven't checked for that
> thoroughly.

This is the first report of this kind. My suspicion, too, is
that it has to do with Windows and the compiler mix in use.

> Previously on Windows I was running NaviServer code from
> c. 2019-07 and an ancient Microsoft compiler from 2010; the problem
> did NOT happen then.

Can you try with the released version 4.99.17 (2018-11-04)
with your new Windows environment?

-gn


