Recently I started seeing some weird behavior from NaviServer that
I've never seen before.  From time to time, it looks like the
scheduler thread is getting stuck and not running anything, often for
hours at a time.  Then, on rare occasions, it will inexplicably come
unstuck and go back to normal.

I never see any "scheduled proc took too long" sorts of warnings in
the log, so I don't THINK I'm running Tcl code that's tying up the
scheduler thread.  So far it seems more like the scheduler itself
silently just does nothing.

Interestingly, when the scheduler comes unstuck, any "missed"
ns_schedule_daily jobs suddenly run all at once, in one second!  In my
case that includes the ns_logroll I schedule daily at 23:59.  Just
yesterday I saw it finally come unstuck and roll the server log at
16:45 the next day!

So far I've ONLY seen this strange behavior on Windows 7, where I
recently upgraded to newer NaviServer code and the newer Microsoft
Visual Studio 2019 Community Edition compiler.  I suspect the problem
doesn't happen on Linux at all, but I haven't checked for that
thoroughly.  Previously on Windows I was running NaviServer code from
c. 2019-07 and an ancient Microsoft compiler from 2010; the problem
did NOT happen then.

Do you have any advice on how I should go about trying to debug this?
E.g., are there commands I should run to check exactly what the
scheduler thread is doing at any given time?  I've started reading
"nsd/tclsched.c" and "nsd/sched.c", but I've never looked at this code
before and have no idea what might be wrong.

Btw, the sched.c source says it's based on the 1988 paper below, but
just how useful is it for understanding things?  Those old conference
proceedings seem hard to find outside of university libraries.  The
USENIX website itself says they only have content back to 1993!

(Thanks for any help and advice!)


https://www.usenix.org/legacy/publications/bibliography/byAuthor.html

Author: Ronald E. Barkley
Author: T. Paul Lee
Title: A Heap-based Callout Implementation to Meet Real-Time Needs
Pages: 213-222
Publisher: USENIX
Proceedings: USENIX Conference Proceedings
Date: Summer 1988
Location: San Francisco
Institution: AT&T Information Systems

The table of contents (nothing else) is online here:

  http://www.gbv.de/dms/tib-ub-hannover/303823399.pdf

-- 
Andrew Piskorski <a...@piskorski.com>


_______________________________________________
naviserver-devel mailing list
naviserver-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/naviserver-devel
