Dear friends,

again, an update on naviserver-connthreadqueue:

(1) Observing the traffic on the low-traffic site next-scripting.org
still showed occasional, surprisingly slow "response times", where the pure
runtime (not including queuing time etc.) was 28 seconds. Under normal
conditions, the response time is just a fraction of a second. It turned
out that these requests came from a mobile broadband service with low
bandwidth.

92.40.253.47 - - [12/Dec/2012:22:19:14 +0100] "GET /2.0b3/doc/nx HTTP/1.1" 200 23406 "https://next-scripting.org/2.0b3/doc/xotcl2" "Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:17.0) Gecko/20100101 Firefox/17.0" 28.700910 "1355347126.164479 14.759801 0.000122 0.007112 28.693676"

This page is an ADP page, which in the current NaviServer has no
chance of being delivered via the writer thread (as shown by the call graph
posted earlier). The connection thread is therefore occupied until the
slow client has received the whole response.

Maybe your first reaction is "don't care, bandwidths are getting better", but
the problem is quite serious. It is very easy to mount a DoS attack by
starting just a few requests over a low-bandwidth connection. If the server
has e.g. a maximum of 10 connection threads defined, just 10 slow requests to
ADP pages bring the server to a halt for arbitrarily long times, although
the server is computationally able to handle several hundred or thousand of
these ADP requests per second. The attacker just has to accept a few
bytes from time to time to avoid triggering the write timeout of the server.
Browsing around shows that this is a well-known attack that also affects
Apache 1.x and 2.x, but not e.g. nginx, which is fully asynchronous
and usually performs request spooling/buffering in front of the back end.
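
To make the thread-starvation argument concrete, here is a rough
back-of-the-envelope sketch in plain Tcl. It assumes that every request
occupies one connection thread for its whole delivery time; the proc name
and the 0.007 s "normal" service time are illustrative assumptions, not
measurements or NaviServer APIs.

```tcl
# Upper bound on throughput when each request holds one connection
# thread for holdSeconds: threads / holdSeconds.
# (Hypothetical helper for illustration only.)
proc max_requests_per_second {threads holdSeconds} {
    return [expr {double($threads) / $holdSeconds}]
}

# Assumed values: 10 connection threads (as in the example above),
# 0.007 s for a fast client, 28.7 s for a slow mobile client.
set threads 10
puts [format "fast clients: %.0f req/s" [max_requests_per_second $threads 0.007]]
puts [format "slow clients: %.2f req/s" [max_requests_per_second $threads 28.7]]
```

With these assumed numbers, the same 10 threads drop from well over a
thousand requests per second to less than one, without any CPU load.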

So, asynchronous receives and deliveries are not just a nice feature.
Therefore, I have added an interface between the "string-based" delivery
API and the writer thread and have had this running on next-scripting.org
for a few days. Everything seems to work flawlessly; not a single long
blocking request has occurred.


(2) We started using naviserver-connthreadqueue on our production site a
few days ago (it runs behind nginx, so it profits only in part
from the changes). NaviServer currently sees about 1.5 million
page views per day there.

The experiences are:

   - the config file needs some tweaking to keep queuing
     times low. I have set minthreads to 7 (before, we had 3,
     but with a different interpretation).

   - the new async log writer works nicely, although it might
     reverse the order of entries in the log file. Will look into that.

   - the writer thread had not been used on this site so far, and it
     caused some trouble:

     (a) we saw peaks of 600,000 mutex locks/second, most
           of these from a mutex of the writer thread. Under
           normal conditions, we see on avg 8k mutex locks/second.

     (b) "ns_writer list" was not thread-safe (it crashed).

     While looking into problem (a), it turned out that the writer
     thread did not have clear EOF handling (added POLLHUP handling),
     and it was possible that the writer released a socket
     structure while the driver still depended on it. Now the
     lifecycle management is all in the driver, and the problem
     is gone. Problem (b) is fixed by now as well.

  - there are far fewer thread creations, and the memory consumption
    seems better, but we need measurements over a longer period.
    The average response time might be slightly better, but it
    is within the daily variation range. Since we have no data
    about the queuing time of the old server, this is still hard
    to compare. We still have to reduce the debug output, so
    it is too early for an assessment.
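
For readers who want to try similar settings, a config fragment along the
following lines might be a starting point. This is only a sketch: the
server name and all values except minthreads are placeholders, and the
writer parameters (and their units) should be checked against the
documentation of the NaviServer version in use.

```tcl
# Sketch of a NaviServer config fragment (the config file is Tcl).
set server "server1"   ;# hypothetical server name

ns_section "ns/server/${server}"
ns_param   minthreads    7      ;# value mentioned above
ns_param   maxthreads    10     ;# assumed; matches the 10-thread example

ns_section "ns/server/${server}/module/nssock"
ns_param   writerthreads 1      ;# assumed: enable one writer thread
ns_param   writersize    1024   ;# assumed threshold; verify units for your version
```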

(3) Concerning zip delivery of files: I have refactored the
    code a little to ease the zip handling on the Tcl layer.
    There is now a new subcommand "ns_conn zipaccepted",
    which evaluates the rather complex preference rules.
    The code at the end of this mail could easily be used e.g. in a
    filter, or it can be extended to "compile" different formats into
    a target format. At least it is something to play with as a
    first step.

I will commit the changes to naviserver-connthreadqueue asap,
then clean up, document, etc.

all the best.
-gustaf neumann

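# Example: deliver a pre-compressed copy of a static file when the
# client accepts gzip and did not ask for a byte range.
set file graph.js
set fn [ns_info pageroot]/$file
set mime [ns_guesstype $file]

if {[ns_conn zipaccepted] && [ns_set iget [ns_conn headers] Range] eq ""} {
    # (Re)create the .gz file if it is missing or older than the source.
    if {![file readable $fn.gz] || [file mtime $fn] > [file mtime $fn.gz]} {
        exec gzip -9 < $fn > $fn.gz
    }
    # Switch to the compressed file only if it is actually available.
    if {[file readable $fn.gz]} {
        set fn $fn.gz
        ns_set put [ns_conn outputheaders] Vary Accept-Encoding
        ns_set put [ns_conn outputheaders] Content-Encoding gzip
    }
}
ns_returnfile 200 $mime $fn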





On 09.12.12 19:48, Gustaf Neumann wrote:
> Dear all,
>
> On the link below i have tried to summarize the changes in the 
> naviserver-connthreadqueue fork taken from several mails. The summary 
> contains as well a few charts showing the stepwise improvements, which 
> i hope, someone might find interesting.
>
> https://next-scripting.org/xowiki/docs/misc/naviserver-connthreadqueue
>
> The new version uses TCP_CORK for the most interesting cases. The
> changes from this feature are not dramatic, since NaviServer
> often aggregates the strings to be written in DStrings, so there are
> apparently not many small writes.
>
> If nobody objects, i would tag the current tip of naviserver with 
> 4.99.4 and move the changes over to the main repository in the near 
> future .... after i make an iteration of the affected documentation.
>
> -gustaf neumann


_______________________________________________
naviserver-devel mailing list
naviserver-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/naviserver-devel
