Re: [naviserver-devel] naviserver with connection thread queue

2012-12-04 Thread Stephen Deasey
On Tue, Dec 4, 2012 at 5:21 PM, Gustaf Neumann neum...@wu.ac.at wrote:


 * Only content sent via Ns_ConnWriteVChars has a chance of being
 compressed.


i.e. dynamic content with a text/* mime-type. The idea here was that you don't
want to try to compress gifs and so on, and static content could be
pre-compressed on disk - at runtime, simply look for a *.gz version of the
content.

This could be cleaned up a bit by:

- having an extensible white-list of mime-types which should be compressed:
text/*, application/javascript, application/xml etc.

- we should actually ship some code which searches for *.gz versions of
static files



 * Similarly, range requests are not handled when the data is sent via
 ReturnOpen to the writer queue.


The diagram shows Ns_ConnReturnData also calls ReturnRange, and hence the
other leg of fastpath and all the main data sending routines should handle
range requests.




 * there is quite some potential to simplify / orthogonalize the server's
 infrastructure.
 * improving this structure has nothing to do with
 naviserver-connthreadqueue, and should happen at some time in the main tip.



The writer thread was one of the last bits of code to land before things
quietened down, and a lot of the stuff that got talked about didn't get
implemented. One thing that was mentioned was having a call-back interface
where you submit a function to the writer thread and it runs it. This would
allow other kinds of requests to be served async.

One of the things we've been talking about with the connthread work is
simplification. The current code, with its workarounds for stalls and
managing thread counts, is very complicated. If it were simplified and
genericised it could also be used for background writer threads, and SSL
read-ahead threads (as in AOLserver 4.5). So, that's another +1 for
keeping the conn threads simple.
--
LogMeIn Rescue: Anywhere, Anytime Remote support for IT. Free Trial
Remotely access PCs and mobile devices and provide instant support
Improve your efficiency, and focus on delivering more value-add services
Discover what IT Professionals Know. Rescue delivers
http://p.sf.net/sfu/logmein_12329d2d
___
naviserver-devel mailing list
naviserver-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/naviserver-devel


Re: [naviserver-devel] naviserver with connection thread queue

2012-12-04 Thread Stephen Deasey
On Mon, Dec 3, 2012 at 10:38 AM, Gustaf Neumann neum...@wu.ac.at wrote:

 All changes are on bitbucket (nsssl and naviserver-connthreadqueue).


I found this nifty site the other day:

https://www.ssllabs.com/ssltest/analyze.html?d=next-scripting.org

It's highlighting a few things that need to be fixed in the nsssl module,
including a couple of security bugs. It looks like relatively little code
though.

Also, there's this:

https://insouciant.org/tech/ssl-performance-case-study/

which is a pretty good explanation of things from a performance point
of view. I haven't spent much time looking at SSL. Looks like there
could be some big wins. For example, some of the stuff to do with
certificate chains could probably be automated - the server could spit
out an informative error to the log if things look poorly optimised.



Re: [naviserver-devel] naviserver with connection thread queue

2012-12-04 Thread Stephen Deasey
On Thu, Nov 29, 2012 at 6:51 PM, Gustaf Neumann neum...@wu.ac.at wrote:

 It turned out
 that the large queueing time came from requests from Taipei, which contained
 several 404 errors. The size of the 404 response is 727 bytes, and therefore
 under the writersize, which was configured as 1000. Delivery of the error
 message to this site takes more than a second. Funnily enough, the delivery of
 the error message blocked the connection thread longer than the delivery of
 the image when it is above the writersize.

 I will reduce the writersize further, but a slow delivery can still
 slow down the delivery of the headers, which still happens in the connection
 thread.

This shouldn't be the case for strings, or for data sent from the fast-path
cache, such as a small file (a custom 404), as eventually those
should work their way down to Ns_ConnWriteData, which will construct
the headers if not already sent and pass them, along with the data
payload, to writev(2). Linux should coalesce the buffers and send them in a
single packet, if small enough.

I wonder if this is some kind of weird nsssl interaction.

(For things like sendfile without SSL we could use TCP_CORK to
coalesce the headers with the body.)



Re: [naviserver-devel] naviserver with connection thread queue

2012-12-04 Thread Gustaf Neumann
On 04.12.12 20:25, Stephen Deasey wrote:
 I found this nifty site the other day:

  https://www.ssllabs.com/ssltest/analyze.html?d=next-scripting.org

 It's highlighting a few things that need to be fixed in the nsssl module,
 including a couple of security bugs. It looks like relatively little code
 though.
The report is already much better: now everything is green. Most of the 
complaints could be removed via configuration; just two issues required 
code changes (one needs a flag which is not available in all current 
OpenSSL implementations, such as the one shipped with Mac OS X, and the 
other needed an additional callback). The security rating is now better 
than that of nginx.

Today, I was hunting another problem in connection with nsssl, which 
turns out to be a weakness of our interfaces. The source of the problem 
is that the buffer management of OpenSSL is not aligned with the buffer 
management in NaviServer. In the NaviServer driver, all receive requests 
are triggered via poll(), when sockets are readable. With OpenSSL, data 
might as well be left over from an earlier receive when a smaller buffer 
was provided. During upload spooling, NaviServer requested receive 
operations with a 4KB buffer, while OpenSSL might receive 16KB at once. 
The read operation with the small buffer does not drain the OpenSSL 
buffer, and later, poll() will not be triggered, because the leftover 
data sits in OpenSSL's buffer rather than in the socket. The problem 
happened in NaviServer when the input was spooled (e.g. file uploads). I 
have doubts that this combination ever worked. I have corrected the 
problem by increasing the buffer size in driver.c. The cleaner 
implementation would be to add an Ns_DriverReadableProc "Readable", 
similar to the Ns_DriverKeepProc "Keep", but that would affect the 
interface of all drivers.

-gustaf neumann




Re: [naviserver-devel] naviserver with connection thread queue

2012-12-04 Thread Gustaf Neumann

On 04.12.12 20:06, Stephen Deasey wrote:
- we should actually ship some code which searches for *.gz versions 
of static files
This would mean keeping a .gz version and a non-.gz version in the file 
system for the cases where gzip is not an accepted encoding. I am not sure 
I would like to manage these files and keep them in sync; the 
fast-path cache could keep gzipped copies, and invalidation is already there.


* Similarly, range requests are not handled when the data is
sent via ReturnOpen to the writer queue.


The diagram shows Ns_ConnReturnData also calls ReturnRange, and hence 
the other leg of fastpath and all the main data sending routines 
should handle range requests.
This path is OK. When neither mmap nor cache is set, fastpath can call 
ReturnOpenFd, and ReturnOpen sends the data blindly to the writer (if 
configured), which does not handle ranges. This needs some refactoring.


* there is quite some potential to simplify / orthogonalize the
servers infrastructure.
* improving this structure has nothing to do with
naviserver-connthreadqueue, and should happen at some time in the
main tip.


The writer thread was one of the last bits of code to land before 
things quietened down, and a lot of the stuff that got talked about 
didn't get implemented.


I am not complaining, just trying to understand the historical layers. 
Without the call-graph, the current code is hard to follow.


One thing that was mentioned was having a call-back interface where 
you submit a function to the writer thread and it runs it. This would 
allow other kinds of requests to be served async.


One of the things we've been talking about with the connthread work is 
simplification. The current code, with its workarounds for stalls and 
managing thread counts, is very complicated. If it were simplified and 
genericised it could also be used for background writer threads, and 
SSL read-ahead threads (as in AOLserver 4.5). So, that's another +1 
for keeping the conn threads simple.
The code in naviserver-connthreadqueue already handles read-aheads with 
SSL. I have already removed these hacks there; I think these were in 
part responsible for the sometimes erratic response times with SSL.


-gustaf neumann





Re: [naviserver-devel] naviserver with connection thread queue

2012-12-04 Thread Stephen Deasey
On Tue, Dec 4, 2012 at 10:55 PM, Gustaf Neumann neum...@wu.ac.at wrote:

 The code in naviserver-connthreadqueue already handles read-aheads with SSL.
 I have already removed these hacks there; I think these were in part
 responsible for the sometimes erratic response times with SSL.

Well, I think the thing here is that once upon a time SSL was considered
computationally expensive (I don't know if it still is, with recent
Intel CPUs having dedicated AES instructions etc.). Read-ahead is good
because you don't want an expensive conn thread waiting around for the
whole request to arrive, packet by packet. But with SSL, the single
driver thread will be decrypting read-ahead data for multiple sockets
and may run out of CPU, stalling the request pipeline and starving the
conn threads. By making the SSL driver thread non-async you lose out
on read-ahead, as that all happens on the conn thread, but you gain CPU
resources on a multi-CPU system (all of them, today). AOLserver 4.5
added a pool of read-ahead threads, one per socket IIRC, to keep the
benefits of read-ahead while gaining CPU parallelism.

- Does a single driver thread have enough computational resources to
decrypt all sockets currently in read-ahead? This is going to depend
on the algorithm. You might want to favour AES if you know your CPU has
support.
- Which is worse: losing read-ahead, or losing CPU parallelism?
- If a read-ahead thread pool is added, should it be one thread
per socket, which is simple, or one thread per CPU with some kind of
balancing mechanism?



Re: [naviserver-devel] naviserver with connection thread queue

2012-12-04 Thread Stephen Deasey
On Wed, Nov 28, 2012 at 10:38 AM, Gustaf Neumann neum...@wu.ac.at wrote:

 It is interesting to see that, with always 5 connection threads running and
 using jemalloc, we see an RSS consumption only slightly larger than with
 plain Tcl and zippy malloc with maxthreads == 2, and fewer requests
 queued.

 Similarly, with tcmalloc we see, with minthreads 5 and maxthreads 10:

requests 2062 spools 49 queued 3 connthreads 6 rss 376
requests 7743 spools 429 queued 359 connthreads 11 rss 466
requests 8389 spools 451 queued 366 connthreads 12 rss 466

 which is even better.

Min/max threads 5/10 better than 2/10? How about 7/10? When you hit
10/10 you can delete an awful lot of code :-)



Re: [naviserver-devel] naviserver with connection thread queue

2012-12-04 Thread Gustaf Neumann
On 05.12.12 00:41, Stephen Deasey wrote:
 On Wed, Nov 28, 2012 at 10:38 AM, Gustaf Neumann neum...@wu.ac.at wrote:
 It is interesting to see that, with always 5 connection threads running and
 using jemalloc, we see an RSS consumption only slightly larger than with
 plain Tcl and zippy malloc with maxthreads == 2, and fewer requests
 queued.

 Similarly, with tcmalloc we see, with minthreads 5 and maxthreads 10:

 requests 2062 spools 49 queued 3 connthreads 6 rss 376
 requests 7743 spools 429 queued 359 connthreads 11 rss 466
 requests 8389 spools 451 queued 366 connthreads 12 rss 466

 which is even better.
 Min/max threads 5/10 better than 2/10?
The numbers show that 5/10 with tcmalloc is better than 5/10 with 
jemalloc, and only slightly worse than 2/2 with zippy malloc.

-gn
