Stephen Deasey wrote:
> Interesting, but I wonder if we're not thinking this through
> correctly. My suggestion, and yours here, and Gustaf's recent work are
> all aimed at refining the model as it currently is, but I wonder if
> we're even attempting to do the right thing?

Do we even know what the right thing is?  It could be any of
- maximize performance at any cost
- minimize resource usage
- adapt to dynamically changing workload
- minimize admin workload

And so forth.  I think there is no one-size-fits-all answer, but it 
should be possible, and hopefully easy, to get something that fits 
closely enough.

>> So I'm assuming that the available processing power - the number of
>> threads - should correlate to how busy the server is.  A server that is
>> 50% busy should have 50% of its full capacity working.
>
> But what is busy, CPU?  There needs to be an appropriate max number of
> threads to handle the max expected load, considering the capabilities
> of the machine. Too many and the machine will run slower. But why kill
> them when we're no longer busy?

I don't know how to precisely define busy, let alone measure it, which 
is why I'm picking something that is readily measurable.

A thread is busy in the sense that it is unavailable to process new 
requests, but there are different reasons why it might be unavailable - 
either it's cpu bound, or it's blocking on something like a database 
call or i/o.  Different reasons for being busy might suggest different 
responses.
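
Concretely, this is the kind of thing I mean by readily measurable: the 
per-pool counters the server already exposes at the tcl level.  Rough 
sketch only - I'm writing the ns_server subcommands from memory, so 
treat the exact names/options as an assumption:

    # Sketch: poll counters NaviServer exposes per connection pool.
    # (ns_server subcommand names/options quoted from memory.)
    proc pool_load {pool} {
        set running [llength [ns_server -pool $pool active]] ;# conns in progress
        set waiting [ns_server -pool $pool waiting]          ;# queued, no free thread
        set max     [ns_server -pool $pool maxthreads]       ;# configured ceiling
        return [list running $running waiting $waiting maxthreads $max]
    }

A pool with requests sitting in the queue is busy by any definition; 
whether the running threads are cpu bound or blocked on the database is 
the part that's harder to see from numbers like these.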

> - naviserver conn threads use a relatively large amount of memory
> because there tends to be one or more tcl interps associated with each
> one

Different requests could have different memory/resource requirements, 
which is a really nice thing about pools.  I'm hypothesizing that there 
are several different categories that requests fall into, based on the 
server resources needed to serve them and the time needed to complete them.

Server resources (memory) are either 'small', for requests that do not 
need a tcl interp (although tcl filters could tend to make this a 
nonexistent set), or 'big', for those that do.  Time is either slow or 
fast, by some arbitrary measure.

So a small/fast pool could be set up to serve static resources, a 
big/fast pool for non-database scripts, and a big/slow pool for database 
stuff.  I'm not sure what could be small/slow, maybe a c-coded proxy 
server or very large static files being delivered over slow connections.

The small/fast pool would only need a small number of threads with a 
high maxconnsperthread, while the big/slow pool might need many 
threads, as most of them will be blocking on database access at any 
given time.
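
In config terms I'm imagining something roughly like this (sketch only; 
the thread numbers are made up, and I'm assuming the per-pool "map" 
parameter for routing requests to pools):

    ns_section ns/server/${server}/pools
        ns_param smallfast "static resources, no tcl needed"
        ns_param bigfast   "non-database scripts"
        ns_param bigslow   "database-backed requests"

    ns_section ns/server/${server}/pool/smallfast
        ns_param maxthreads        2
        ns_param maxconnsperthread 100000
        ns_param map               "GET /images/*"

    ns_section ns/server/${server}/pool/bigfast
        ns_param minthreads        2
        ns_param maxthreads        6
        ns_param map               "GET /*.tcl"

    ns_section ns/server/${server}/pool/bigslow
        ns_param minthreads        5
        ns_param maxthreads        20
        ns_param map               "GET /reports/*"

Anything not mapped would fall through to the default pool as usual.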

The important question in all of this is whether a complex segmented 
setup like this works better in practice than a single large-enough 
pool of equal threads.  To which I don't have a good answer.

> - killing threads kills interps which frees memory
>
> But this is only useful if you can use the memory more profitably some
> where else, and I'm not sure you can.

Not only that, but memory isn't always released back to the system when 
free()d.  (vtmalloc is supposed to be able to do that, but I haven't 
had much success with it so far.)  So freeing memory by shutting down 
threads won't necessarily make it available to your database.

However, memory that is not touched by a process can be swapped out, 
making more physical ram available for other processes.  Having 20 
threads each used a little could keep all of them resident in memory, 
while having just one thread doing all the work would keep a smaller 
working set.

This is a much bigger concern for low-resource systems.  Big systems 
nowadays have more physical memory than you can shake a stick at and 
swapping seems almost quaint.

> I think it might be better to drop min/max conn threads and just have
> n conn threads, always:

I've heard this recommendation before, in the context of tuning Apache 
for high workloads: set maxservers = minservers = startservers.

I think it would make tuning easier for a lot of people if there were a 
basic "systemsize" parameter (small/medium/large) that sets various 
other parameters to preset values.  As to what those values should be, 
that would take some thinking and experimentation.
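
Since the config file is just tcl, a "systemsize" knob could even be 
prototyped in the config itself before touching any C code.  Sketch 
only, and the preset numbers below are placeholders, not recommendations:

    # Hypothetical "systemsize" preset, expanded in the config file.
    set systemsize medium   ;# one of: small | medium | large

    switch $systemsize {
        small  { set minthreads 1;  set maxthreads 2;  set maxconnections 50   }
        medium { set minthreads 2;  set maxthreads 10; set maxconnections 200  }
        large  { set minthreads 10; set maxthreads 50; set maxconnections 1000 }
    }

    ns_section ns/server/${server}
        ns_param minthreads     $minthreads
        ns_param maxthreads     $maxthreads
        ns_param maxconnections $maxconnections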

> Thread pools are used throughout the server: multiple pools of conn
> threads, driver spool threads, scheduled proc threads, job threads,
> etc. so one clean way to tackle this might be to create a new
> nsd/pools.c which implements a very simple generic thread pool which
> has n threads, fifo ordering for requests, a tcl interface for
> dynamically setting the number of threads, and thread recycling after
> n requests. Then try to implement conn threads in terms of it.

I was thinking the exact same thing.
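
For the tcl interface part, I'd picture something like the following - 
purely hypothetical, there is no ns_pool command today, this is just to 
illustrate the shape such a generic nsd/pools.c could expose:

    # Hypothetical interface of a generic thread pool (ns_pool does not exist).
    ns_pool configure conn      -threads 10 -maxrequests 1000 ;# recycle after 1000 jobs
    ns_pool configure schedproc -threads 4
    ns_pool stats conn   ;# e.g. thread count, queued jobs (fifo), jobs processed

If conn threads, scheduled procs, job threads etc. all sat on top of 
that, at least the tuning story would be consistent across them.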

Sorry for the rambling/scattered thoughts, having a long commute does 
that :/

-J
