Ahmed Musa wrote:
.) retries (for LB workers)
-> On Apache we use the prefork MPM. So how big is the connection_pool?
Connection pool is per process. Unless you set a different pool size in
the configuration, the following is true:
For Apache httpd we ask httpd "How many threads do you use for request
handling per process?" and set the maximum pool size to this number. The
pool might be smaller if not all members are needed (depending on
configuration). For prefork MPM there is only one thread per process,
so we set the pool size to 1, because more connections will never be
used (per process).
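For completeness, the prefork behavior described above is equivalent to setting the pool size explicitly in workers.properties; a sketch (the worker name is just an example):

```
# With prefork MPM the computed default is effectively:
worker.example.connection_pool_size=1
# Raising this for prefork only ties up backend threads, since a prefork
# child handles at most one request at a time.
```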
because a retry of an lb worker happens if the load balancer cannot get a free
connection for a member worker from the pool (info from the documentation).
That means: for httpd it should never happen. If it happens, we either
made a mistake in the code, or in our brain, or you are not using httpd,
or you overwrote the pool size setting.
Does it depend on the Apache prefork parameters MaxClients and
MaxRequestsPerChild?
No, MaxClients for prefork is the maximum number of httpd processes
handling requests. For prefork MPM that means you should not observe
more than MaxClients (+1) httpd processes, and each will have a
connection pool of size one, so you should never observe more than
MaxClients AJP13 connections going from this httpd to one of your backends.
MaxRequestsPerChild is totally unrelated. It says: if a single httpd
child already answered this many requests, then stop it (and of course
the connection pool of size 1 belonging to this child will be closed).
Whether a new child gets spawned instead depends on the httpd
spare/idle settings and the load. Why stop an httpd child: think about
a module with a memory leak. Then you want to throw away the httpd
processes every now and then and use new ones instead. It's not that
expensive, if you don't do it every second.
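As a point of reference, the two directives live in the prefork MPM section of httpd.conf; a sketch with hypothetical values:

```
<IfModule mpm_prefork_module>
    StartServers          10
    MinSpareServers       10
    MaxSpareServers       50
    MaxClients           500
    MaxRequestsPerChild 10000
</IfModule>
```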
If it is so - we have MaxClients 500 and MaxRequestsPerChild 10000 => this
means the webserver can send/handle 5,000,000 requests?
No, 500 *parallel* requests.
-> is this the size of our connection_pool? - i don't think so.
Size of pool: prefork => 1.
Number of connections: <= 1 * number of httpd child processes <=
MaxClients = 500
On the other side we have 36 Tomcat instances - each Tomcat has - maxThreads=300
on the AJP connector. => this doesn't fit, or does it?
(And 3 Apache as frontend - all configured the same)
Doesn't fit. For one httpd in front, maxThreads=500 would be fine. For
more than one httpd in front, you don't want to increase the number of
threads in proportion, because most likely once you reach the 1,500
threads (for 3 httpd), something is wrong and your system won't really
be able to handle all those parallel requests (more likely: it wasn't
able to handle fewer requests, got slow, and that's why you now have that
many requests waiting in parallel).
You could:
- use the APR/native AJP connector, which will detach the connection on
the backend side from the thread whenever the connection doesn't
actually have a request on it (waiting for the next request).
- use connectionTimeout on the Tomcat side and a connection pool
timeout on the JK side, so that idle connections get closed down and thus
Tomcat threads are freed.
- use preferences (distances) between the httpd and Tomcat instances
(36/3 = 12, so each httpd gets 12 Tomcats with distance 0, and the other
24 with distance 1). Details depend on your availability concept (how
you want to handle redundancy and so on).
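The distance idea could look like this in workers.properties (worker names taken from the config further down, assignments purely illustrative):

```
# httpd #1 prefers its 12 "local" Tomcats ...
worker.INETP1011.distance=0
# ... and only falls back to the other 24 if those are unusable
worker.INETP1032.distance=1
```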
In the worker MPM I think the number of threads must correspond to the max
threads of the Tomcat - but how does it work in our prefork model?
Number of threads per httpd process = default connection pool size
Total number of connections <= Maximum Number of httpd processes *
Number of threads per httpd process
Relation between Tomcat threads and connections: see above.
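Plugging in the numbers from this thread, a small Python sketch of the bound above (values copied from the configs quoted below):

```python
pool_size = 1        # prefork MPM: one thread, hence pool size 1, per child
max_clients = 500    # MaxClients: maximum number of httpd child processes
frontends = 3        # number of Apache frontends
max_threads = 300    # maxThreads on each Tomcat AJP connector

per_httpd = max_clients * pool_size   # connections one httpd can open
worst_case = per_httpd * frontends    # if all frontends hit one Tomcat

print(per_httpd)   # 500
print(worst_case)  # 1500 > maxThreads=300, which is why it "doesn't fit"
```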
.) Why does a load balancer retry to get a free connection for a member
worker from the pool? Why doesn't it use another member worker?
Why LB retries: because we also support IIS and NSAPI, where we can't
determine the number of threads used per web server process with an
official web server API.
Why doesn't it use another member worker: it does, we call it failover.
The situations are different:
retry: we can't get a free connection. Should *never* happen on httpd
failover: we ran into a communication problem with the backend (network
problem, no response etc.)
.) reply_timeout - does it only work between the request and the first response
packet, or between each two response packets? Is a response packet an AJP packet
with 8k default size?
Last question: yes, here a response packet is one from the JK point of
view (AJP packet). The size could be smaller though, e.g. if you flush()
in a servlet.
And: yes, the reply_timeout is checked between sending the request to
the backend and the first response packet, but also between subsequent
response packets. The overall time all packets take does not matter.
.) what is the socket_timeout good for ?
Socket timeouts are general timeouts when waiting for any kind of socket
event. The bad thing is that on most OSes, once a socket timeout fires
on a socket, you are not allowed to use the same socket any more.
We try to separate the socket_timeout from the
prepost/connect/connection_pool(_timeout) by using select before reading
from a socket. So those three timeouts should work totally independently
of the socket_timeout.
We configured a connection_timeout, a prepost_timeout and a reply_timeout => I
can't find a situation where I need an additional socket_timeout?
When trying to establish a connection. Our connect_timeout gets used
*after* we connected. Then we send a cping/cpong packet and check if we
get an answer faster than connect_timeout.
The TCP 3-way handshake during the TCP connect (SYN, SYN/ACK, ACK) is
done without any timeout if you don't have a socket_timeout. Depending
on your OS settings and your network situation/problem, this can hang
for a couple of minutes (usually it does not, but you can construct
such situations).
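So if you want a guard for that case, a socket_timeout can be set in addition; a sketch (the value is arbitrary; per the mod_jk docs the unit is seconds):

```
# Global guard for any socket operation, including the TCP connect itself
worker.template.socket_timeout=300
```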
And when I want to know what happens in my system - I think I need a more
"high-level" failure message to evaluate the situation - but on socket level?
JK basically detects network problems (and for those the additional info
level messages provided for any error level message should make pretty
clear what's happening) and timing problems. For the timing problems (if
you get them), you should switch on time measurement in your access logs
(if you've got big problems, then on both the Apache httpd and Tomcat
sides) so you get an idea which requests produce trouble, how often,
which time, which client etc.
If you have lots of long-running requests, you can take a couple of
thread dumps every now and then and you'll very likely capture some of
those, so you can see what they are actually doing (or waiting for).
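Time measurement can be switched on with the %D pattern code on both sides; a sketch (format name and file locations are hypothetical; note httpd logs %D in microseconds, Tomcat's AccessLogValve in milliseconds):

```
# httpd.conf
LogFormat "%h %l %u %t \"%r\" %>s %b %D" timed
CustomLog logs/access_log timed

# server.xml (inside the Host element)
# <Valve className="org.apache.catalina.valves.AccessLogValve"
#        directory="logs" prefix="swl_access_log." suffix=".txt"
#        pattern="%h %l %u %t &quot;%r&quot; %s %b %D" resolveHosts="false" />
```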
.) this question concerns the mod_jk option "retries" (for "normal" workers)
(hint - better to find another name - the same name for two different things
causes problems when writing about it) in association with the recovery_options.
The same name is there (unfortunately) for historical reasons. Since we
don't want to break existing configs during the 1.2.x cycle, we can't
change the config names. The two different retries need to get renamed
once we start on JK3.
=> when I use the value 7 for recovery_options - bits 1+2+4 => I think a
retry is only possible if the connection timeout matches -
not on the prepost_timeout and not in the situation of reply_timeout => is
this right?
Retry with connect_timeout *and* prepost_timeout. A prepost_timeout
lets us send a test packet before each request. If Tomcat doesn't
immediately answer the test packet, we deduce that it can't handle a
real request and do a failover. Since we didn't send any real request
data during the test packet cycle, we are free to use another member.
There is no risk of the request being handled twice in this situation.
Another question on the same topic: I have a long-running sticky session - this
means that in this session there are many requests against the same Tomcat.
Will a new connection be established for each request, or will the
established connection be used for all requests?
Connections are not related to sessions. If we detect that the request
carries a session (jsessionid in the URL or a JSESSIONID cookie) and the
session ends with a routing name that corresponds to an lb worker name
or an lb worker route, we know to which member we should send the request.
If we wanted to send it over the same connection, that the session used
last time, we would need to be able to let the request run on the same
httpd process the previous request of the same session used. But the
dispatch on the httpd processes is not done by JK and is already over.
But: there is no need to use the same connection. We only need to send
it to the same Tomcat.
Reuse of connections (no relation to stickyness): if a request should
get sent to some Tomcat, and the process handling the request already
has an idle connection in the pool of the lb member used, it will reuse
this connection.
A connection is idle, if there is no request handled by it at the moment
(but it is connected).
Idle connections can get closed, depending on the connection pool
timeout (and connectionTimeout on the Tomcat side).
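These two timeouts are typically paired, with the Tomcat side slightly larger so that JK closes the connection first; a sketch with hypothetical values (connection_pool_timeout is in seconds, connectionTimeout in milliseconds):

```
# workers.properties: close idle backend connections after 10 minutes
worker.template.connection_pool_timeout=600

# server.xml: let Tomcat drop the connection a bit later (ms)
# <Connector port="65001" maxThreads="300" protocol="AJP/1.3"
#            connectionTimeout="660000" />
```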
If the second - that means the established connection is used for all requests of the
session => then a retry will not happen if during
the session the Tomcat causes problems (with recovery_options 7) - is this
right?
No, because the first "If" is wrong.
Version mod_jk 1.2.26 (upgraded recently)
Do your real world observations make a good fit with my explanations?
Regards,
Rainer
Here my worker.properties
worker.list=ajp_bam,ajp_ggi,ajp_ad,ajp_svp,.......,jkstatus
worker.template.type=ajp13
worker.template.lbfactor=5
worker.template.socket_keepalive=1
worker.template.connect_timeout=7000
worker.template.prepost_timeout=5000
worker.template.reply_timeout=180000
worker.template.retries=20
worker.template.activation=Active
worker.template.recovery_options=7
worker.lbtemplate.type=lb
worker.lbtemplate.max_reply_timeouts=6
worker.lbtemplate.method=Session
#Produktions Worker
# AS-INETP101 - 106 - 6/6 GGI
worker.INETP1011.host=AS-INETP101.AEAT.ALLIANZ.AT
worker.INETP1011.port=65001
worker.INETP1011.reference=worker.template
....many more of the same
then
worker.ajp_ad.reference=worker.lbtemplate
worker.ajp_ad.balance_workers=INETP1032,INETP1062
.... many more portals
and finally jkstatus
The JKMount is very simple
JkMount /* ajp_ad --- for the other portals mostly the same
The Portals are Virtual Hosts on the Apache.
Tomcat - server.xml
example
<Connector port="65001" maxThreads="300" protocol="AJP/1.3" />
<Engine name="Catalina" jvmRoute="INETP5021"
defaultHost="default">
......
<Host name="slfinsol.com" appBase="webapps" unpackWARs="true"
autoDeploy="false" deployOnStartup="false" xmlValidation="false"
xmlNamespaceAware="false">
<Alias>www.slfinsol.com</Alias>
<Alias>web1.slfinsol.com</Alias>
...
<Alias>testweb.slfinsol.com</Alias>
.....
<Valve className="org.apache.catalina.valves.AccessLogValve"
directory="logs" prefix="swl_access_log." suffix=".txt"
pattern="common" resolveHosts="false" />
<Valve className="at.allianz.tomcat.valve.RequestTimeValve"/>
<Valve
className="at.allianz.tomcat.valve.WebcollaborationWorkaroundValve"/>
<Context path="" docBase="swl" />
<Context path="/monitor5" docBase="monitor" />
<Context path="/swl" docBase="swl" />
</Host>
Thanks for your time reading this and maybe giving tips -
with kind regards
Ahmed Musa
---------------------------------------------------------------------
To start a new topic, e-mail: users@tomcat.apache.org
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]