Hi Tim,
[EMAIL PROTECTED] wrote:
Hi Rainer,
Thanks for the response. To cover a few points you made -
- Yes, I had a hunch long running requests are a problem; because of
our application design, some pages invoked for the first time take a
while (we can't cache them all!). Is there an easy way to correlate
(apart from timestamp) the errors in the isapi log and the requests made
to IIS? I mean can I get the isapi log to show the URL being processed?
I'm afraid the answer is most likely "no". You can file an enhancement
request in our Bugzilla. That way the feature might materialize one day.
For Apache httpd, correlation between errors and log messages is a little
better. Even though we don't include the URL, we include the PID and
thread ID, and both can be logged in the httpd access log. Having
time, process ID and thread ID usually makes it possible to correlate
successfully, although it is still some work and not perfect.
- I've now got the %D option in place on Tomcat to figure out from
tomorrow which are the heavy pages
- Yes, thread dumps on JDK 1.3 & TC4.1.x are tricky - I'm looking at
the Tomcat JavaWrapper approach as a way forward. The version of the
3rd party product we have in place is only supported on jdk1.3 (this
is a pretty ancient set up!)
- I agree that my incident traffic load is not huge, and should be
supportable by the environment in place.
- I'll try a load balancer worker to see if that tells me more info
At least it does tell you the number of errors a worker had, and also
the number of client errors. That way you can quickly check whether
there are more errors than client errors, and by polling the values you
can find out how often and at which times things are happening. These
are only counters though, with no per-incident information.
The page can be configured to return machine-readable content. Have a
look at:
http://tomcat.apache.org/connectors-doc/reference/status.html
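To illustrate the polling idea, here is a minimal Python sketch that compares two snapshots of a worker's counters. It does not fetch or parse the status page; the dictionary keys `errors` and `client_errors` are hypothetical names chosen for this example and would need to be mapped from whatever the machine-readable output actually contains. Treating a counter that went down as a restart is also an assumption.

```python
def counter_delta(previous, current):
    """Increase of one counter between two polls.

    Assumption: if the value went down, the counters were reset
    (for example by a restart), so the current value is the increase.
    """
    return current if current < previous else current - previous


def error_deltas(previous, current):
    """Compare two polled snapshots of one worker.

    Both arguments are dicts with the hypothetical keys
    'errors' and 'client_errors'.
    """
    new_errors = counter_delta(previous["errors"], current["errors"])
    new_client = counter_delta(previous["client_errors"],
                               current["client_errors"])
    return {
        "new_errors": new_errors,
        "new_client_errors": new_client,
        # The quick check described above: more errors than client errors?
        "more_errors_than_client_errors": new_errors > new_client,
    }


# Two made-up snapshots, taken one polling interval apart.
before = {"errors": 10, "client_errors": 40}
after = {"errors": 13, "client_errors": 42}
print(error_deltas(before, after))
```

Run from a scheduled job and written to a file with a timestamp, the deltas would show during which times of day the errors cluster.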
If possible I'll have some more information in a day or so from
this...
cheers
Tim
Regards,
Rainer
________________________________
From: Rainer Jung [mailto:[EMAIL PROTECTED]
Sent: Mon 10/12/2007 12:11
To: Tomcat Users List
Subject: Re: ISAPI JK2 ran better than JK, how can that be?
Hi Tim,
[EMAIL PROTECTED] wrote:
OK,
So our website keeps crashing over the past couple of weeks (usual
story on this list eh?)
Not really (although a users list is always focused on problems and
not on the working side of things ...)
We've been running JK isapi plugin v1.2.15 for a fair while, but
the isapi redirector log always contains huge numbers of errors
being thrown (see snippet below). We were getting a complete
failure of IIS to serve traffic, solved only by a restart of IIS
and Tomcat.
Very recently we moved up to v1.2.25 in the hope of improving
performance but it seems to have little effect - we're still getting
high numbers of 503 responses sent back (maybe 5+ per minute). We
do however now serve static resources from IIS to reduce the use of
the ISAPI calls where possible.
The error number 2746 is hexadecimal; in decimal it is 10054, which is
the Winsock error "connection reset by peer". The peer in this case is
your IIS client (browser etc.).
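The conversion is easy to check. A small Python sketch, assuming the codes in the isapi log are zero-padded hexadecimal as in the snippet from the original post; the lookup table only covers the two codes that appear in this thread (10053 and 10054 are the standard Winsock values for WSAECONNABORTED and WSAECONNRESET):

```python
# Standard Winsock error numbers for the two codes seen in the isapi log.
WINSOCK_ERRORS = {
    10053: "WSAECONNABORTED (software caused connection abort)",
    10054: "WSAECONNRESET (connection reset by peer)",
}


def decode_jk_error(hex_code):
    """Turn a code such as '00002746' from the isapi log into readable text."""
    number = int(hex_code, 16)
    name = WINSOCK_ERRORS.get(number, "not in this table")
    return f"0x{number:X} = {number}: {name}"


for code in ("00002745", "00002746"):
    print(decode_jk_error(code))
```

Both codes describe a connection that the other side has already given up on, which fits the explanation of clients abandoning long running requests.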
Often this is caused by long running requests, where the users press
the retry/again button. Then the browser immediately closes the
connection and uses a new one for the same request. When you are
sending back the response later, the closed connection gets detected
and logged.
You should configure logging of response durations to find out whether
you've got a problem with long running requests. You can do that using
the JK request logging, or with the Tomcat access log (add format %D
to your pattern, which is the duration in milliseconds).
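Once durations are being logged, finding the heavy pages is a matter of sorting. The following Python sketch assumes the access log pattern is the common format with %D appended as the last field (`%h %l %u %t "%r" %s %b %D`); that pattern and the sample lines are assumptions made for illustration, so the regular expression would need adjusting to the pattern actually configured.

```python
import re
from collections import defaultdict

# Assumed pattern: %h %l %u %t "%r" %s %b %D  (duration is the last field)
LINE_RE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<url>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ (?P<millis>\d+)$'
)


def slowest_urls(lines, top=5):
    """Return (url, worst_ms, hits) tuples, slowest URL first."""
    worst = defaultdict(int)
    hits = defaultdict(int)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip lines that do not fit the assumed pattern
        url = match.group("url").split("?", 1)[0]  # group by path only
        millis = int(match.group("millis"))
        worst[url] = max(worst[url], millis)
        hits[url] += 1
    ranked = sorted(worst, key=lambda u: worst[u], reverse=True)
    return [(u, worst[u], hits[u]) for u in ranked[:top]]


# Made-up sample lines, only to show the expected shape of the input.
sample = [
    '10.0.0.1 - - [07/Dec/2007:03:35:16 +0000] "GET /cms/page.jsp?id=1 HTTP/1.1" 200 5120 48211',
    '10.0.0.2 - - [07/Dec/2007:03:35:17 +0000] "GET /cms/home.jsp HTTP/1.1" 200 2048 130',
    '10.0.0.1 - - [07/Dec/2007:03:35:18 +0000] "GET /cms/page.jsp?id=2 HTTP/1.1" 200 5120 950',
]
for url, millis, count in slowest_urls(sample):
    print(f"{millis:>8} ms  {count:>4} hits  {url}")
```

Ranking by the worst duration rather than the average is a deliberate choice here: a page that is only slow on its first invocation would disappear in an average.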
Usually this does *not* mean, that restarting IIS or Tomcat will
help.
Concerning Tomcat: you should do a couple of thread dumps before
restarting it. That way you can find out, if lots of requests got
stuck inside the container, and if so, what they are actually doing
or waiting for.
Concerning IIS: does "netstat -an" look fine, once you think you need
to restart?
But here's the kicker: earlier this year we were still using a
JK2 isapi_redirector2.dll, and that seemed to be serving
comparable traffic rates with fewer errors (certainly no complete
failures). No hard data to support this yet, just my recollection
of serious outages over the past couple of years.
I think we should make a distinction between the number of log
messages (here we simply might be more detailed with JK) and serious
problems, like the container no longer responding, or responding too
slowly.
AWStats on our log files suggests our incident traffic is ~7
million pages per month, peaking at lunchtime & early evening at
perhaps 3-5 reqs/sec.
That's not a lot of traffic. What are average response times? Is it
usual webapp load, or very special use cases, like long running
uploads or downloads?
Scaling to multiple tomcats is not an option right now due to 3rd
party license costs in the webapp (it's a CMS system).
The request numbers do not seem to make horizontal scaling an option
that you need to consider yet (unless request handling is very CPU
intensive, or you need a lot of memory, or ...).
Our environment: Java 1.3.1 Tomcat 4.1.18 IIS v5 IIS & Tomcat are
co-located on same server (4GB RAM, win2k o/s)
Ooops. I'm not really sure about the behaviour of 1.3 for thread
dumps. It's fine for 1.4.2, but you should test in a staging or dev
system what happens with 1.3. Consider updating to 4.1.36 and if
possible 1.4.2_some_recent_patch_level.
Questions:
- Are there obvious worker directives that would help the issue
further?
- In the list archives I've seen conflicting views on what to set
connectionTimeout to be in the tomcat and worker config. Some say 0,
some say 600 secs. Which tends to be more useful?
- All of the 00002745 errors - do they indicate a network problem
upstream of the server?
- When viewing the jkstatus page, the worker only shows type, host,
address. I was expecting further data as listed in the legend. Am I
missing something?
2746: see above. I would not expect any worker setting to help in
case the root cause really is long running requests. Then you would
really have to log request duration and do a couple of thread dumps,
to find out which requests are running too long and for which reason.
jkstatus: add a load balancer worker to your ajp13 worker and use the
load balancer as the worker you map. The load balancer does a lot of
statistics and shows all the detailed information in jkstatus.
Because of its manageability a load balancer is interesting, even if
you have only one backend.
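As a rough sketch of what that could look like, the Python snippet below holds an example workers.properties as a string and checks that the load balancer is wired up consistently. The worker names `lb` and `jkstatus` are assumptions chosen for this example; `website`, its host and its port come from the configuration quoted in this thread. The URI mapping would also have to point at `lb` instead of `website`, which is not shown here.

```python
# Example workers.properties; "lb" and "jkstatus" are assumed names.
WORKERS_PROPERTIES = """\
worker.list=lb,jkstatus

worker.website.type=ajp13
worker.website.host=localhost
worker.website.port=8009

worker.lb.type=lb
worker.lb.balance_workers=website

worker.jkstatus.type=status
"""


def parse_properties(text):
    """Parse simple key=value lines, ignoring blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            key, _, value = line.partition("=")
            props[key.strip()] = value.strip()
    return props


def check_lb_setup(props, lb_name):
    """Return a list of problems found in the load balancer wiring."""
    problems = []
    if lb_name not in props.get("worker.list", "").split(","):
        problems.append(f"{lb_name} is not in worker.list")
    if props.get(f"worker.{lb_name}.type") != "lb":
        problems.append(f"{lb_name} is not of type lb")
    members = props.get(f"worker.{lb_name}.balance_workers", "").split(",")
    for member in members:
        if f"worker.{member}.type" not in props:
            problems.append(f"balanced worker {member!r} is not defined")
    return problems


print(check_lb_setup(parse_properties(WORKERS_PROPERTIES), "lb"))
```

Note that `website` is no longer in `worker.list` in this sketch: only the load balancer and the status worker are mapped directly, and the ajp13 worker is reached through the balancer.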
isapi log:
[Fri Dec 07 03:35:16 2007] [error] jk_isapi_plugin.c (639): WriteClient failed with 00002745
[Fri Dec 07 03:35:16 2007] [info] jk_ajp_common.c (1384): Connection aborted or network problems
[Fri Dec 07 03:35:16 2007] [info] jk_ajp_common.c (1731): Receiving from tomcat failed, because of client error without recovery in send loop 0
[Fri Dec 07 03:35:17 2007] [error] jk_isapi_plugin.c (639): WriteClient failed with 00002746
[Fri Dec 07 03:35:17 2007] [error] jk_isapi_plugin.c (639): WriteClient failed with 00002746
[Fri Dec 07 03:35:17 2007] [info] jk_ajp_common.c (1384): Connection aborted or network problems
[Fri Dec 07 03:35:17 2007] [info] jk_ajp_common.c (1384): Connection aborted or network problems
[Fri Dec 07 03:35:17 2007] [info] jk_ajp_common.c (1731): Receiving from tomcat failed, because of client error without recovery in send loop 0
[Fri Dec 07 03:35:17 2007] [error] jk_isapi_plugin.c (639): WriteClient failed with 00002746
[Fri Dec 07 03:35:17 2007] [info] jk_ajp_common.c (1731): Receiving from tomcat failed, because of client error without recovery in send loop 0
[Fri Dec 07 03:35:17 2007] [error] jk_isapi_plugin.c (639): WriteClient failed with 00002746
[Fri Dec 07 03:35:17 2007] [info] jk_ajp_common.c (1384): Connection aborted or network problems
[Fri Dec 07 03:35:17 2007] [info] jk_ajp_common.c (1384): Connection aborted or network problems
[Fri Dec 07 03:35:17 2007] [info] jk_ajp_common.c (1731): Receiving from tomcat failed, because of client error without recovery in send loop 0
[Fri Dec 07 03:35:17 2007] [info] jk_ajp_common.c (1731): Receiving from tomcat failed, because of client error without recovery in send loop 0
[Fri Dec 07 03:35:17 2007] [error] jk_isapi_plugin.c (639): WriteClient failed with 00002746
[Fri Dec 07 03:35:17 2007] [error] jk_isapi_plugin.c (639): WriteClient failed with 00002746
[Fri Dec 07 03:35:17 2007] [info] jk_ajp_common.c (1384): Connection aborted or network problems
Our Tomcat connector config -
<Connector className="org.apache.coyote.tomcat4.CoyoteConnector"
           redirectPort="8443" bufferSize="2048" port="8009"
           connectionTimeout="300000" scheme="http" enableLookups="false"
           secure="false"
           protocolHandlerClassName="org.apache.jk.server.JkCoyoteHandler"
           debug="0" disableUploadTimeout="false" proxyPort="0"
           maxProcessors="200" minProcessors="2" tcpNoDelay="true"
           acceptCount="20" useURIValidationHack="false">
  <Factory className="org.apache.catalina.net.DefaultServerSocketFactory"/>
</Connector>
worker.properties -
worker.website.type=ajp13
worker.website.host=localhost
worker.website.port=8009
# 200 concurrent users
worker.website.connection_pool_size=200
worker.website.connection_pool_timeout=300
Regards,
Rainer