Hi Tim,

[EMAIL PROTECTED] wrote:
Hi  Rainer,

Thanks for the response. To cover a few points you made -

- Yes, I had a hunch long running requests are a problem; because of
our application design, some pages invoked for the first time take a
while (we can't cache them all!).  Is there an easy way to correlate
(apart from timestamp) the errors in the isapi and the requests made
to IIS ?  I mean can I get isapi log to show the URL being processed?

I'm afraid the answer is most likely "no". You can file an enhancement request in our Bugzilla. That way the feature might materialize one day.

For Apache httpd, correlating error log messages with requests is a little easier. Even though we don't include the URL, we include the PID and thread ID, and both can be logged in the httpd access log. So having time, process ID and thread ID usually makes a successful correlation possible, although it is still some work and not perfect.

- I've now got the %D option in place on Tomcat to figure out from
tomorrow which are the heavy pages
- Yes, thread dumps on JDK 1.3 & TC4.1.x are tricky - I'm looking at
the Tomcat JavaWrapper approach as a way forward. The version of the
3rd party product we have in place is only supported on jdk1.3 (this
is a pretty ancient set up!)
- I agree that my incident traffic load is not huge, and should be
supportable by the environment in place.
- I'll try a load balancer worker to see if that tells me more info

At least it tells you the number of errors a worker had, and also the number of client errors. That way you can quickly check whether there are more errors than client errors, and by polling the values you can find out how often and at which times things are happening. These are only counters though, with no per-incident information.

The page can be configured to return machine-readable content. Have a look at:

http://tomcat.apache.org/connectors-doc/reference/status.html
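To illustrate the polling idea, here is a minimal Python sketch. It assumes you have already extracted the two counters from the status output yourself and stored them as (timestamp, errors, client_errors) tuples; the function name and the tuple layout are made up for this example and are not part of mod_jk.

```python
def error_deltas(samples):
    """Turn polled counter values into per-interval increases.

    samples: list of (timestamp, errors, client_errors) tuples in polling
    order (hypothetical layout, filled from whatever you scrape).
    Returns a list of (timestamp, new_errors, new_client_errors).
    """
    deltas = []
    for prev, cur in zip(samples, samples[1:]):
        d_errors = cur[1] - prev[1]
        d_client = cur[2] - prev[2]
        if d_errors < 0 or d_client < 0:
            # Counters went backwards, e.g. after a restart or a reset.
            continue
        deltas.append((cur[0], d_errors, d_client))
    return deltas


if __name__ == "__main__":
    polled = [("12:00", 10, 8), ("12:05", 10, 8), ("12:10", 14, 9)]
    for row in error_deltas(polled):
        print(row)
```

Intervals in which the error counter grows faster than the client error counter are probably the ones worth a closer look.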

If possible I'll have some more information in a day or so from
this...


cheers


Tim

Regards,

Rainer


________________________________

From: Rainer Jung [mailto:[EMAIL PROTECTED]]
Sent: Mon 10/12/2007 12:11
To: Tomcat Users List
Subject: Re: ISAPI JK2 ran better than JK, how can that be?



Hi Tim,

[EMAIL PROTECTED] wrote:
OK,

So our website keeps crashing over the past couple of weeks (usual story on this list eh?)

Not really (although a users list is always focused on problems and
not on the working side of things ...)

We've been running JK isapi plugin v1.2.15 for a fair while, but
the isapi redirector log always contains huge numbers of errors
being thrown (see snippet below). We were getting a complete
failure of IIS to serve traffic, solved only by a restart of IIS
and Tomcat.

Very recently we moved up to v1.2.25 in the hope of improving
performance but it seems to have little effect - we're still getting
high numbers of 503 responses sent back (maybe 5+ per minute). We
do however now serve static resources from IIS to reduce the use of
the ISAPI calls where possible.

The error number 2746 is hex for 10054, which is the Winsock error
"connection reset by peer". The peer in this case is your IIS client
(browser etc.).
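For reference, the conversion can be checked with a few lines of Python. This is only a sketch: the helper name is made up, and the table covers just the two codes that appear in the log snippet (10053 and 10054 are the standard Winsock values for WSAECONNABORTED and WSAECONNRESET).

```python
# Hypothetical helper: decode the hex code that the ISAPI redirector logs
# after "WriteClient failed with" into a decimal Winsock error number.
WINSOCK_NAMES = {
    10053: "WSAECONNABORTED (software caused connection abort)",
    10054: "WSAECONNRESET (connection reset by peer)",
}


def decode_jk_error(hex_code):
    """Convert a zero-padded hex string such as '00002746' to (decimal, name)."""
    value = int(hex_code, 16)
    return value, WINSOCK_NAMES.get(value, "unknown")


if __name__ == "__main__":
    for code in ("00002745", "00002746"):
        print(code, "->", decode_jk_error(code))
```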

Often this is caused by long running requests, where the users press
the retry/again button. Then the browser immediately closes the
connection and uses a new one for the same request. When you are
sending back the response later, the closed connection gets detected
and logged.

You should configure logging of response durations to find out
whether you've got a problem with long running requests. You can do
that using the JK request logging, or with the Tomcat access log (add
format %D to your pattern, which is the duration in milliseconds).
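Once %D is in the access log, a small script can rank the slow pages. The sketch below assumes the pattern is %h %l %u %t "%r" %s %b %D, so that the duration is the last field on each line; if your pattern differs, the regular expression needs adjusting. The function name is made up for this example.

```python
import re
from collections import defaultdict

# Assumed access log pattern: %h %l %u %t "%r" %s %b %D
# i.e. the request line is quoted and the duration (ms) is the last field.
LINE_RE = re.compile(
    r'"(?P<method>\S+) (?P<url>\S+)[^"]*" \d{3} \S+ (?P<ms>\d+)$'
)


def slowest_urls(lines, top=5):
    """Return [(url, max_ms, hits)] sorted by worst-case duration."""
    stats = defaultdict(lambda: [0, 0])  # url -> [max_ms, hits]
    for line in lines:
        match = LINE_RE.search(line.strip())
        if not match:
            continue  # skip lines that do not fit the assumed pattern
        entry = stats[match.group("url")]
        entry[0] = max(entry[0], int(match.group("ms")))
        entry[1] += 1
    ranked = sorted(
        ((url, s[0], s[1]) for url, s in stats.items()),
        key=lambda item: item[1],
        reverse=True,
    )
    return ranked[:top]


if __name__ == "__main__":
    # Made-up sample lines, only to show the expected shape.
    sample = [
        '127.0.0.1 - - [07/Dec/2007:03:35:16 +0000] "GET /cms/page.jsp?id=1 HTTP/1.1" 200 5120 48211',
        '127.0.0.1 - - [07/Dec/2007:03:35:17 +0000] "GET /index.jsp HTTP/1.1" 200 900 35',
    ]
    for row in slowest_urls(sample):
        print(row)
```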

Usually this does *not* mean that restarting IIS or Tomcat will
help.

Concerning Tomcat: you should do a couple of thread dumps before
restarting it. That way you can find out whether lots of requests got
stuck inside the container, and if so, what they are actually doing
or waiting for.

Concerning IIS: does "netstat -an" look fine, once you think you need
to restart?
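If the netstat output is long, a tally of connection states for the AJP port makes unusual pile-ups easier to spot. The following is a rough Python sketch that assumes the Windows "netstat -an" column layout (protocol, local address, foreign address, state) and uses port 8009 from the connector config in this thread; the function name is made up.

```python
from collections import Counter


def count_states(netstat_output, port=8009):
    """Count TCP connection states for connections touching `port`.

    Assumes the Windows layout: TCP <local addr> <foreign addr> <state>.
    """
    counts = Counter()
    suffix = ":%d" % port
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) != 4 or fields[0].upper() != "TCP":
            continue  # headers, UDP lines, or a different layout
        local, remote, state = fields[1], fields[2], fields[3]
        if local.endswith(suffix) or remote.endswith(suffix):
            counts[state] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample output, only to show the expected shape.
    sample = """
      TCP    0.0.0.0:8009           0.0.0.0:0              LISTENING
      TCP    127.0.0.1:8009         127.0.0.1:1201         ESTABLISHED
      TCP    127.0.0.1:1201         127.0.0.1:8009         ESTABLISHED
    """
    print(count_states(sample))
```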

But here's the kicker: - previously this year we were still using a
 JK2 isapi_redirector2.dll, and that seemed to be serving
comparable traffic rates with fewer errors (certainly no complete
failures). No hard data to support this yet, just my recollection
of serious outages over the past couple of years.

I think we should make a distinction between the number of log
messages (here we might simply be more detailed with JK) and serious
problems, like the container no longer responding, or responding too
slowly.

AWStats on our log files suggests our incident traffic is ~7
million pages per month, peaking at lunchtime & early evening at
perhaps 3-5 reqs/sec.

That's not a lot of traffic. What are average response times? Is it usual webapp load, or very special use cases, like long running
uploads or downloads?

Scaling to multiple tomcats is not an option right now due to 3rd party license costs in the webapp (it's a CMS system).

The request numbers do not suggest that scaling horizontally is an option you need to consider yet (unless request handling is very CPU intensive, or you need a lot of memory, or ...).

Our environment:
- Java 1.3.1
- Tomcat 4.1.18
- IIS v5
- IIS & Tomcat are co-located on same server (4GB RAM, win2k o/s)

Ooops. I'm not really sure about the behaviour of 1.3 for thread
dumps. It's fine for 1.4.2, but you should test in a staging or dev
system what happens with 1.3. Consider updating to 4.1.36 and if
possible 1.4.2_some_recent_patch_level.

Questions:

- Are there obvious worker directives that would help the issue
further ?
- In the list archives I've seen conflicting views on what to set
connectionTimeout to be in the tomcat and worker config. Some say 0,
some say 600 secs. Which tends to be more useful?
- All of the 00002745 errors - do they indicate a network problem
upstream of the server?
- When viewing the jkstatus page, the worker only shows type, host,
address. I was expecting further data as listed in the legend. Am I
missing something?

2746: see above. I would not expect any worker setting to help if
the root cause really is long running requests. Then you would
really have to log request duration and do a couple of thread dumps
to find out which requests are running too long, and for which reason.

jkstatus: add a load balancer worker to your ajp13 worker and use the
load balancer as the worker you map. The load balancer keeps a lot of
statistics and shows all the detailed information in jkstatus.
Because of its manageability a load balancer is interesting, even if
you have only one backend.

isapi log:

[Fri Dec 07 03:35:16 2007] [error] jk_isapi_plugin.c (639): WriteClient failed with 00002745
[Fri Dec 07 03:35:16 2007] [info]  jk_ajp_common.c (1384): Connection aborted or network problems
[Fri Dec 07 03:35:16 2007] [info]  jk_ajp_common.c (1731): Receiving from tomcat failed, because of client error without recovery in send loop 0
[Fri Dec 07 03:35:17 2007] [error] jk_isapi_plugin.c (639): WriteClient failed with 00002746
[Fri Dec 07 03:35:17 2007] [error] jk_isapi_plugin.c (639): WriteClient failed with 00002746
[Fri Dec 07 03:35:17 2007] [info]  jk_ajp_common.c (1384): Connection aborted or network problems
[Fri Dec 07 03:35:17 2007] [info]  jk_ajp_common.c (1384): Connection aborted or network problems
[Fri Dec 07 03:35:17 2007] [info]  jk_ajp_common.c (1731): Receiving from tomcat failed, because of client error without recovery in send loop 0
[Fri Dec 07 03:35:17 2007] [error] jk_isapi_plugin.c (639): WriteClient failed with 00002746
[Fri Dec 07 03:35:17 2007] [info]  jk_ajp_common.c (1731): Receiving from tomcat failed, because of client error without recovery in send loop 0
[Fri Dec 07 03:35:17 2007] [error] jk_isapi_plugin.c (639): WriteClient failed with 00002746
[Fri Dec 07 03:35:17 2007] [info]  jk_ajp_common.c (1384): Connection aborted or network problems
[Fri Dec 07 03:35:17 2007] [info]  jk_ajp_common.c (1384): Connection aborted or network problems
[Fri Dec 07 03:35:17 2007] [info]  jk_ajp_common.c (1731): Receiving from tomcat failed, because of client error without recovery in send loop 0
[Fri Dec 07 03:35:17 2007] [info]  jk_ajp_common.c (1731): Receiving from tomcat failed, because of client error without recovery in send loop 0
[Fri Dec 07 03:35:17 2007] [error] jk_isapi_plugin.c (639): WriteClient failed with 00002746
[Fri Dec 07 03:35:17 2007] [error] jk_isapi_plugin.c (639): WriteClient failed with 00002746
[Fri Dec 07 03:35:17 2007] [info]  jk_ajp_common.c (1384): Connection aborted or network problems
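Since the timestamp is the only key the isapi log offers for correlation, it can help to tally the WriteClient failures per second and error code, and then look up those seconds in the IIS log. A minimal Python sketch, assuming the log file has one entry per line in the format shown above (the function name is made up):

```python
import re
from collections import Counter

# One log entry: [timestamp] [level] source.c (line): message
ENTRY_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] \[(?P<level>\w+)\]\s+"
    r"(?P<src>\S+) \((?P<line>\d+)\): (?P<msg>.*)$"
)
WRITE_RE = re.compile(r"WriteClient failed with ([0-9A-Fa-f]+)")


def write_failures_per_second(lines):
    """Count WriteClient failures keyed by (timestamp, decimal error code)."""
    counts = Counter()
    for line in lines:
        entry = ENTRY_RE.match(line.strip())
        if not entry:
            continue  # not a complete log entry
        failure = WRITE_RE.search(entry.group("msg"))
        if failure:
            code = int(failure.group(1), 16)  # the log prints the code in hex
            counts[(entry.group("ts"), code)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "[Fri Dec 07 03:35:16 2007] [error] jk_isapi_plugin.c (639): WriteClient failed with 00002745",
        "[Fri Dec 07 03:35:17 2007] [error] jk_isapi_plugin.c (639): WriteClient failed with 00002746",
        "[Fri Dec 07 03:35:17 2007] [info]  jk_ajp_common.c (1384): Connection aborted or network problems",
    ]
    for key, count in sorted(write_failures_per_second(sample).items()):
        print(key, count)
```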

Our Tomcat connector config -

<Connector className="org.apache.coyote.tomcat4.CoyoteConnector"
           redirectPort="8443" bufferSize="2048" port="8009"
           connectionTimeout="300000" scheme="http"
           enableLookups="false" secure="false"
           protocolHandlerClassName="org.apache.jk.server.JkCoyoteHandler"
           debug="0" disableUploadTimeout="false" proxyPort="0"
           maxProcessors="200" minProcessors="2" tcpNoDelay="true"
           acceptCount="20" useURIValidationHack="false">
  <Factory className="org.apache.catalina.net.DefaultServerSocketFactory"/>
</Connector>

worker.properties -

worker.website.type=ajp13
worker.website.host=localhost
worker.website.port=8009
# 200 concurrent users
worker.website.connection_pool_size=200
worker.website.connection_pool_timeout=300

Regards,

Rainer

