Hi,

We have a very strange problem that started happening a few weeks ago
(after several years of running pretty much the same configuration).

1. The problem is only happening in the production environment.  We
cannot reproduce it on staging, which as far as we can tell is
configured identically to production.
2. The problem seems to be tied to the traffic volume and possibly the
traffic pattern (which is probably why we cannot reproduce it on staging).
3. Our configuration:  W2K server running IIS 6 and JK 1.2.27, Tomcat
v. 5.5.12 running on the same box.

The problem is as follows: some requests result in multiple
copies of the first buffer-full of the expected data, followed by
503 error page data, 502 error page data, or the rest of the proper
page.  The number of copies is directly related to JK's retries
setting.  With the number of retries initially set to
10, the maximum number of repeated copies in the response was 20.
When we set the number of retries to 2, invalid replies contain only a
single buffer-full of the page's proper data followed by an error page
data.  On the Tomcat side, these multiple copies show up as multiple
request entries in the access log, while there is only one
corresponding request entry in the IIS log.
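
For anyone who wants to check a captured response for this symptom, here
is a minimal sketch in Python.  It assumes the repeated copies sit
back-to-back at the very start of the body and that the buffer size is
known; the function name and the buffer_size parameter are hypothetical,
not anything from JK itself.

```python
def count_leading_copies(body: bytes, buffer_size: int) -> int:
    """Count how many times the first buffer_size bytes of body
    repeat back-to-back from the start of the body.

    Returns 0 if the body is shorter than one buffer, and 1 for a
    normal response with no repetition.
    """
    if buffer_size <= 0 or len(body) < buffer_size:
        return 0
    first = body[:buffer_size]
    copies = 0
    offset = 0
    # A short or empty trailing slice never equals `first`, so the
    # loop always terminates at the end of the body.
    while body[offset:offset + buffer_size] == first:
        copies += 1
        offset += buffer_size
    return copies


# Toy data: a 4-byte "buffer" repeated three times, then an error page.
broken = b"HEAD" * 3 + b"<html>503</html>"
copies = count_leading_copies(broken, 4)  # 3
```

A result greater than 1 would flag a response as one of the invalid
replies described above.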

After a fresh restart of Tomcat, it takes a little while for this
problem to start manifesting itself.  Over time, it affects more and
more requests until finally Tomcat locks up entirely.  At that point
Tomcat needs to be restarted... lather, rinse, repeat...

Here's what a sample of error messages in JK's log looks like:

[2760:2476] [error] jk_isapi_plugin.c (1199): WriteClient failed with
10053 (0x00002745)
[2760:4020] [error] jk_ajp_common.c (1726): Chunk length too large.
Length of AJP message is 8188, chunk length is 8192.
[2760:4020] [error] jk_ajp_common.c (2426): (default_1) connecting to
tomcat failed.
[2760:1104] [error] jk_ajp_common.c (1726): Chunk length too large.
Length of AJP message is 8188, chunk length is 8192.
[2760:1104] [error] jk_ajp_common.c (2426): (default_1) connecting to
tomcat failed.
[2760:1104] [error] jk_ajp_common.c (1726): Chunk length too large.
Length of AJP message is 8188, chunk length is 8192.
[2760:1104] [error] jk_ajp_common.c (2426): (default_1) connecting to
tomcat failed.
[2760:1104] [error] jk_lb_worker.c (1432): All tomcat instances
failed, no more workers left
[2760:1104] [error] jk_isapi_plugin.c (2199): service() failed with
http error 503
[2760:3876] [error] jk_ajp_common.c (1726): Chunk length too large.
Length of AJP message is 8188, chunk length is 8192.
[2760:3876] [error] jk_ajp_common.c (2426): (default_1) connecting to
tomcat failed.
[2760:3232] [error] jk_ajp_common.c (1726): Chunk length too large.
Length of AJP message is 8188, chunk length is 8192.
[2760:3232] [error] jk_ajp_common.c (2426): (default_1) connecting to
tomcat failed.
[2760:3232] [error] jk_ajp_common.c (1726): Chunk length too large.
Length of AJP message is 8188, chunk length is 8192.
[2760:3232] [error] jk_ajp_common.c (2426): (default_1) connecting to
tomcat failed.
[2760:3232] [error] jk_lb_worker.c (1432): All tomcat instances
failed, no more workers left
[2760:3232] [error] jk_isapi_plugin.c (2199): service() failed with
http error 503
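
To see which errors dominate and which threads they come from, the log
can be tallied with something like the sketch below.  It is only a rough
helper in Python: the regular expression is inferred from the lines
shown above (with each wrapped entry joined back onto one line), and the
names LINE_RE and tally_errors are made up for illustration.

```python
import re
from collections import Counter

# Matches the part of a JK log entry shown in the sample:
#   [pid:tid] [level] source_file (line): message
# search() is used so that any prefix before the [pid:tid] field
# (for example a timestamp) is tolerated.
LINE_RE = re.compile(
    r"\[(?P<pid>\d+):(?P<tid>\d+)\]\s+\[(?P<level>\w+)\]\s+"
    r"(?P<src>\S+)\s+\((?P<line>\d+)\):\s+(?P<msg>.*)"
)


def tally_errors(lines):
    """Count error entries by (source file, line number) and by thread id."""
    by_site = Counter()
    by_thread = Counter()
    for raw in lines:
        m = LINE_RE.search(raw)
        if m is None or m.group("level") != "error":
            continue
        by_site[(m.group("src"), int(m.group("line")))] += 1
        by_thread[m.group("tid")] += 1
    return by_site, by_thread


# A few entries from the sample above, unwrapped to one line each.
sample = [
    "[2760:2476] [error] jk_isapi_plugin.c (1199): WriteClient failed with 10053 (0x00002745)",
    "[2760:4020] [error] jk_ajp_common.c (1726): Chunk length too large. Length of AJP message is 8188, chunk length is 8192.",
    "[2760:4020] [error] jk_ajp_common.c (2426): (default_1) connecting to tomcat failed.",
    "[2760:1104] [error] jk_ajp_common.c (1726): Chunk length too large. Length of AJP message is 8188, chunk length is 8192.",
    "[2760:1104] [error] jk_lb_worker.c (1432): All tomcat instances failed, no more workers left",
]
by_site, by_thread = tally_errors(sample)
```

Comparing the per-thread counts against the retries setting may help
confirm whether each failing request goes through the full retry cycle.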

The chunk length messages have been in our logs forever.  Yesterday I
temporarily changed the JK and Tomcat configuration so that the packet
sizes matched.  The chunk errors went away, but the problem seemed to
persist, so I put everything back the way it was.

To me this looks like some weird error condition in Tomcat has hit an
obscure bug in JK whereby it doesn't clear the response buffer between
retries.  Has anyone encountered this issue before, or is anyone willing
to lend a helping hand in troubleshooting?


Thanks
Dmitry
