On 01.05.2009 18:19, Dmitry Beransky wrote:
> The strangest problem started happening to us a few weeks ago
> (after several years of running pretty much the same configuration).
> 
> 1. The problem is only happening in the production environment.  We
> cannot reproduce it on staging, which as far as we can tell is
> configured identically to production.
> 2. The problem seems to be tied to the traffic volume and possibly
> pattern (hence probably why we cannot reproduce it on staging).
> 3. Our configuration:  W2K server running IIS 6 and JK 1.2.27, Tomcat
> v. 5.5.12 running on the same box.
> 
> The problem is as follows: some requests would result in multiple
> copies of the first buffer-full of expected data, ended either by a
> 503 error page data, a 502 error page data, or the rest of the proper
> page.  The number of copies is directly related to the JK's number of
> retries setting.  With the number of retries initially being set to
> 10, the maximum number of repeated copies in the response was 20.
> When we set the number of retries to 2, invalid replies contain only a
> single buffer-full of the page's proper data followed by an error page
> data.  On the Tomcat side, these multiple copies show up as multiple
> request entries in the access log, while there is only one
> corresponding request entry in the IIS log.
> 
> After a fresh restart of Tomcat, it takes a little while for this
> problem to start manifesting itself.  With time, it is starting to
> affect increasingly more requests until finally Tomcat gets entirely
> locked up.  At that point Tomcat needs to be restarted.... lather,
> rinse, repeat...
> 
> Here's what a sample of error messages in JK's log looks like:
> 
> [2760:2476] [error] jk_isapi_plugin.c (1199): WriteClient failed with
> 10053 (0x00002745)
> [2760:4020] [error] jk_ajp_common.c (1726): Chunk length too large.
> Length of AJP message is 8188, chunk length is 8192.
> [2760:4020] [error] jk_ajp_common.c (2426): (default_1) connecting to
> tomcat failed.
> [2760:1104] [error] jk_ajp_common.c (1726): Chunk length too large.
> Length of AJP message is 8188, chunk length is 8192.
> [2760:1104] [error] jk_ajp_common.c (2426): (default_1) connecting to
> tomcat failed.
> [2760:1104] [error] jk_ajp_common.c (1726): Chunk length too large.
> Length of AJP message is 8188, chunk length is 8192.
> [2760:1104] [error] jk_ajp_common.c (2426): (default_1) connecting to
> tomcat failed.
> [2760:1104] [error] jk_lb_worker.c (1432): All tomcat instances
> failed, no more workers left
> [2760:1104] [error] jk_isapi_plugin.c (2199): service() failed with
> http error 503
> [2760:3876] [error] jk_ajp_common.c (1726): Chunk length too large.
> Length of AJP message is 8188, chunk length is 8192.
> [2760:3876] [error] jk_ajp_common.c (2426): (default_1) connecting to
> tomcat failed.
> [2760:3232] [error] jk_ajp_common.c (1726): Chunk length too large.
> Length of AJP message is 8188, chunk length is 8192.
> [2760:3232] [error] jk_ajp_common.c (2426): (default_1) connecting to
> tomcat failed.
> [2760:3232] [error] jk_ajp_common.c (1726): Chunk length too large.
> Length of AJP message is 8188, chunk length is 8192.
> [2760:3232] [error] jk_ajp_common.c (2426): (default_1) connecting to
> tomcat failed.
> [2760:3232] [error] jk_lb_worker.c (1432): All tomcat instances
> failed, no more workers left
> [2760:3232] [error] jk_isapi_plugin.c (2199): service() failed with
> http error 503
> 
> The Chunk length messages have been in our logs forever.  Yesterday I
> temporarily changed JK & Tomcat configuration matching the packet
> sizes.  The chunk errors went away, but the problem seemed to persist,
> so I put everything back the way it was.
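
To see which source locations and which threads account for the errors,
the log excerpt above can be tallied. What follows is only a minimal
sketch in Python, assuming each entry has the
"[pid:tid] [level] file.c (line): message" shape shown in the quoted log.
The names ENTRY, tally and SAMPLE are illustrative and not part of JK.

```python
import re
from collections import Counter

# Assumed entry shape, taken from the quoted excerpt:
#   [pid:tid] [level] source_file.c (line): message
# search() is used so that a leading timestamp, if present, is tolerated.
ENTRY = re.compile(r"\[(\d+):(\d+)\] \[(\w+)\] (\S+) \((\d+)\): (.*)")


def tally(lines):
    """Count log entries per (source file, line) and per thread id."""
    by_site = Counter()
    by_thread = Counter()
    for raw in lines:
        m = ENTRY.search(raw)
        if m is None:
            continue  # wrapped continuation of the previous message
        _pid, tid, _level, src, lineno, _msg = m.groups()
        by_site[(src, int(lineno))] += 1
        by_thread[tid] += 1
    return by_site, by_thread


# First three entries of the quoted log, wrapped as in the mail.
SAMPLE = """\
[2760:2476] [error] jk_isapi_plugin.c (1199): WriteClient failed with
10053 (0x00002745)
[2760:4020] [error] jk_ajp_common.c (1726): Chunk length too large.
Length of AJP message is 8188, chunk length is 8192.
[2760:4020] [error] jk_ajp_common.c (2426): (default_1) connecting to
tomcat failed.
""".splitlines()

if __name__ == "__main__":
    sites, threads = tally(SAMPLE)
    for (src, lineno), count in sites.most_common():
        print(f"{count:4d}  {src} ({lineno})")
    for tid, count in threads.most_common():
        print(f"{count:4d}  thread {tid}")
```

Run against the full JK log, a tally like this would show whether the
chunk errors cluster on a few threads or are spread evenly.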

The chunk length message is quite odd; it looks like protocol
corruption. That suggests you should really try a Tomcat update.
Concerning your restriction "can't update before any other options are
exhausted": the other options will never be fully exhausted. But once
the obvious ones have been tried, those that remain become increasingly
expensive and risky, with a low chance of success.
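
To illustrate why the logged numbers look inconsistent, here is a
minimal sketch of the kind of sanity check that would reject them. It
assumes a body chunk payload consists of one type byte, a two-byte chunk
length and then the chunk data; that is my reading of the AJP13 layout,
not a quote from the JK source, and the names below are illustrative.

```python
# Assumed overhead inside the AJP message payload for a body chunk:
# 1 byte message type + 2 bytes chunk length.
CHUNK_HEADER_BYTES = 3


def chunk_fits(msg_len: int, chunk_len: int) -> bool:
    """Return True if a chunk of chunk_len bytes can fit in a message
    whose payload is msg_len bytes long."""
    return chunk_len <= msg_len - CHUNK_HEADER_BYTES


if __name__ == "__main__":
    # Values from the quoted log: message length 8188, chunk length 8192.
    print(chunk_fits(8188, 8192))
```

Under that assumption, a chunk that claims 8192 bytes cannot be
contained in a message of 8188 bytes, so either the sender uses a larger
packet size than the receiver expects, or the stream is corrupted.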

> To me this looks like some weird error condition in Tomcat has hit an
> obscure bug in JK whereby it doesn't clear the response buffer between
> retries.  Has anyone encountered this issue before or is just willing
> to lend a helping hand in troubleshooting?

I have not encountered this before, and I don't think anyone has
reported a similar observation. Concerning "retries": could you provide
your full configuration? For example, retries for an ajp13 worker is
something very different from retries for a load balancer worker.
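
If it helps when collecting that information, the following sketch lists
the type and retries value of each worker found in workers.properties
style text. It is an illustration only: the sample configuration is
hypothetical (only the worker name default_1 comes from the log above),
and retries_by_worker is not part of JK.

```python
def retries_by_worker(text):
    """Map worker name -> (type, retries) for worker.<name>.<directive>
    lines. retries is None when the directive is not set explicitly."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        parts = key.strip().split(".")
        # Only worker.<name>.<directive>; skips worker.list and similar.
        if len(parts) != 3 or parts[0] != "worker":
            continue
        _, name, directive = parts
        props.setdefault(name, {})[directive] = value.strip()
    return {
        name: (p.get("type"), p.get("retries"))
        for name, p in props.items()
        if "type" in p
    }


# Hypothetical configuration, for illustration only.
SAMPLE_CONFIG = """\
worker.list=default
worker.default.type=lb
worker.default.balance_workers=default_1
worker.default.retries=10
worker.default_1.type=ajp13
worker.default_1.host=localhost
worker.default_1.port=8009
"""

if __name__ == "__main__":
    for name, (wtype, retries) in retries_by_worker(SAMPLE_CONFIG).items():
        print(f"{name}: type={wtype} retries={retries or 'not set'}")
```

The output makes it obvious whether a retries value was set on the load
balancer, on the ajp13 member, or on both.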

Regards,

Rainer
