Hi,

We have a very strange problem that started happening a few weeks ago (after several years of running pretty much the same configuration).
1. The problem only happens in the production environment. We cannot reproduce it on staging, which as far as we can tell is configured identically to production.
2. The problem seems to be tied to the traffic volume and possibly the traffic pattern (which is probably why we cannot reproduce it on staging).
3. Our configuration: W2K server running IIS 6 and JK 1.2.27, with Tomcat v. 5.5.12 running on the same box.

The problem is as follows: some requests result in multiple copies of the first buffer-full of expected data, ended by either 503 error page data, 502 error page data, or the rest of the proper page. The number of copies is directly related to JK's number of retries setting. With the number of retries initially set to 10, the maximum number of repeated copies in the response was 20. When we set the number of retries to 2, invalid replies contain only a single buffer-full of the page's proper data followed by error page data.

On the Tomcat side, these multiple copies show up as multiple request entries in the access log, while there is only one corresponding request entry in the IIS log.

After a fresh restart of Tomcat, it takes a little while for this problem to start manifesting itself. Over time it affects more and more requests, until finally Tomcat locks up entirely. At that point Tomcat needs to be restarted.... lather, rinse, repeat...

Here's what a sample of error messages in JK's log looks like:

[2760:2476] [error] jk_isapi_plugin.c (1199): WriteClient failed with 10053 (0x00002745)
[2760:4020] [error] jk_ajp_common.c (1726): Chunk length too large. Length of AJP message is 8188, chunk length is 8192.
[2760:4020] [error] jk_ajp_common.c (2426): (default_1) connecting to tomcat failed.
[2760:1104] [error] jk_ajp_common.c (1726): Chunk length too large. Length of AJP message is 8188, chunk length is 8192.
[2760:1104] [error] jk_ajp_common.c (2426): (default_1) connecting to tomcat failed.
[2760:1104] [error] jk_ajp_common.c (1726): Chunk length too large. Length of AJP message is 8188, chunk length is 8192.
[2760:1104] [error] jk_ajp_common.c (2426): (default_1) connecting to tomcat failed.
[2760:1104] [error] jk_lb_worker.c (1432): All tomcat instances failed, no more workers left
[2760:1104] [error] jk_isapi_plugin.c (2199): service() failed with http error 503
[2760:3876] [error] jk_ajp_common.c (1726): Chunk length too large. Length of AJP message is 8188, chunk length is 8192.
[2760:3876] [error] jk_ajp_common.c (2426): (default_1) connecting to tomcat failed.
[2760:3232] [error] jk_ajp_common.c (1726): Chunk length too large. Length of AJP message is 8188, chunk length is 8192.
[2760:3232] [error] jk_ajp_common.c (2426): (default_1) connecting to tomcat failed.
[2760:3232] [error] jk_ajp_common.c (1726): Chunk length too large. Length of AJP message is 8188, chunk length is 8192.
[2760:3232] [error] jk_ajp_common.c (2426): (default_1) connecting to tomcat failed.
[2760:3232] [error] jk_lb_worker.c (1432): All tomcat instances failed, no more workers left
[2760:3232] [error] jk_isapi_plugin.c (2199): service() failed with http error 503

The "Chunk length too large" messages have been in our logs forever. Yesterday I temporarily changed the JK and Tomcat configuration so that the packet sizes match. The chunk errors went away, but the problem seemed to persist, so I put everything back the way it was.

To me this looks like some weird error condition in Tomcat has hit an obscure bug in JK whereby it doesn't clear the response buffer between retries.

Has anyone encountered this issue before, or is anyone willing to lend a helping hand in troubleshooting?

Thanks,
Dmitry
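
For anyone trying to correlate the JK log entries quoted above with the Tomcat access log, a small script along these lines might help group the errors by JK thread. It is only a minimal sketch: it assumes every entry has the `[pid:tid] [level] file (line): message` shape seen in the sample, and the function names (`parse_jk_line`, `summarize`) and category labels are made up for illustration, not part of JK.

```python
import re
from collections import Counter, defaultdict

# Assumed line shape, taken from the sample entries in this post:
#   [pid:tid] [level] source_file (line): message
LINE_RE = re.compile(
    r"\[(?P<pid>\d+):(?P<tid>\d+)\]\s+\[(?P<level>\w+)\]\s+"
    r"(?P<src>\S+)\s+\((?P<line>\d+)\):\s+(?P<msg>.*)"
)

# Substrings copied from the sample log, mapped to hypothetical labels.
CATEGORIES = [
    ("Chunk length too large", "chunk_too_large"),
    ("connecting to tomcat failed", "connect_failed"),
    ("All tomcat instances failed", "all_workers_failed"),
    ("failed with http error", "http_error"),
    ("WriteClient failed", "write_client_failed"),
]


def parse_jk_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LINE_RE.search(line)
    return match.groupdict() if match else None


def summarize(lines):
    """Count error categories per thread id across an iterable of log lines."""
    per_thread = defaultdict(Counter)
    for raw in lines:
        entry = parse_jk_line(raw)
        if entry is None:
            continue  # skip anything that is not a recognizable log entry
        label = "other"
        for needle, name in CATEGORIES:
            if needle in entry["msg"]:
                label = name
                break
        per_thread[entry["tid"]][label] += 1
    return per_thread
```

Feeding the log file's lines to `summarize` gives one counter per thread id. A thread that shows several `connect_failed` entries followed by an `http_error` would correspond to a request that used up its retries, which could then be matched against the repeated entries in the Tomcat access log.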