Hi, we're seeing a strange problem here that is only partially reproducible.
Our customer is running a cluster of Tomcat 5.5.26 servers (several cluster domains) behind several load-balanced Apache-2.2.11 (for SSL termination + sticky sessions). The application consists of an unencrypted part and an SSL encrypted part. Most of the time, the setup is running fine (at least since we solved some (unrelated) network problems, see my previous mails). When a HTTP/1.0 client requests a dynamically generated page over SSL, most of the response is returned immediately. Then, we see a 5-second timeout (this is *not* Apache's KeepAliveTimeout), then the rest of the response is delivered just before the connection is shut down. For dynamically generated pages, we do not set a "Content-Length" header, so for HTTP/1.0 clients the server has to respond with "Connection: close" (which it does). Only it waits for 5 seconds before actually closing it. Everything works fine for - static content (where we set Content-Length) - redirects (where we set Content-Lenth: 0) - HTTP/1.1-clients (where the server uses Transfer-Encoding: Chunked) - HTTP/1.0-clients in the non-ssl part (!) Here's an example output generated by "curl -0 -v -L -N -o /dev/null": * About to connect() to xxx.yyy.de port 443 (#0) * Trying xxx.xxx.xx.xx... connected * Connected to xxx.yyy.de (xxx.xxx.xx.xx) port 443 (#0) * successfully set certificate verify locations: * CAfile: none CApath: /etc/ssl/certs/ * SSLv3, TLS handshake, Client hello (1): } [data not shown] * SSLv3, TLS handshake, Server hello (2): { [data not shown] * SSLv3, TLS handshake, CERT (11): { [data not shown] * SSLv3, TLS handshake, Server key exchange (12): { [data not shown] * SSLv3, TLS handshake, Server finished (14): { [data not shown] * SSLv3, TLS handshake, Client key exchange (16): } [data not shown] * SSLv3, TLS change cipher, Client hello (1): } [data not shown] * SSLv3, TLS handshake, Finished (20): } [data not shown] * SSLv3, TLS change cipher, Client hello (1): { [data not shown] * SSLv3, TLS handshake, Finished (20): { [data not shown] * SSL connection using DHE-RSA-AES256-SHA * Server certificate: * subject: /C=DE/ST=.../L=.../O=.../OU=.../CN=xxx.yyy.de * start date: 2009-07-13 00:00:00 GMT * expire date: 2010-07-23 23:59:59 GMT * common name: xxx.yyy.de (matched) * issuer: /O=VeriSign Trust Network/OU=VeriSign, Inc./OU=VeriSign International Server CA - Class 3/OU=www.verisign.com/CPS Incorp.by Ref. LIABILITY LTD. (c)97 VeriSign * SSL certificate verify ok. > GET /.../....html HTTP/1.0 > User-Agent: curl/7.19.0 (i686-suse-linux-gnu) libcurl/7.19.0 OpenSSL/0.9.8h > zlib/1.2.3 libidn/1.10 > Host: xxx.yyy.de > Accept: */* > Cookie: JSESSIONID=B0ED3118B70E8E00433E2E709C9FE5B7.zzz > < HTTP/1.1 200 OK < Date: Wed, 18 Nov 2009 15:18:50 GMT < Server: Apache < Cache-Control: no-cache < Pragma: no-cache < P3P: policyref="...", CP="IDC CUR DEV PSA CONi OUR DEL STP PHY ONL UNI PUR COM NAV DEM CNT STA" < Connection: close < Content-Type: text/html;charset=ISO-8859-1 < Content-Language: de < % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed ^M 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0 { [data not shown] ^M100 24552 0 24552 0 0 19043 0 --:--:-- 0:00:01 --:--:-- 22239 ^M100 24552 0 24552 0 0 10706 0 --:--:-- 0:00:02 --:--:-- 11647 ^M100 24552 0 24552 0 0 7446 0 --:--:-- 0:00:03 --:--:-- 7889 ^M100 24552 0 24552 0 0 5702 0 --:--:-- 0:00:04 --:--:-- 5959 ^M100 24552 0 24552 0 0 4876 0 --:--:-- 0:00:05 --:--:-- 5062 * SSLv3, TLS alert, Client hello (1): { [data not shown] ^M100 28035 0 28035 0 0 5556 0 --:--:-- 0:00:05 --:--:-- 927 * Closing connection #0 * SSLv3, TLS alert, Client hello (1): } [data not shown] As you can see, 24552 (=3 * 8184) bytes are received almost immediately, while the rest is only transferred after 5 seconds. Leaving "-0" away from the curl command line, the complete result is received immediately. Requesting the same page via http instead of https, the complete result is received immediately. The 5-second-delay can be seen using wget instead of curl, too, so this is probably not a client problem. So far, the problem has only been seen on the production system. Due to the load conditions, it is infeasible to run mod_jk with significant logging output. mod_jk configuration is straightforward, timeouts are not defined (i. e. we use default values). Any ideas? Thanks, Peter -- Peter Conrad Tivano Software GmbH Bahnhofstr. 18 63263 Neu-Isenburg Tel: 06102 / 8099070 Fax: 06102 / 8099071 HRB 11680, AG Offenbach/Main Geschäftsführer: Martin Apel --------------------------------------------------------------------- To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org For additional commands, e-mail: users-h...@tomcat.apache.org