See Thread at: http://www.techienuggets.com/Detail?tx=25608 Posted on behalf of a User
Hello all,

After long and unsuccessful research, I hope someone can give me a hint on the following problems.

Our Apache/mod_jk/Tomcat infrastructure ran without problems for about a year. Then, about two months ago, mod_jk errors started to occur. We upgraded the mod_jk version and improved worker.properties; the problems changed and became less frequent, but they still appear from time to time.

It seems that the mod_jk workers lose the connection to their Tomcat backend servers; there are messages in the mod_jk log files that point in this direction. Normally this does not seem to be a big problem, but under certain conditions (which ones?) a worker goes into an error state and cannot recover by itself; recovery has to be done manually.

Problem 1: The Tomcats are reachable, so I do not know why the workers think the server is dead.
Problem 2: I have no idea why the worker goes into an error state and cannot recover.
Problem 3: I am missing explanations of the logged messages. I read the messages but cannot match them to the situation. When does a worker log messages like these?

[Wed Feb 20 10:04:01.889 2008] [19237:3086010048] [info] jk_handler::mod_jk.c (2270): Aborting connection for worker=ajp_ggi
[Wed Feb 20 10:04:39.799 2008] [19294:3086010048] [error] ajp_get_reply::jk_ajp_common.c (1623): (INETP1011) Timeout with waiting reply from tomcat. Tomcat is down, stopped or network problems (errno=110)
[Wed Feb 20 10:04:39.799 2008] [19294:3086010048] [error] ajp_service::jk_ajp_common.c (2034): (INETP1011) receiving reply from tomcat failed with out recovery in send loop attempt=0
[Wed Feb 20 10:04:41.799 2008] [19294:3086010048] [error] service::jk_lb_worker.c (1105): unrecoverable error 504, request failed. Tomcat failed in the middle of request, we can't recover to another instance.

-> Which timeout is meant, and how does mod_jk conclude that Tomcat is down? Where can I find details on errno=110?
-> "receiving reply from tomcat failed with out recovery in send loop attempt=0": what does "with out recovery in send loop" mean?
-> "unrecoverable error 504": where can I find details on this error?

OK, I turned the logging level to debug. The course of events became clearer, but more questions appeared. There are socket numbers in the log, e.g. "will be shutting down socket 35 for worker INETP1021". Which sockets are these, and what do the numbers mean? What are the sockets used for? How many are there per worker? Can I configure them?

=> In general: how can I solve such problems? I tried to look into the mod_jk code, searching for error codes and error messages, but could not find any relevant information. I am studying the log files but cannot work out what really happens.

So maybe someone has an idea why the worker thinks that the corresponding Tomcat is dead, and why it does not recover by itself. I am also looking for tips on how I can help myself, and on where to find something about the error codes and messages in mod_jk.

Thanks for your attention.
Best,
ahmed musa (writing from Vienna)

Current infrastructure

We have 3 Apache web servers (2.2.6) running on CentOS release 4.3, kernel version 2.6.9-34. In front of the web servers there are two hardware load balancers (at two locations), but they play no role in this story. The web servers are hosted at our ISP.

The web servers balance the requests via mod_jk (version 1.2.25) for approx. 10 webapps to 18 backend Tomcat servers (blade servers; because of underlying application parts the OS is Windows 2003 Server, a long story not worth explaining :-) ). The Tomcat servers get their data via requests against DB2 servers/DB2 databases on the mainframe. The Tomcat servers are in-house and are rebooted nightly because of automated deployment processes. Between the web servers and the Tomcat servers there is a Checkpoint firewall.

All webapps are deployed on all Tomcats; only mod_jk routes the requests to certain Tomcat instances. (On one blade server there are two identical Tomcat instances running.)

Versions: Tomcat 5.5.17_11, JDK 1.5.0_11-b03.
The requests against the public website(s) are normal short-lived requests, and there are not many of them. Most of the webapps (portals) require a login and have a strong focus on business logic, so the instances are big (many MBs in RAM), the sessions are sticky and the session timeout is 20 minutes. These also receive only a few requests. In addition to the user requests there are monitoring requests from our ISP. The problems appear on servers/portals that receive very few user requests.

worker.properties

worker.list=ajp_bam,ajp_ggi,ajp_ad,ajp_svp,.......,jkstatus

worker.template.type=ajp13
worker.template.lbfactor=5
worker.template.socket_keepalive=1
worker.template.connect_timeout=7000
worker.template.prepost_timeout=5000
worker.template.reply_timeout=120000
worker.template.retries=6
worker.template.activation=Active
worker.template.recovery_options=7

worker.lbtemplate.type=lb
worker.lbtemplate.max_reply_timeouts=6
worker.lbtemplate.method=Session

# Production workers
# AS-INETP101 - 106 - 6/6 GGI
worker.INETP1011.host=AS-INETP101.AEAT.ALLIANZ.AT
worker.INETP1011.port=65001
worker.INETP1011.reference=worker.template

.... many more of the same, then

worker.ajp_ad.reference=worker.lbtemplate
worker.ajp_ad.balance_workers=INETP1032,INETP1062

.... many more portals, and finally jkstatus

The JkMount is very simple:

JkMount /* ajp_ad

It is mostly the same for the other portals. The portals are virtual hosts on the Apache.

Tomcat server.xml example

<Connector port="65001" maxThreads="300" protocol="AJP/1.3" />
<Engine name="Catalina" jvmRoute="INETP5021" defaultHost="default">
......
  <Host name="slfinsol.com" appBase="webapps" unpackWARs="true" autoDeploy="false" deployOnStartup="false" xmlValidation="false" xmlNamespaceAware="false">
    <Alias>www.slfinsol.com</Alias>
    <Alias>web1.slfinsol.com</Alias>
    ...
    <Alias>testweb.slfinsol.com</Alias>
    .....
    <Valve className="org.apache.catalina.valves.AccessLogValve" directory="logs" prefix="swl_access_log."
           suffix=".txt" pattern="common" resolveHosts="false" />
    <Valve className="at.allianz.tomcat.valve.RequestTimeValve"/>
    <Valve className="at.allianz.tomcat.valve.WebcollaborationWorkaroundValve"/>
    <Context path="" docBase="swl" />
    <Context path="/monitor5" docBase="monitor" />
    <Context path="/swl" docBase="swl" />
  </Host>
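
For studying the log files, a small script that tallies error lines per worker may make it easier to see which workers are affected and when. The following is only a minimal sketch in Python and is not part of mod_jk. The regular expressions assume nothing beyond the line layout visible in the sample messages quoted in this post, and the names parse_line and count_errors_by_worker are made up for illustration.

```python
import re
from collections import Counter

# Layout assumed from the sample lines in this post (not from mod_jk docs):
# [timestamp] [pid:tid] [level] function::file (line): message
LOG_LINE = re.compile(
    r"\[(?P<timestamp>[^\]]+)\]\s+"
    r"\[(?P<pid>\d+):(?P<tid>\d+)\]\s+"
    r"\[(?P<level>\w+)\]\s+"
    r"(?P<function>\w+)::(?P<file>[\w.]+)\s+"
    r"\((?P<line>\d+)\):\s+"
    r"(?P<message>.*)"
)
# Some messages start with the worker name in parentheses: "(INETP1011) ..."
WORKER_PREFIX = re.compile(r"^\((?P<worker>\w+)\)\s*")
# Others name it inline: "worker=ajp_ggi" or "for worker INETP1021"
WORKER_INLINE = re.compile(r"worker[= ](?P<worker>\w+)")


def parse_line(line):
    """Split one log line into its fields; return None if it does not match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    message = entry["message"]
    worker = WORKER_PREFIX.match(message) or WORKER_INLINE.search(message)
    entry["worker"] = worker.group("worker") if worker else None
    return entry


def count_errors_by_worker(lines):
    """Count [error] lines per worker, skipping lines that name no worker."""
    counts = Counter()
    for line in lines:
        entry = parse_line(line)
        if entry and entry["level"] == "error" and entry["worker"]:
            counts[entry["worker"]] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python count_jk_errors.py /path/to/mod_jk.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log_file:
        for worker, count in count_errors_by_worker(log_file).most_common():
            print(worker, count)
```

Lines that name no worker, such as the "unrecoverable error 504" message from jk_lb_worker.c, are deliberately left out of the tally. To follow one failing request across several lines, the pid:tid field in each parsed entry could be used to group them.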