Hi Rainer, I will do my best to provide those things.

Here is what looks like the full sequence from our log:

[46055:3512666992] [info] jk_open_socket::jk_connect.c (627): connect to _ip_:12409 failed (errno=115)
[46055:3512666992] [info] ajp_connect_to_endpoint::jk_ajp_common.c (992): Failed opening socket to (_ip_:12409) (errno=115)
[46055:3512666992] [error] ajp_send_request::jk_ajp_common.c (1621): (_hostname_) connecting to backend failed. Tomcat is probably not started or is listening on the wrong port (errno=115)
[46055:3512666992] [info] ajp_service::jk_ajp_common.c (2614): (_hostname_) sending request to tomcat failed (recoverable), because of error during request sending (attempt=1)
[46055:3512666992] [info] jk_open_socket::jk_connect.c (627): connect to _ip_:12409 failed (errno=115)
[46055:3512666992] [info] ajp_connect_to_endpoint::jk_ajp_common.c (992): Failed opening socket to (_ip_:12409) (errno=115)
[46055:3512666992] [error] ajp_send_request::jk_ajp_common.c (1621): (_hostname_) connecting to backend failed. Tomcat is probably not started or is listening on the wrong port (errno=115)
[46055:3512666992] [info] ajp_service::jk_ajp_common.c (2614): (_hostname_) sending request to tomcat failed (recoverable), because of error during request sending (attempt=2)
[46055:3512666992] [error] ajp_service::jk_ajp_common.c (2634): (_hostname_) connecting to tomcat failed.
[46055:3512666992] [info] service::jk_lb_worker.c (1469): service failed, worker _hostname_ is in error state

You can see after this sequence the backend worker is marked as Bad.
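
For context, errno 115 on Linux is EINPROGRESS, which a non-blocking connect() returns while the TCP handshake is still pending. The following is a minimal Python sketch of that general pattern (non-blocking connect, then wait for writability up to a timeout). It is not mod_jk's code, and the function name connect_with_timeout is made up for illustration; it only shows one way a connect attempt that runs out of time could end up reporting errno 115.

```python
import errno
import select
import socket


def connect_with_timeout(host, port, timeout_s):
    """Try a non-blocking TCP connect; return 0 on success, else an errno value.

    Hypothetical helper for illustration only (assumes Linux errno semantics).
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)
    try:
        rc = sock.connect_ex((host, port))
        if rc == 0:
            return 0  # connected immediately
        if rc != errno.EINPROGRESS:
            return rc  # hard failure, e.g. ECONNREFUSED
        # Handshake is pending: wait until the socket becomes writable.
        _, writable, _ = select.select([], [sock], [], timeout_s)
        if not writable:
            # Timed out while the connect was still in progress.
            return errno.EINPROGRESS
        # Writable: the handshake finished; SO_ERROR says how it ended.
        return sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    finally:
        sock.close()
```

Under this pattern, a backend that does not complete the handshake within the timeout is reported with the "still in progress" code rather than a refusal, which would be consistent with the errno=115 lines above if a connect timeout is involved.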

Here is the config:

JkWorkerProperty worker.list=jkstatus,ajp_app,ajp_app2,ajp_app3,...
JkWorkerProperty worker.jkstatus.type=status
JkWorkerProperty worker.lb_member_template.type=ajp13
JkWorkerProperty worker.lb_member_template.activation=Active
JkWorkerProperty worker.lb_member_template.ping_mode=A
JkWorkerProperty worker.lb_member_template.connection_pool_timeout=600
JkWorkerProperty worker.lb_member_template.socket_keepalive=True
JkWorkerProperty worker.lb_member_template.socket_timeout=30
JkWorkerProperty worker.lb_member_template.socket_connect_timeout=3000
JkWorkerProperty worker.lb_member_template.recover_time=30
JkWorkerProperty worker.lb_member_template.recovery_options=7
JkWorkerProperty worker.lb_worker_template.type=lb
JkWorkerProperty worker.ajp_app.reference=worker.lb_worker_template
JkWorkerProperty worker.ajp_app.balance_workers=_hostname1_ajpport1, _hostname1_ajpport2, ..., _hostname34_ajpport15
JkWorkerProperty worker._hostname_ajpportX.reference=worker.lb_member_template
JkWorkerProperty worker._hostname_ajpportX.host=_hostname_
JkWorkerProperty worker._hostname_ajpportX.port=xxxx
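
Since every member inherits from worker.lb_member_template via the reference property, it can help to see the effective settings of a single member. Below is a rough Python sketch that merges a worker's own properties over those of the worker it references. The worker name node1, the host app-host-1 and the port 8009 are placeholders, not values from our setup, and the merge only approximates how mod_jk resolves references.

```python
SAMPLE = [
    "JkWorkerProperty worker.lb_member_template.type=ajp13",
    "JkWorkerProperty worker.lb_member_template.ping_mode=A",
    "JkWorkerProperty worker.lb_member_template.socket_timeout=30",
    "JkWorkerProperty worker.lb_member_template.socket_connect_timeout=3000",
    "JkWorkerProperty worker.lb_member_template.recover_time=30",
    # Hypothetical member; the real names, hosts and ports are redacted above.
    "JkWorkerProperty worker.node1.reference=worker.lb_member_template",
    "JkWorkerProperty worker.node1.host=app-host-1",
    "JkWorkerProperty worker.node1.port=8009",
]


def effective_settings(lines, worker):
    """Return the worker's properties with referenced properties merged in."""
    props = {}
    for line in lines:
        parts = line.split(None, 1)
        if len(parts) != 2 or parts[0] != "JkWorkerProperty":
            continue
        key, sep, value = parts[1].partition("=")
        if sep:
            props[key.strip()] = value.strip()

    def resolve(name, seen):
        prefix = "worker." + name + "."
        own = {k[len(prefix):]: v for k, v in props.items() if k.startswith(prefix)}
        ref = own.pop("reference", None)
        if ref is None or ref in seen:
            return own  # no reference, or a reference cycle
        base_name = ref[len("worker."):] if ref.startswith("worker.") else ref
        merged = resolve(base_name, seen | {ref})
        merged.update(own)  # the worker's own values win over inherited ones
        return merged

    return resolve(worker, frozenset({"worker." + worker}))
```

Calling effective_settings(SAMPLE, "node1") returns the template values plus the member's own host and port. As far as I understand the mod_jk documentation, socket_timeout is given in seconds and socket_connect_timeout in milliseconds, so the values above would mean 30 seconds and 3 seconds respectively.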

Will this list accept attachments for the other details such as netstat
output and thread dumps?
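
If attachments are not accepted, the netstat output could be condensed before posting. Here is a small Python sketch that counts TCP connection states for a given remote port, assuming the usual Linux "netstat -an" column layout (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State). The function name tally_states is made up for illustration.

```python
from collections import Counter


def tally_states(netstat_text, remote_port):
    """Count TCP states for connections whose foreign address uses remote_port."""
    counts = Counter()
    suffix = ":" + str(remote_port)
    for line in netstat_text.splitlines():
        fields = line.split()
        # Expect: proto, recv-q, send-q, local address, foreign address, state
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        if fields[4].endswith(suffix):
            counts[fields[5]] += 1
    return counts
```

For example, tally_states(text, 12409) run on output captured while the problem is happening would give a count per state (ESTABLISHED, SYN_SENT, TIME_WAIT and so on) for connections to that backend port.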


Thanks,
Max L


On Fri, Mar 4, 2016 at 1:22 PM, Rainer Jung <rainer.j...@kippdata.de> wrote:

> Am 04.03.2016 um 20:35 schrieb Max Lynch:
>
>> Hi there,
>>
>> We have a very heavily used implementation of modjk 1.2.35 running in
>> Apache 2.2.15 i686 on CentOS 6.7 x86_64. After Apache startup, our system
>> will perform optimally with no errors for about 24 hours, after which we
>> begin to see this message:
>>
>> connecting to backend failed. Tomcat is probably not started or is
>> listening on the wrong port (errno=115)
>>
>
> Errno 115 on RHEL is EINPROGRESS. That means the call didn't finish but
> one could retry it. This indicates we might be able to improve the code,
> but it is also possible that e.g. a configured socket_connect_timeout was
> reached. To check, we would need the full mod_jk log lines (and, if several
> different log lines show up for one event, all of them) including the
> columns with source file name and line number etc.
>
> It would also be very useful to see your configuration. You can remove IP
> addresses, ports, secrets etc. and rename your workers but we should see the
> timeout setting, cping settings and so on.
>
>> Once we start seeing this error for one backend/worker, we begin seeing the
>> same errors for eventually all workers. This problem doesn't go away until
>> we restart Apache. Our setup consists of 720 workers per apache, with
>> multiple apache servers, and each apache server also has several other
>> sites configured with modjk serving other tomcat backends. It should be
>> noted that we do not see the same error with other sites, nor do we have
>> so many workers defined for any other site.
>>
>> We've searched through past mailings to try and find the same issue. The
>> couple of times we saw it brought up, error code 110 was also mentioned.
>> It should be noted we do not see the same pattern: errno 110 does show up,
>> but outside of the window when the problem begins and is at its worst. We
>> believe this issue is not configuration related.
>>
>> We're posting this here to try and gather more data. Our process prohibits
>> an upgrade of any kind without plenty of evidence supporting our position.
>> Hoping that some individuals might have seen this particular issue, or if
>> there is any data on whether this could be a bug. Hopefully we're correct
>> in thinking this issue is not a configuration problem, but we'll help to
>> rule that out.
>>
>
> It could also be interesting to capture the output of "netstat -an" during
> the time that the problem happens. And finally the same on the Tomcat side
> as well as a thread dump of the Tomcat JVM.
>
> Once you provide the full log line, I can check whether the 1.2.41 code
> actually has improvements related to errno 115 or not. Knowing the place in
> the code where the error occurs might also give us an idea of what might have
> happened and how to check further (e.g. Tomcat not accepting connections,
> firewall idle connection drop between mod_jk and Tomcat etc. etc.).
>
> Regards,
>
> Rainer
>
