If your health check is making HTTP calls, it's possible you're not completely consuming the response. If the response isn't fully consumed, the connection is left in a state where it can't go back into the pool. Each check that does this leaks one connection, so the pool eventually becomes exhausted and subsequent health checks time out waiting for a connection.
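To make the failure mode concrete, here is a minimal sketch in plain Java. It does not use the Apache or Jersey client at all: `ToyPool`, `ToyResponse` and `HealthCheckDemo` are hypothetical names invented for illustration, and a `Semaphore` stands in for the real connection pool. The only assumption it encodes is the one described above, namely that a connection returns to the pool only once its response has been consumed or closed.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Toy stand-in for an HTTP connection pool (hypothetical, not a real client API). */
class ToyPool {
    private final Semaphore permits;

    ToyPool(int maxConnections) {
        this.permits = new Semaphore(maxConnections);
    }

    /** Leases a connection, or returns null if none frees up within the timeout. */
    ToyResponse request(long timeoutMillis) {
        try {
            if (permits.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS)) {
                return new ToyResponse(permits);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
        return null; // pool exhausted (or interrupted): the "timed out" symptom
    }

    int available() {
        return permits.availablePermits();
    }
}

/** Toy response: its connection only goes back to the pool when it is closed. */
class ToyResponse implements AutoCloseable {
    private final Semaphore permits;
    private boolean released = false;

    ToyResponse(Semaphore permits) {
        this.permits = permits;
    }

    int status() {
        return 200;
    }

    @Override
    public void close() {
        if (!released) {
            released = true;
            permits.release(); // hand the connection back exactly once
        }
    }
}

class HealthCheckDemo {
    /** Leaky check: reads the status and drops the response without closing it. */
    static boolean leakyCheck(ToyPool pool) {
        ToyResponse r = pool.request(50);
        return r != null && r.status() == 200;
    }

    /** Safe check: try-with-resources always returns the connection to the pool. */
    static boolean safeCheck(ToyPool pool) {
        try (ToyResponse r = pool.request(50)) {
            return r != null && r.status() == 200;
        }
    }

    public static void main(String[] args) {
        ToyPool leaky = new ToyPool(2);
        ToyPool safe = new ToyPool(2);
        for (int i = 1; i <= 3; i++) {
            System.out.println("check " + i
                    + " leaky=" + leakyCheck(leaky)
                    + " safe=" + safeCheck(safe));
        }
    }
}
```

With a pool of two connections, the leaky check starts failing on the third call while the safe one keeps passing. That mirrors a health check that works for a day and then reports timeouts until the application is restarted. In a real client the equivalent of `close()` is whatever call reads or closes the response, which depends on the client library and version.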
That's where I'd start. I vaguely remember there's a method you can call on the Apache/Jersey client to manually consume the rest of the response, but I don't recall what it is offhand. I just remember it from when we shot ourselves in the foot this way.

Ryan

On Fri, Aug 26, 2016 at 2:39 AM Jochen Schalanda <[email protected]> wrote:

> Hi Ayache,
>
> you probably want to send this email to the metrics-user mailing list
> (https://groups.google.com/forum/#!forum/metrics-user).
>
> Cheers,
> Jochen
>
> On 26.08.2016 at 10:11, Ayache Khettar <[email protected]> wrote:
>
> Hi
>
> I am experiencing something strange in relation to healthCheck. We deploy
> an application (Service A) to Tomcat server which is dependent on Service
> B, the healthcheck endpoint works fine, after a while (over 24 hours) the
> healthcheck starts reporting time out connection to one of the dependent
> Rest Service (Service B) that we monitor via the healthcheck API. Running
> curl command against the dependent service, it responds with healthy
> status. So the healthCheck endpoint of Service B works through command line
> but not when invoked through Service A. Reloading Service A through Tomcat
> admin or restarting Tomcat application, the issue seems to go away and
> crops up again after a day or so.
>
> Any idea how to resolve this?
>
> Many thanks in advance
>
> Ayache
>
>
> Ayaches-MacBook-Pro:~ ayache$ http GET localhost:8080/ipt-orchestra-integration/admin/healthcheck
> HTTP/1.1 200 OK
> Cache-Control: must-revalidate,no-cache,no-store
> Content-Length: 249
> Content-Type: application/json
> Date: Fri, 26 Aug 2016 07:57:27 GMT
> Server: Apache-Coyote/1.1
>
> {
>     "Service F": {
>         "healthy": true
>     },
>     "deadlocks": {
>         "healthy": true
>     },
>     "ServiceB": {
>         "healthy": false,
>         "message": "Timed out after 10 seconds waiting for response from Orchestra"
>     },
>     "referenceData": {
>         "healthy": true
>     },
>     "remoteCacheHealthCheck": {
>         "healthy": true
>     }
> }
>
> --
> You received this message because you are subscribed to the Google Groups
> "dropwizard-user" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
