I do have a thought - sometimes it's more important to make it work than to
drill into exactly what is going wrong. I have had some luck with the
Resilience4j library, set up with a short timeout. By my measurements,
connections that work complete within, say, 30ms, and any that take more
than 50ms are never coming back. So I set the timeout to 50ms, the delay
between retries to 10ms, and the retry count to 3. It can make noise every
time it retries, but the squirrelly behavior I was seeing under load seemed
more infrastructure-based than server-based, and the retry solved the
problem neatly.
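
To make the policy concrete, here is a rough sketch of the same idea using
only the JDK. This is not the Resilience4j API and not my actual
configuration. The class and method names (ShortTimeoutRetry,
callWithRetry) are made up for illustration, and I am assuming "retry count
of 3" means three attempts in total.

```java
import java.time.Duration;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper: plain-JDK stand-in for a timeout-plus-retry policy.
class ShortTimeoutRetry {

    /**
     * Runs the call up to maxAttempts times. An attempt that exceeds the
     * timeout is cancelled and counted as a failure; delay is the pause
     * between attempts. Rethrows the last failure if every attempt fails.
     */
    static <T> T callWithRetry(Callable<T> call, int maxAttempts,
                               Duration timeout, Duration delay) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        // Daemon threads so an abandoned attempt cannot keep the JVM alive.
        ExecutorService executor = Executors.newCachedThreadPool(r -> {
            Thread t = new Thread(r);
            t.setDaemon(true);
            return t;
        });
        try {
            Exception last = null;
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                Future<T> future = executor.submit(call);
                try {
                    return future.get(timeout.toMillis(), TimeUnit.MILLISECONDS);
                } catch (TimeoutException e) {
                    future.cancel(true); // give up on the slow attempt
                    last = e;
                } catch (ExecutionException e) {
                    Throwable cause = e.getCause();
                    last = (cause instanceof Exception) ? (Exception) cause : e;
                }
                // The "noise" on every failed attempt.
                System.err.println("attempt " + attempt + " failed: " + last);
                if (attempt < maxAttempts) {
                    Thread.sleep(delay.toMillis());
                }
            }
            throw last;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the real HTTP call: 50ms timeout, 10ms delay, 3 attempts.
        String body = callWithRetry(() -> "response body", 3,
                Duration.ofMillis(50), Duration.ofMillis(10));
        System.out.println(body);
    }
}
```

One caveat with this approach: the wrapped call has to be safe to repeat,
since a request that timed out on my side may still have reached the
server.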

The similar-sounding problem I had was spates of REST calls between
services at different cloud providers, going through different load
balancer/API management layers. It seemed to me that a low percentage of
my connection attempts would just disappear - like packet loss or a
process crash on the load balancer - and the remote service would never
see a packet.

Just a thought, hope it helps.  I love to aggressively chase solutions.

David



On Fri, Mar 15, 2024 at 4:58 PM Richard Tippl <richard.ti...@gmail.com>
wrote:

> Hello,
>
> I am supporting a Spring Boot application, which uses HttpClient 5 in the
> background. We're mainly using PoolingHttpClientConnectionManager to send a
> large amount of requests to a target server.
>
> We're experiencing some network issues (socket connection timeouts during
> high load scenarios) and in trying to locate them, I've begun the process
> of trying to look into what actually happens during connection
> establishment.
> My idea was to measure the time it takes for certain steps taken when
> creating a connection. Mainly I wanted to measure TCP socket open and SSL
> handshake.
>
> The initial version I've come up with uses (abuses) the
> ConnectionSocketFactory interface, wrapping it in a way to measure the
> length of execution for connectSocket. This gives the sum of TCP open and
> SSL handshake.
> This way I can at least get some numbers and use them to help with locating
> and resolving the issues.
>
> There are 2 issues with this approach, as far as I can tell: I can't
> measure these times separately, and in the newest alpha version of 5.4 the
> interface I'm using has been deprecated and replaced by the
> DefaultHttpClientConnectionOperator, which performs all of the connection
> steps in a single method call.
>
> Am I missing some easier way to plug into the flow of creating a connection
> and getting the ability to measure what I wish to measure? Will it still be
> possible after the deprecated interfaces get removed? Is there a way I
> could measure both socket open and SSL handshake separately?
> The metrics I've achieved so far already started showing us certain trends
> and extending them could help us more in trying to solve these issues.
>
> Thanks for responding.
>
> Richard
>


-- 
Dog approved this message.
