Thanks Oleg. I will raise an issue once my JIRA account has been confirmed.
Unfortunately, I haven’t been able to write a reproducer unit test, despite putting a lot of effort into it. I can run benchmarks within our company cluster that occasionally fail with the StackOverflowError. I believe following the stack trace and reading the code might make the issue clear (especially for someone deeply experienced with the codebase). Happy to support if needed.

Cheers,
Stephan

P.S.: We need the more sophisticated features and probably can use the minimal/bare clients.

From: Oleg Kalnichevski <[email protected]>
Date: Friday, 19. September 2025 at 10:37
To: [email protected] <[email protected]>
Subject: Re: StackOverflow Issue due to sync loop in connection pool release firecallbacks

Hi Stephan

Please raise a ticket in the project JIRA for this issue. Please do attach the log exhibiting the problem, with the exception stack trace in it, to the ticket. And yes, a reproducer of some sort would be very helpful.

Unrelated to the problem, one should probably be using the minimal HttpClient implementation or even straight HttpCore if they do not need advanced client HTTP features and would like to get the maximum throughput in terms of message exchanges over the same time period.

Oleg

On 9/19/2025 9:02 AM, Stephan Epping wrote:
> Hello,
>
> I have investigated a tricky error we experienced multiple times in our benchmarks (which use the httpclient to send thousands of requests to our clusters).
>
> We are using a quite up-to-date version (5.3.4 / 5.5.).
>
> You can find more details about the analysis here <https://github.com/camunda/camunda/issues/34597#issuecomment-3301797932>.
>
> *TLDR;*
>
> The stacktrace shows a tight synchronous callback cycle inside HttpComponents' async path that repeatedly alternates between /completed/ → /release/discard/fail/ → /connect/proceedToNextHop/ → /completed/, causing unbounded recursion until the JVM stack overflows.
>
> Concretely the cycle is:
>
> · AsyncConnectExec$1.completed → InternalHttpAsyncExecRuntime$1.completed → BasicFuture.completed
>
> · PoolingAsyncClientConnectionManager lease/completed → *StrictConnPool.fireCallbacks* → StrictConnPool.release → PoolingAsyncClientConnectionManager.release
>
> · InternalHttpAsyncExecRuntime.discardEndpoint → InternalAbstractHttpAsyncClient$2.failed → AsyncRedirectExec/AsyncHttpRequestRetryExec/AsyncProtocolExec/AsyncConnectExec.*failed → BasicFuture.failed / ComplexFuture.failed → PoolingAsyncClientConnectionManager$4.failed → DefaultAsyncClientConnectionOperator$1.failed → MultihomeIOSessionRequester.connect → DefaultAsyncClientConnectionOperator.connect → PoolingAsyncClientConnectionManager.connect → InternalHttpAsyncExecRuntime.connectEndpoint → AsyncConnectExec.proceedToNextHop → *back to* AsyncConnectExec$1.completed.
>
> Because callbacks (completed / failed) are executed synchronously on the same call stack and some code paths both /complete/ and then trigger /failed//retry/next-hop connection logic (via pool callbacks and the connection operator), the call stack never unwinds; recursion depth grows until StackOverflowError.
>
> *Possible concrete root causes (detailed)*
>
> 1. *Synchronous BasicFuture callbacks*
> BasicFuture.completed() and .failed() call callbacks immediately on the thread that completes the future. If a callback in turn calls pool release() which calls fireCallbacks() (synchronously), the chain can re-enter callback code without unwinding. Re-entrancy depth grows with each attempted connect/release cycle.
>
> 2. *Multihome connect tries multiple addresses in the same stack*
> MultihomeIOSessionRequester.connect will attempt alternate addresses (A/AAAA records). If an address fails quickly and the code immediately tries the next address by invoking connection manager code and its callbacks synchronously, you build deeper recursion for each try.
>
> 3. *Retries/redirects executed synchronously*
> The exec chain (redirect → retry → protocol → connect) will call failed() listeners which in turn call connect again. If those calls are synchronous, you get direct recursive invocation.
>
> 4. *Potential omission of an async boundary*
> A simple but dangerous pattern is: /complete future/ → /call listener/ → /listener calls code that completes other futures/ → repeat. If there is no executor handoff, the recursion remains on the same thread.
>
> I haven’t been able to create a unit test that reproduces the issue locally, even though I tried multiple approaches (synthetic http server that is flaky, randomly failing custom dns resolver, thousands of requests scheduled, etc.).
>
> Does someone have an idea what we are doing wrong? Is this a bug or misconfiguration on our side? We have now switched to the `LAX` concurrency policy, which seems to mitigate the issue, but I believe it does not fix the root cause and only makes it less likely. (I can see the lax pool also has the sync fireCallbacks approach etc.)
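
To illustrate the "missing async boundary" pattern from point 4 in isolation, here is a minimal JDK-only sketch. It does not use or reproduce any HttpComponents code; the class CallbackDepthDemo and its methods are hypothetical names made up for this illustration. It compares a completion path that invokes the next listener directly with one that queues the next step and drains the queue in a loop (a single-threaded stand-in for an executor handoff), and records the deepest callback nesting seen in each case.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical illustration only; not HttpComponents code.
final class CallbackDepthDemo {
    private static int depth;     // current nesting of callback invocations
    private static int maxDepth;  // deepest nesting observed during a run

    /** Each "completion" calls the next listener directly on the same call stack. */
    public static int runSynchronously(int steps) {
        depth = 0;
        maxDepth = 0;
        completeSync(steps);
        return maxDepth;
    }

    private static void completeSync(int remaining) {
        depth++;
        maxDepth = Math.max(maxDepth, depth);
        if (remaining > 1) {
            // The listener re-enters the completion path before this frame returns.
            completeSync(remaining - 1);
        }
        depth--;
    }

    /** Each "completion" only enqueues the next step; a loop runs the queued steps. */
    public static int runWithHandoff(int steps) {
        depth = 0;
        maxDepth = 0;
        Queue<Runnable> pending = new ArrayDeque<>();
        pending.add(() -> completeDeferred(steps, pending));
        Runnable task;
        while ((task = pending.poll()) != null) {
            task.run(); // the previous step has already returned at this point
        }
        return maxDepth;
    }

    private static void completeDeferred(int remaining, Queue<Runnable> pending) {
        depth++;
        maxDepth = Math.max(maxDepth, depth);
        if (remaining > 1) {
            // Hand the next step off instead of calling it inline.
            pending.add(() -> completeDeferred(remaining - 1, pending));
        }
        depth--;
    }

    public static void main(String[] args) {
        System.out.println("synchronous max depth: " + runSynchronously(1_000));
        System.out.println("handoff max depth: " + runWithHandoff(1_000));
    }
}
```

With direct invocation the nesting grows linearly with the number of chained completions, which is the shape of the recursion in the stack trace; with the queue it stays at a single frame regardless of chain length. Whether and where such a boundary would fit into the real connect/release path is a separate question for the maintainers.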
>
> I have attached a stacktrace, but here is a brief excerpt (as I don’t know if attachments work on this mailing list):
>
> ERROR 2025-06-30T17:06:10.233570881Z Exception in thread "httpclient-dispatch-1" java.lang.StackOverflowError
> --------------------------------------------------------------------------------
> ERROR 2025-06-30T17:06:10.299219300Z at org.apache.hc.client5.http.impl.async.AsyncConnectExec$1.completed(AsyncConnectExec.java:164)
> ERROR 2025-06-30T17:06:10.299221745Z at org.apache.hc.client5.http.impl.async.AsyncConnectExec$1.completed(AsyncConnectExec.java:153)
> ERROR 2025-06-30T17:06:10.299223952Z at org.apache.hc.client5.http.impl.async.InternalHttpAsyncExecRuntime$1.completed(InternalHttpAsyncExecRuntime.java:128)
> ERROR 2025-06-30T17:06:10.299226128Z at org.apache.hc.client5.http.impl.async.InternalHttpAsyncExecRuntime$1.completed(InternalHttpAsyncExecRuntime.java:120)
> ERROR 2025-06-30T17:06:10.299228287Z at org.apache.hc.core5.concurrent.BasicFuture.completed(BasicFuture.java:148)
> ERROR 2025-06-30T17:06:10.299230488Z at org.apache.hc.client5.http.impl.nio.PoolingAsyncClientConnectionManager$3$1.leaseCompleted(PoolingAsyncClientConnectionManager.java:339)
> ERROR 2025-06-30T17:06:10.299232722Z at org.apache.hc.client5.http.impl.nio.PoolingAsyncClientConnectionManager$3$1.completed(PoolingAsyncClientConnectionManager.java:324)
> ERROR 2025-06-30T17:06:10.299234969Z at org.apache.hc.client5.http.impl.nio.PoolingAsyncClientConnectionManager$3$1.completed(PoolingAsyncClientConnectionManager.java:285)
> ERROR 2025-06-30T17:06:10.299237136Z at org.apache.hc.core5.concurrent.BasicFuture.completed(BasicFuture.java:148)
> ERROR 2025-06-30T17:06:10.299239404Z at org.apache.hc.core5.pool.StrictConnPool.fireCallbacks(StrictConnPool.java:401)
> ERROR 2025-06-30T17:06:10.299241531Z at org.apache.hc.core5.pool.StrictConnPool.release(StrictConnPool.java:272)
> ERROR 2025-06-30T17:06:10.299243647Z at org.apache.hc.client5.http.impl.nio.PoolingAsyncClientConnectionManager.release(PoolingAsyncClientConnectionManager.java:424)
> ERROR 2025-06-30T17:06:10.299245815Z at org.apache.hc.client5.http.impl.async.InternalHttpAsyncExecRuntime.discardEndpoint(InternalHttpAsyncExecRuntime.java:156)
> ERROR 2025-06-30T17:06:10.299247909Z at org.apache.hc.client5.http.impl.async.InternalHttpAsyncExecRuntime.discardEndpoint(InternalHttpAsyncExecRuntime.java:180)
> ERROR 2025-06-30T17:06:10.299250099Z at org.apache.hc.client5.http.impl.async.InternalAbstractHttpAsyncClient$2.failed(InternalAbstractHttpAsyncClient.java:363)
> ERROR 2025-06-30T17:06:10.299252352Z at org.apache.hc.client5.http.impl.async.AsyncRedirectExec$1.failed(AsyncRedirectExec.java:261)
> ERROR 2025-06-30T17:06:10.299254470Z at org.apache.hc.client5.http.impl.async.AsyncHttpRequestRetryExec$1.failed(AsyncHttpRequestRetryExec.java:195)
> ERROR 2025-06-30T17:06:10.299256671Z at org.apache.hc.client5.http.impl.async.AsyncProtocolExec$1.failed(AsyncProtocolExec.java:297)
> ERROR 2025-06-30T17:06:10.299258827Z at org.apache.hc.client5.http.impl.async.AsyncConnectExec$2.failed(AsyncConnectExec.java:235)
> ERROR 2025-06-30T17:06:10.299261062Z at org.apache.hc.core5.concurrent.CallbackContribution.failed(CallbackContribution.java:52)
> ERROR 2025-06-30T17:06:10.299263164Z at org.apache.hc.core5.concurrent.BasicFuture.failed(BasicFuture.java:166)
> ERROR 2025-06-30T17:06:10.299265335Z at org.apache.hc.core5.concurrent.ComplexFuture.failed(ComplexFuture.java:79)
> ERROR 2025-06-30T17:06:10.299267498Z at org.apache.hc.client5.http.impl.nio.PoolingAsyncClientConnectionManager$4.failed(PoolingAsyncClientConnectionManager.java:485)
> ERROR 2025-06-30T17:06:10.299273172Z at org.apache.hc.core5.concurrent.BasicFuture.failed(BasicFuture.java:166)
> ERROR 2025-06-30T17:06:10.299275385Z at org.apache.hc.core5.concurrent.ComplexFuture.failed(ComplexFuture.java:79)
> ERROR 2025-06-30T17:06:10.299277443Z at org.apache.hc.client5.http.impl.nio.DefaultAsyncClientConnectionOperator$1.failed(DefaultAsyncClientConnectionOperator.java:170)
> ERROR 2025-06-30T17:06:10.299279661Z at org.apache.hc.core5.concurrent.BasicFuture.failed(BasicFuture.java:166)
> ERROR 2025-06-30T17:06:10.299281730Z at org.apache.hc.core5.concurrent.ComplexFuture.failed(ComplexFuture.java:79)
> ERROR 2025-06-30T17:06:10.299283917Z at org.apache.hc.client5.http.impl.nio.MultihomeIOSessionRequester.connect(MultihomeIOSessionRequester.java:118)
> ERROR 2025-06-30T17:06:10.299287550Z at org.apache.hc.client5.http.impl.nio.DefaultAsyncClientConnectionOperator.connect(DefaultAsyncClientConnectionOperator.java:115)
> ERROR 2025-06-30T17:06:10.299290061Z at org.apache.hc.client5.http.impl.nio.PoolingAsyncClientConnectionManager.connect(PoolingAsyncClientConnectionManager.java:456)
> ERROR 2025-06-30T17:06:10.299292128Z at org.apache.hc.client5.http.impl.async.InternalHttpAsyncExecRuntime.connectEndpoint(InternalHttpAsyncExecRuntime.java:226)
> ERROR 2025-06-30T17:06:10.299294317Z at org.apache.hc.client5.http.impl.async.AsyncConnectExec.doProceedToNextHop(AsyncConnectExec.java:222)
> ERROR 2025-06-30T17:06:10.299296347Z at org.apache.hc.client5.http.impl.async.AsyncConnectExec.proceedToNextHop(AsyncConnectExec.java:197)
> ERROR 2025-06-30T17:06:10.299298506Z at org.apache.hc.client5.http.impl.async.AsyncConnectExec.access$000(AsyncConnectExec.java:92)
> ERROR 2025-06-30T17:06:10.299388975Z at org.apache.hc.client5.http.impl.async.AsyncConnectExec$1.completed(AsyncConnectExec.java:164)
> ERROR 2025-06-30T17:06:10.299401176Z at org.apache.hc.client5.http.impl.async.AsyncConnectExec$1.completed(AsyncConnectExec.java:153)
> ERROR 2025-06-30T17:06:10.299403967Z at org.apache.hc.client5.http.impl.async.InternalHttpAsyncExecRuntime$1.completed(InternalHttpAsyncExecRuntime.java:128)
> ERROR 2025-06-30T17:06:10.299406043Z at org.apache.hc.client5.http.impl.async.InternalHttpAsyncExecRuntime$1.completed(InternalHttpAsyncExecRuntime.java:120)
> ERROR 2025-06-30T17:06:10.299408227Z at org.apache.hc.core5.concurrent.BasicFuture.completed(BasicFuture.java:148)
> ERROR 2025-06-30T17:06:10.299410406Z at org.apache.hc.client5.http.impl.nio.PoolingAsyncClientConnectionManager$3$1.leaseCompleted(PoolingAsyncClientConnectionManager.java:339)
> ERROR 2025-06-30T17:06:10.299412507Z at org.apache.hc.client5.http.impl.nio.PoolingAsyncClientConnectionManager$3$1.completed(PoolingAsyncClientConnectionManager.java:324)
> ERROR 2025-06-30T17:06:10.299414382Z at org.apache.hc.client5.http.impl.nio.PoolingAsyncClientConnectionManager$3$1.completed(PoolingAsyncClientConnectionManager.java:285)
> ERROR 2025-06-30T17:06:10.299426976Z at org.apache.hc.core5.concurrent.BasicFuture.completed(BasicFuture.java:148)
> ERROR 2025-06-30T17:06:10.299429367Z at org.apache.hc.core5.pool.StrictConnPool.fireCallbacks(StrictConnPool.java:401)
> ERROR 2025-06-30T17:06:10.299431117Z at org.apache.hc.core5.pool.StrictConnPool.release(StrictConnPool.java:272)
> ERROR 2025-06-30T17:06:10.299433157Z at org.apache.hc.client5.http.impl.nio.PoolingAsyncClientConnectionManager.release(PoolingAsyncClientConnectionManager.java:424)
> ERROR 2025-06-30T17:06:10.299435324Z at org.apache.hc.client5.http.impl.async.InternalHttpAsyncExecRuntime.discardEndpoint(InternalHttpAsyncExecRuntime.java:156)
> ERROR 2025-06-30T17:06:10.299437225Z at org.apache.hc.client5.http.impl.async.InternalHttpAsyncExecRuntime.discardEndpoint(InternalHttpAsyncExecRuntime.java:180)
> ERROR 2025-06-30T17:06:10.299439134Z at org.apache.hc.client5.http.impl.async.InternalAbstractHttpAsyncClient$2.failed(InternalAbstractHttpAsyncClient.java:363)
> ERROR 2025-06-30T17:06:10.299451280Z at org.apache.hc.client5.http.impl.async.AsyncRedirectExec$1.failed(AsyncRedirectExec.java:261)
> ERROR 2025-06-30T17:06:10.299453403Z at org.apache.hc.client5.http.impl.async.AsyncHttpRequestRetryExec$1.failed(AsyncHttpRequestRetryExec.java:195)
> ERROR 2025-06-30T17:06:10.299455096Z at org.apache.hc.client5.http.impl.async.AsyncProtocolExec$1.failed(AsyncProtocolExec.java:297)
> ERROR 2025-06-30T17:06:10.299456747Z at org.apache.hc.client5.http.impl.async.AsyncConnectExec$2.failed(AsyncConnectExec.java:235)
> ERROR 2025-06-30T17:06:10.299458431Z at org.apache.hc.core5.concurrent.CallbackContribution.failed(CallbackContribution.java:52)
> ERROR 2025-06-30T17:06:10.299460343Z at org.apache.hc.core5.concurrent.BasicFuture.failed(BasicFuture.java:166)
> ERROR 2025-06-30T17:06:10.299462062Z at org.apache.hc.core5.concurrent.ComplexFuture.failed(ComplexFuture.java:79)
> ERROR 2025-06-30T17:06:10.299463899Z at org.apache.hc.client5.http.impl.nio.PoolingAsyncClientConnectionManager$4.failed(PoolingAsyncClientConnectionManager.java:485)
> ERROR 2025-06-30T17:06:10.299465810Z at org.apache.hc.core5.concurrent.BasicFuture.failed(BasicFuture.java:166)
> ERROR 2025-06-30T17:06:10.299467649Z at org.apache.hc.core5.concurrent.ComplexFuture.failed(ComplexFuture.java:79)
> ERROR 2025-06-30T17:06:10.299469389Z at org.apache.hc.client5.http.impl.nio.DefaultAsyncClientConnectionOperator$1.failed(DefaultAsyncClientConnectionOperator.java:170)
> ERROR 2025-06-30T17:06:10.299471284Z at org.apache.hc.core5.concurrent.BasicFuture.failed(BasicFuture.java:166)
> ERROR 2025-06-30T17:06:10.299473097Z at org.apache.hc.core5.concurrent.ComplexFuture.failed(ComplexFuture.java:79)
> ERROR 2025-06-30T17:06:10.299474928Z at org.apache.hc.client5.http.impl.nio.MultihomeIOSessionRequester.connect(MultihomeIOSessionRequester.java:118)
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
