[
https://issues.apache.org/jira/browse/CXF-7122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15643744#comment-15643744
]
William Montaz commented on CXF-7122:
-------------------------------------
Hi Freeman, Sergey,
The test you provided is a good one. But running it I saw something weird:
even when timing out, the requests were made against the server, and that led
me to the reason why the patch I proposed was not working on 3.1.xxx or
3.2-SNAPSHOT.
Actually, I first tested my patch on version 3.0.5 of CXF, and when backporting
to 3.1.x I did not realise that a little thing had changed in AsyncHTTPConduit.
As you can see, in version 3.0.5
https://github.com/apache/cxf/blob/cxf-3.0.5/rt/transports/http-hc/src/main/java/org/apache/cxf/transport/http/asyncclient/AsyncHTTPConduit.java
line 220, the RequestConfig sets a SocketTimeout equal to ReceiveTimeout.
Since my patch relies on using the callback to time out, it could not work
because the underlying socket was not timed out. I think that is also the reason
why ReadTimeout was not used when async.
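To illustrate why the callback approach depends on the socket timing out, here
is a minimal JDK-only sketch. It is not CXF or HttpClient code:
CallbackTimeoutSketch, awaitResponse and callBrokenBackend are hypothetical
names, and the scheduled task merely stands in for the I/O layer reporting a
socket timeout through the callback's failure path.

```java
import java.net.SocketTimeoutException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Toy model (assumption, not CXF code): only the I/O side reports the timeout. */
final class CallbackTimeoutSketch {

    /** The worker waits without a timeout of its own and reacts only to the callback. */
    static String awaitResponse(CompletableFuture<String> response) {
        try {
            return response.get();
        } catch (ExecutionException e) {
            // The failure path of the callback carries the timeout to the worker.
            return "failed: " + e.getCause().getClass().getSimpleName();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        }
    }

    /** Simulates a backend that never answers; the "socket" times out after socketTimeoutMs. */
    static String callBrokenBackend(long socketTimeoutMs) {
        ScheduledExecutorService reactor = Executors.newSingleThreadScheduledExecutor();
        CompletableFuture<String> response = new CompletableFuture<>();
        try {
            reactor.schedule(() -> {
                response.completeExceptionally(new SocketTimeoutException("read timed out"));
            }, socketTimeoutMs, TimeUnit.MILLISECONDS);
            return awaitResponse(response);
        } finally {
            reactor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(callBrokenBackend(100)); // prints "failed: SocketTimeoutException"
    }
}
```

If nothing ever completes the future (no socket timeout configured), the worker
in this sketch would wait forever, which is the situation described above.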
I updated my merge request to add SocketTimeout. I also noticed there is a
ConnectionRequestTimeout, which helps deal with waiting too long for a
connection when the pool is exhausted. In my merge request I propose to set it
to ReceiveTimeout too, to stay close to the behavior proposed by CXF. But we
could actually add a new property to distinguish the two.
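As a rough illustration of what a connection request timeout bounds, here is a
JDK-only toy pool. It is an assumption-laden sketch, not the HttpClient pool:
PoolLeaseSketch, lease and release are hypothetical names, and a Semaphore
stands in for the pooled connections.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Toy model of a bounded connection pool (assumption, not HttpClient or CXF code). */
final class PoolLeaseSketch {
    private final Semaphore permits;

    PoolLeaseSketch(int maxConnections) {
        this.permits = new Semaphore(maxConnections);
    }

    /**
     * Tries to lease a connection, waiting at most connectionRequestTimeoutMs.
     * Returns false when the pool stays exhausted for the whole wait.
     */
    boolean lease(long connectionRequestTimeoutMs) {
        try {
            return permits.tryAcquire(connectionRequestTimeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    /** Returns a previously leased connection to the pool. */
    void release() {
        permits.release();
    }

    public static void main(String[] args) {
        PoolLeaseSketch pool = new PoolLeaseSketch(1);
        System.out.println(pool.lease(50)); // true: the only connection is free
        System.out.println(pool.lease(50)); // false: pool exhausted, gives up after the wait
        pool.release();
        System.out.println(pool.lease(50)); // true: the connection is available again
    }
}
```

Without such a bound, a request queued behind an exhausted pool can wait far
longer than ReceiveTimeout before it even obtains a connection.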
I removed the timer task from Freeman's commit, just by commenting it out. You
may also want to clean this code up in a better way.
One last thing: my project uses version 3.0.x. Could we consider backporting
this patch to 3.0.x?
Thanks
William
> Infinite loop due to AsyncHTTPConduit read timeout with exhausted connection
> pool
> ---------------------------------------------------------------------------------
>
> Key: CXF-7122
> URL: https://issues.apache.org/jira/browse/CXF-7122
> Project: CXF
> Issue Type: Bug
> Components: Transports
> Reporter: William Montaz
> Assignee: Freeman Fang
> Priority: Critical
> Fix For: 3.2.0, 3.1.9
>
> Attachments: AsyncHTTPConduitTest.java
>
>
> Using AsyncHTTPConduit, when the underlying connection pool gets exhausted,
> requests waiting for a connection will lead to an infinite loop if they reach
> receive timeout.
> The problem occurred on all versions of CXF above 3.0.5 (we did not test
> other ones).
> Let's imagine a backend that's broken and causes all requests to time out.
> When handling requests, the cxf worker thread will eventually go into a wait
> state (AsyncHTTPConduit:618), with a timeout that matches the
> HTTPClientPolicy.setReceiveTimeout() value, waiting for the NIO stack to
> complete and call notifyAll via responseCallback (AsyncHTTPConduit:455).
> The timeout on the wait is the big problem:
> With our broken backend, the connection pool is exhausted waiting for other
> requests to time out. When a new request is made by cxf against this backend,
> this will happen once the timeout has elapsed:
> - on the one side, the reactor threads will get a connection from the pool
> and try to write to the output stream. Waiting in the pool is not counted
> as receive timeout.
> - on the other side, the cxf worker thread will wake up (because of the
> timed-out wait) and shut down SharedOutputBuffer and SharedInputBuffer
> (AsyncHTTPClient:624)
> - the reactor threads will go into an infinite loop because they will try to
> produceContent from a shut-down buffer (SharedOutputBuffer:120)
>
> From there, application recovery is compromised.
>
> To fix that, timeout should be handled only via the client callback
> (AsyncHTTPConduit:463).