On Thu, 2016-05-05 at 07:12 -0700, Aaron Curley wrote:
> On 5/3/2016 9:49 AM, Oleg Kalnichevski wrote:
> > On Tue, 2016-05-03 at 09:15 -0700, Aaron Curley wrote:
> >> On 5/3/2016 8:22 AM, Oleg Kalnichevski wrote:
> >>> On Mon, 2016-05-02 at 21:28 -0700, Aaron Curley wrote:
> >>>> Hello all,
> >>>>
> >>>> I'm using HttpClient version 4.5.1 to make high-volume calls to an
> >>>> internal highly-available web service that is hosted at multiple
> >>>> geographic locations. On average, the web service calls need to be
> >>>> executed rather quickly; therefore, we are presently making heavy use of
> >>>> HttpClient's connection pooling & reuse behavior (via
> >>>> PoolingHttpClientConnectionManager) to alleviate the overhead (TCP and
> >>>> TLS) incurred when setting up a new connection. Our DNS records direct
> >>>> clients of the service to the (geographically) nearest data center.
> >>>> Should a data center experience problems, we update our DNS records to
> >>>> redirect our clients to our remaining "good" site(s).
> >>>>
> >>>> In times of an outage, HttpClient appears to have automatically
> >>>> failed-over to our alternate sites (IPs) pretty timely and consistently;
> >>>> however, upon service restoration at the primary data-center for a
> >>>> particular client node, HttpClient does not appear to reliably switch
> >>>> back to calling our primary site. (In such instances, I have been able
> >>>> to confirm that our DNS records ARE switching back in a timely manner.
> >>>> The TTLs are set to an appropriately short value.)
> >>>>
> >>>> My best guess is that the lack of consistent "switch back" to the
> >>>> primary site is caused by HttpClient's connection pooling & reuse
> >>>> behavior.
> >>>> Because of our relatively high (and consistent) request
> >>>> volume, creation of NEW connections is likely a rarity once a particular
> >>>> client node is operating for a while; instead, as previously noted, I
> >>>> would generally expect that connections in the pool be re-used whenever
> >>>> possible. Any re-used connections will still be established with the
> >>>> alternate site(s), therefore, the client nodes communicating with
> >>>> alternate sites would generally never (or only VERY gradually) switch
> >>>> back to communicating with the primary site. This lines up with what I
> >>>> have observed; "switching back" seems to only happen once request
> >>>> throughput drops sufficiently to allow most of our client nodes' pooled
> >>>> connections to "time out" and be closed due to inactivity (e.g. during
> >>>> overnight hours).
> >>>>
> >>>>
> >>>> I believe a reasonably "standard" way to solve this problem would be to
> >>>> configure a maximum 'lifetime' of a connection in the connection pool
> >>>> (e.g. 1 hour). This lifetime would be enforced regardless of whether or
> >>>> not the connection is idle or can otherwise be re-used. On first
> >>>> glance, the HttpClientBuilder.setConnectionTimeToLive() method seemed
> >>>> ideal for this, but upon further investigation and review of the
> >>>> HttpClient code base, this method appears to configure the maximum TTL
> >>>> without introducing any element of randomness into each connection's
> >>>> TTL. As a result, I'm concerned that if I enable the built-in TTL
> >>>> feature, my clients are likely to experience regular performance
> >>>> "spikes" at the configured TTL interval. (This would be caused when
> >>>> most/all of the pooled connections expire simultaneously, since they
> >>>> were mostly all created at once, at application start-up.)
> >>>>
> >>>>
> >>>> Instead of using the built-in TTL limit setting, I considered overriding
> >>>> the time to live using the available callbacks
> >>>> (ConnectionKeepAliveStrategy, ConnectionReuseStrategy). Unfortunately,
> >>>> this approach appears to be infeasible because the parameters to those
> >>>> callbacks do not have access to the underlying connection object; there
> >>>> is no "key" that can be used to look up a connection's lifetime (i.e. in
> >>>> a ConcurrentHashMap) such that a decision could be made about whether to
> >>>> close or retain the connection. I also took a look at the various
> >>>> places that I could try to override the default connection pooling
> >>>> behavior (e.g. MainClientExec, HttpClientConnectionManager).
> >>>> HttpClientConnectionManager appears to be the best bet (specifically,
> >>>> HttpClientConnectionManager.releaseConnection() would have to be
> >>>> overridden), but this would require duplication of the existing
> >>>> releaseConnection() code with slight modifications in the overriding
> >>>> class. This seems brittle.
> >>>>
> >>>>
> >>>> Can anyone think of a way that I could implement this (cleanly) with
> >>>> HttpClient 4.x? Maybe I missed something?
> >>>>
> >>>> If not, I would be happy to open a JIRA for a possible HttpClient
> >>>> enhancement to enable such a feature. If people are open to this idea,
> >>>> I was generally thinking that adding a more generic callback might be
> >>>> the best approach (since my use case may not be other people's use
> >>>> cases), but I could also be convinced to make the enhancement request
> >>>> specifically support a connection expiration "window" for the TTL
> >>>> feature instead of a specific "hard-limit". Any thoughts on this?
> >>>>
> >>>>
> >>>> Thanks in advance (and sorry for the long email)!
> >>>>
> >>> Hi Aaron
> >>>
> >>> PoolingHttpClientConnectionManager does not pro-actively evict expired
> >>> connections by default.
> >>> I think it is unlikely that connections with a
> >>> finite TTL would all get evicted at the same time. Having said that you
> >>> are welcome to contribute a patch enabling TTL setting on a per pool
> >>> entry basis.
> >>>
> >>> Oleg
> >>>
> >> Hi Oleg,
> >>
> >> Thanks for your response. I would be happy to submit a patch, but
> >> before doing so, I'd want to ensure my concerns about the current TTL
> >> implementation are actually real.
> >>
> >> Due to the lack of background threads, the connection pool would only
> >> remove an expired connection synchronously (i.e. after a request using
> >> that connection was completed), right?
> > Not quite. Connection validity is checked upon connection lease, not
> > upon connection release.
> >
> >> When you say
> >> "PoolingHttpClientConnectionManager does not pro-actively evict expired
> >> connections ... ", are you referring to this synchronous eviction
> >> behavior or something else?
> >>
> > Yes.
> >
> >> If the former, my concern is that our clients perform multiple
> >> transactions per second. If a connection pool has, say, 50 connections
> >> (that were mostly created very closely to each other, due to a high
> >> throughput at startup), and my client is executing 25 requests/sec, then
> >> theoretically, once the TTL expires, all connections in the pool would
> >> likely have been used within a few seconds (and thus closed as
> >> expired). Am I off-base?
> >>
> > No, I think you are not. However for such a condition to occur one needs
> > to completely exhaust the entire pool of connection literally
> > instantaneously, which is not very likely (*), because the pool manager
> > always tries to reuse newest connections first
> >
> > (*) The only problem might be slow setup of TLS encrypted connections.
> >
> > Oleg
> >
> >> Regards,
> >>
> >> Aaron Curley
> >>
> >>
> Hi Oleg,
>
> Thanks for the clarifications.
> I am indeed using TLS which is (partly)
> why I'm so concerned about the new connection set-up time.
>
> Because of my rather high number of operations per second, I'm still a
> bit worried about exhausting a large percentage of connections in the
> pool over a short period of time. I will see if I can do some testing
> to see if this is a problem in reality.
>
> Planning ahead, if I was to submit a patch, is the HttpClient 4.x branch
> still open for enhancements (or is it mainly just accepting bug fixes at
> this point)?
>

Hi Aaron

The 4.5.x branch should be used for bug fix releases only. We ought not
to add new features to it. Theoretically we could create a 4.6.x branch,
but I would very much prefer not to. There will be no problem adding this
feature to 5.0 (trunk).

Oleg
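
For illustration, the connection expiration "window" idea raised in the thread could be sketched roughly as below. This is a standalone sketch using only the JDK, not HttpClient code: the class name JitteredTtl and the methods assignExpiry and isExpired are hypothetical, and wiring such a check into PoolingHttpClientConnectionManager is not shown. It assumes each pooled connection gets its own expiry, drawn uniformly from base TTL plus or minus a jitter, and that the check runs when a connection is leased (which is where the thread says validity is checked).

```java
import java.util.Random;

/**
 * Hypothetical helper (not part of HttpClient): assigns each connection an
 * individual expiry inside a window around a base TTL, so that connections
 * created together at start-up do not all expire at the same moment.
 */
final class JitteredTtl {
    private final long baseTtlMillis;
    private final long jitterMillis;
    private final Random random;

    JitteredTtl(long baseTtlMillis, long jitterMillis, Random random) {
        if (baseTtlMillis <= 0 || jitterMillis < 0 || jitterMillis >= baseTtlMillis) {
            throw new IllegalArgumentException("need 0 <= jitter < base TTL");
        }
        this.baseTtlMillis = baseTtlMillis;
        this.jitterMillis = jitterMillis;
        this.random = random;
    }

    /** Absolute expiry time for a connection created at createdMillis. */
    long assignExpiry(long createdMillis) {
        // Offset is uniform over the integers in [-jitterMillis, +jitterMillis].
        long offset = (long) (random.nextDouble() * (2 * jitterMillis + 1)) - jitterMillis;
        return createdMillis + baseTtlMillis + offset;
    }

    /** Intended to be evaluated at lease time, before a pooled connection is reused. */
    static boolean isExpired(long expiryMillis, long nowMillis) {
        return nowMillis >= expiryMillis;
    }

    public static void main(String[] args) {
        // Example values only: 1 hour base TTL with a +/- 10 minute window.
        JitteredTtl ttl = new JitteredTtl(3_600_000L, 600_000L, new Random());
        long now = System.currentTimeMillis();
        for (int i = 0; i < 5; i++) {
            System.out.println("connection " + i + " expires in "
                    + (ttl.assignExpiry(now) - now) + " ms");
        }
    }
}
```

With a jitter of 0 this reduces to a fixed TTL, which is the behaviour Aaron describes for the built-in setting; a non-zero jitter spreads the reconnects (and their TLS set-up cost) over the window instead.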