On Thu, 2016-05-05 at 07:12 -0700, Aaron Curley wrote:
> On 5/3/2016 9:49 AM, Oleg Kalnichevski wrote:
> > On Tue, 2016-05-03 at 09:15 -0700, Aaron Curley wrote:
> >> On 5/3/2016 8:22 AM, Oleg Kalnichevski wrote:
> >>> On Mon, 2016-05-02 at 21:28 -0700, Aaron Curley wrote:
> >>>> Hello all,
> >>>>
> >>>> I'm using HttpClient version 4.5.1 to make high-volume calls to an 
> >>>> internal highly-available web service that is hosted at multiple 
> >>>> geographic locations.  On average, the web service calls need to be 
> >>>> executed rather quickly; therefore, we are presently making heavy use of 
> >>>> HttpClient's connection pooling & reuse behavior (via 
> >>>> PoolingHttpClientConnectionManager) to alleviate the overhead (TCP and 
> >>>> TLS) incurred when setting up a new connection.  Our DNS records direct 
> >>>> clients of the service to the (geographically) nearest data center. 
> >>>> Should a data center experience problems, we update our DNS records to 
> >>>> redirect our clients to our remaining "good" site(s).
> >>>>
> >>>> In times of an outage, HttpClient appears to have automatically 
> >>>> failed over to our alternate sites (IPs) promptly and consistently; 
> >>>> however, upon service restoration at the primary data-center for a 
> >>>> particular client node, HttpClient does not appear to reliably switch 
> >>>> back to calling our primary site.  (In such instances, I have been able 
> >>>> to confirm that our DNS records ARE switching back in a timely manner.  
> >>>> The TTLs are set to an appropriately short value.)
> >>>>
> >>>> My best guess is that the lack of consistent "switch back" to the 
> >>>> primary site is caused by HttpClient's connection pooling & reuse 
> >>>> behavior.  Because of our relatively high (and consistent) request 
> >>>> volume, creation of NEW connections is likely a rarity once a particular 
> >>>> client node is operating for a while; instead, as previously noted, I 
> >>>> would generally expect that connections in the pool be re-used whenever 
> >>>> possible.  Any re-used connections will still be established with the 
> >>>> alternate site(s); therefore, the client nodes communicating with 
> >>>> alternate sites would generally never (or only VERY gradually) switch 
> >>>> back to communicating with the primary site.  This lines up with what I 
> >>>> have observed; "switching back" seems to only happen once request 
> >>>> throughput drops sufficiently to allow most of our client nodes' pooled 
> >>>> connections to "time out" and be closed due to inactivity (e.g. during 
> >>>> overnight hours).
> >>>>
> >>>>
> >>>> I believe a reasonably "standard" way to solve this problem would be to 
> >>>> configure a maximum 'lifetime' of a connection in the connection pool 
> >>>> (e.g. 1 hour).  This lifetime would be enforced regardless of whether or 
> >>>> not the connection is idle or can otherwise be re-used.  On first 
> >>>> glance, the HttpClientBuilder.setConnectionTimeToLive() method seemed 
> >>>> ideal for this, but upon further investigation and review of the 
> >>>> HttpClient code base, this method appears to configure the maximum TTL 
> >>>> without introducing any element of randomness into each connection's 
> >>>> TTL.  As a result, I'm concerned that if I enable the built-in TTL 
> >>>> feature, my clients are likely to experience regular performance 
> >>>> "spikes" at the configured TTL interval.  (This would be caused when 
> >>>> most/all of the pooled connections expire simultaneously, since they 
> >>>> were mostly all created at once, at application start-up.)
> >>>>
> >>>>
> >>>> Instead of using the built-in TTL limit setting, I considered overriding 
> >>>> the time to live using the available callbacks 
> >>>> (ConnectionKeepAliveStrategy, ConnectionReuseStrategy).  Unfortunately, 
> >>>> this approach appears to be infeasible because the parameters to those 
> >>>> callbacks do not have access to the underlying connection object; there 
> >>>> is no "key" that can be used to look up a connection's lifetime (i.e. in 
> >>>> a ConcurrentHashMap) such that a decision could be made about whether to 
> >>>> close or retain the connection.  I also took a look at the various 
> >>>> places that I could try to override the default connection pooling 
> >>>> behavior (e.g. MainClientExec, HttpClientConnectionManager). 
> >>>> HttpClientConnectionManager appears to be the best bet (specifically, 
> >>>> HttpClientConnectionManager.releaseConnection() would have to be 
> >>>> overridden), but this would require duplication of the existing 
> >>>> releaseConnection() code with slight modifications in the overriding 
> >>>> class. This seems brittle.
> >>>>
> >>>>
> >>>> Can anyone think of a way that I could implement this (cleanly) with 
> >>>> HttpClient 4.x?  Maybe I missed something?
> >>>>
> >>>> If not, I would be happy to open a JIRA for a possible HttpClient 
> >>>> enhancement to enable such a feature.  If people are open to this idea, 
> >>>> I was generally thinking that adding a more generic callback might be 
> >>>> the best approach (since my use case may not be other people's use 
> >>>> cases), but I could also be convinced to make the enhancement request 
> >>>> specifically support a connection expiration "window" for the TTL 
> >>>> feature instead of a specific "hard-limit".  Any thoughts on this?
> >>>>
> >>>>
> >>>> Thanks in advance (and sorry for the long email)!
> >>>>
> >>> Hi Aaron
> >>>
> >>> PoolingHttpClientConnectionManager does not pro-actively evict expired
> >>> connections by default. I think it is unlikely that connections with a
> >>> finite TTL would all get evicted at the same time. Having said that, you
> >>> are welcome to contribute a patch enabling TTL setting on a per pool
> >>> entry basis.
> >>>
> >>> Oleg
> >>>
> >> Hi Oleg,
> >>
> >> Thanks for your response.  I would be happy to submit a patch, but
> >> before doing so, I'd want to ensure my concerns about the current TTL
> >> implementation are actually real.
> >>
> >> Due to the lack of background threads, the connection pool would only
> >> remove an expired connection synchronously (i.e. after a request using
> >> that connection was completed), right?
> > Not quite. Connection validity is checked upon connection lease, not
> > upon connection release.
> >
> >>   When you say
> >> "PoolingHttpClientConnectionManager does not pro-actively evict expired
> >> connections ... ", are you referring to this synchronous eviction
> >> behavior or something else?
> >>
> > Yes.
> >
> >> If the former, my concern is that our clients perform multiple
> >> transactions per second.  If a connection pool has, say, 50 connections
> >> (that were mostly created very close together, due to a high
> >> throughput at startup), and my client is executing 25 requests/sec, then
> >> theoretically, once the TTL expires, all connections in the pool would
> >> likely have been used within a few seconds (and thus closed as
> >> expired).  Am I off-base?
> >>
> > No, I think you are not. However, for such a condition to occur one needs
> > to completely exhaust the entire pool of connections literally
> > instantaneously, which is not very likely (*), because the pool manager
> > always tries to reuse the newest connections first.
> >
> > (*) The only problem might be slow setup of TLS encrypted connections.
> >
> > Oleg
> >
> >> Regards,
> >>
> >> Aaron Curley
> >>
> >>
> Hi Oleg,
> 
> Thanks for the clarifications.  I am indeed using TLS which is (partly) 
> why I'm so concerned about the new connection set-up time.
> 
> Because of my rather high number of operations per second, I'm still a 
> bit worried about exhausting a large percentage of connections in the 
> pool over a short period of time.  I will see if I can do some testing 
> to see if this is a problem in reality.
> 
> Planning ahead, if I were to submit a patch, is the HttpClient 4.x branch 
> still open for enhancements (or is it mainly just accepting bug fixes at 
> this point)?
> 

Hi Aaron

The 4.5.x branch should be used for bug fix releases only. We ought not to
add new features to it. Theoretically we could create a 4.6.x branch,
but I would very much prefer not to. There will be no problem adding
this feature to 5.0 (trunk).

Oleg
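
For readers of the archive, the per-entry TTL "window" discussed in this
thread could look roughly like the sketch below. It is only an illustration
in plain Java and does not use or reflect any HttpClient API: the class
JitteredTtl and its methods pickTtlMillis and isExpired are hypothetical
names, and the 10-minute window is an assumed value (the 1-hour base comes
from the example earlier in the thread). Each connection draws its own TTL
once, at creation, and expiry is checked when the connection is leased.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper for illustration only; not part of HttpClient. */
final class JitteredTtl {
    private JitteredTtl() {}

    /**
     * Picks a TTL uniformly from [baseMillis - windowMillis, baseMillis],
     * so connections created together do not all expire together.
     */
    static long pickTtlMillis(long baseMillis, long windowMillis) {
        if (baseMillis <= 0 || windowMillis < 0 || windowMillis > baseMillis) {
            throw new IllegalArgumentException("need 0 <= window <= base and base > 0");
        }
        // nextLong(bound) returns a value in [0, bound), hence the +1.
        return baseMillis - ThreadLocalRandom.current().nextLong(windowMillis + 1);
    }

    /** Expiry test meant to be applied when a pooled connection is leased. */
    static boolean isExpired(long createdMillis, long ttlMillis, long nowMillis) {
        return nowMillis - createdMillis >= ttlMillis;
    }

    public static void main(String[] args) {
        long base = TimeUnit.HOURS.toMillis(1);       // "e.g. 1 hour" from the thread
        long window = TimeUnit.MINUTES.toMillis(10);  // assumed jitter window
        for (int i = 0; i < 5; i++) {
            System.out.println("connection " + i + " ttl(ms) = "
                    + pickTtlMillis(base, window));
        }
    }
}
```

A window of zero reduces this to the fixed TTL behaviour, so the two
policies can be compared by changing a single value.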


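The concern about connections expiring together can also be estimated
offline before any real testing. The sketch below is a rough model in plain
Java, not a measurement of HttpClient: it assumes 50 connections (the figure
used in the thread) created evenly over a 2-second start-up ramp, a 1-hour
TTL, and an optional 10-minute jitter, and counts how many expiry instants
fall within any 5-second window. The class ExpirySpread, its methods, the
ramp, the jitter, and the window length are all assumptions made for
illustration. It also ignores the pool's preference for reusing the newest
connections, which would soften the effect in practice.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical model for illustration only; not part of HttpClient. */
final class ExpirySpread {
    private ExpirySpread() {}

    /** Largest number of expiry instants inside any half-open window [t, t + windowMillis). */
    static int maxExpiriesInWindow(long[] expiryMillis, long windowMillis) {
        long[] sorted = expiryMillis.clone();
        Arrays.sort(sorted);
        int best = 0;
        int start = 0;
        for (int end = 0; end < sorted.length; end++) {
            // Shrink the window from the left until it spans less than windowMillis.
            while (sorted[end] - sorted[start] >= windowMillis) {
                start++;
            }
            best = Math.max(best, end - start + 1);
        }
        return best;
    }

    /** Expiry instants for connections created evenly over rampMillis, TTL reduced by random jitter. */
    static long[] expiryTimes(int connections, long rampMillis, long ttlMillis,
                              long jitterMillis, Random rng) {
        long[] out = new long[connections];
        for (int i = 0; i < connections; i++) {
            long created = rampMillis * i / connections;
            // Jitter is drawn from [0, jitterMillis]; zero means a fixed TTL.
            long jitter = jitterMillis == 0 ? 0 : (long) (rng.nextDouble() * (jitterMillis + 1));
            out[i] = created + ttlMillis - jitter;
        }
        return out;
    }

    public static void main(String[] args) {
        long ttl = 3_600_000L; // 1 hour
        long[] fixed = expiryTimes(50, 2_000L, ttl, 0L, new Random(42));
        long[] jittered = expiryTimes(50, 2_000L, ttl, 600_000L, new Random(42));
        System.out.println("fixed TTL, max expiries in 5 s:    " + maxExpiriesInWindow(fixed, 5_000L));
        System.out.println("jittered TTL, max expiries in 5 s: " + maxExpiriesInWindow(jittered, 5_000L));
    }
}
```

With a fixed TTL every connection in this model expires inside the same
5-second window, since they were all created within 2 seconds of each
other; with jitter the expiries are spread over the jitter interval.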