On Tue, 2016-05-03 at 09:15 -0700, Aaron Curley wrote:
> On 5/3/2016 8:22 AM, Oleg Kalnichevski wrote:
> > On Mon, 2016-05-02 at 21:28 -0700, Aaron Curley wrote:
> >> Hello all,
> >>
> >> I'm using HttpClient version 4.5.1 to make high-volume calls to an
> >> internal, highly available web service that is hosted at multiple
> >> geographic locations. On average, the web service calls need to be
> >> executed rather quickly; therefore, we are presently making heavy use
> >> of HttpClient's connection pooling & reuse behavior (via
> >> PoolingHttpClientConnectionManager) to alleviate the overhead (TCP
> >> and TLS) incurred when setting up a new connection. Our DNS records
> >> direct clients of the service to the (geographically) nearest data
> >> center. Should a data center experience problems, we update our DNS
> >> records to redirect our clients to our remaining "good" site(s).
> >>
> >> During an outage, HttpClient appears to have automatically failed
> >> over to our alternate sites (IPs) fairly quickly and consistently;
> >> however, upon service restoration at the primary data center for a
> >> particular client node, HttpClient does not appear to reliably switch
> >> back to calling our primary site. (In such instances, I have been
> >> able to confirm that our DNS records ARE switching back in a timely
> >> manner. The TTLs are set to an appropriately short value.)
> >>
> >> My best guess is that the lack of a consistent "switch back" to the
> >> primary site is caused by HttpClient's connection pooling & reuse
> >> behavior. Because of our relatively high (and consistent) request
> >> volume, creation of NEW connections is likely a rarity once a
> >> particular client node has been operating for a while; instead, as
> >> previously noted, I would generally expect connections in the pool to
> >> be re-used whenever possible. Any re-used connections will still be
> >> established with the alternate site(s); therefore, client nodes
> >> communicating with the alternate sites would generally never (or only
> >> VERY gradually) switch back to communicating with the primary site.
> >> This lines up with what I have observed; "switching back" seems to
> >> happen only once request throughput drops enough to allow most of a
> >> client node's pooled connections to "time out" and be closed due to
> >> inactivity (e.g. during overnight hours).
> >>
> >> I believe a reasonably "standard" way to solve this problem would be
> >> to configure a maximum 'lifetime' for a connection in the connection
> >> pool (e.g. 1 hour). This lifetime would be enforced regardless of
> >> whether the connection is idle or can otherwise be re-used. At first
> >> glance, the HttpClientBuilder.setConnectionTimeToLive() method seemed
> >> ideal for this, but upon further investigation and review of the
> >> HttpClient code base, this method appears to configure the maximum
> >> TTL without introducing any element of randomness into each
> >> connection's TTL. As a result, I'm concerned that if I enable the
> >> built-in TTL feature, my clients are likely to experience regular
> >> performance "spikes" at the configured TTL interval. (This would be
> >> caused by most/all of the pooled connections expiring simultaneously,
> >> since they were mostly all created at once, at application start-up.)
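For reference, a minimal sketch of the kind of pooled client configuration
being discussed, using the builder's built-in TTL. The pool sizes and the
one-hour TTL are illustrative assumptions, not values taken from this
thread:

    import java.util.concurrent.TimeUnit;

    import org.apache.http.impl.client.CloseableHttpClient;
    import org.apache.http.impl.client.HttpClients;

    public class PooledClientFactory {

        // Illustrative pool sizes and TTL; real values would come from configuration.
        public static CloseableHttpClient newClient() {
            return HttpClients.custom()
                    .setMaxConnTotal(50)        // total pooled connections
                    .setMaxConnPerRoute(50)     // effectively one route (the service's DNS name)
                    // Built-in hard TTL: connections older than one hour are not
                    // re-used, and no jitter is applied to that cut-off.
                    .setConnectionTimeToLive(1, TimeUnit.HOURS)
                    .build();
        }
    }

Note that the TTL here applies uniformly to every connection, which is
exactly the "no randomness" property raised above.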
> >>
> >> Instead of using the built-in TTL limit setting, I considered
> >> overriding the time to live using the available callbacks
> >> (ConnectionKeepAliveStrategy, ConnectionReuseStrategy).
> >> Unfortunately, this approach appears to be infeasible because the
> >> parameters to those callbacks do not have access to the underlying
> >> connection object; there is no "key" that could be used to look up a
> >> connection's lifetime (e.g. in a ConcurrentHashMap) such that a
> >> decision could be made about whether to close or retain the
> >> connection. I also took a look at the various places where I could
> >> try to override the default connection pooling behavior (e.g.
> >> MainClientExec, HttpClientConnectionManager).
> >> HttpClientConnectionManager appears to be the best bet (specifically,
> >> HttpClientConnectionManager.releaseConnection() would have to be
> >> overridden), but this would require duplicating the existing
> >> releaseConnection() code, with slight modifications, in the
> >> overriding class. This seems brittle.
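To make the callback limitation above concrete, here is a minimal sketch of
a custom ConnectionKeepAliveStrategy. The class name and the 30-second cap
are arbitrary illustrations; the point is that the declared parameters are
only the response and the execution context, so there is no connection
handle in the signature that could serve as a key for a per-connection
creation-time lookup:

    import org.apache.http.HttpResponse;
    import org.apache.http.conn.ConnectionKeepAliveStrategy;
    import org.apache.http.impl.client.DefaultConnectionKeepAliveStrategy;
    import org.apache.http.protocol.HttpContext;

    // Hypothetical example: only the idle keep-alive time can be influenced
    // here, not a connection's total lifetime.
    public class CappedKeepAliveStrategy implements ConnectionKeepAliveStrategy {

        private final ConnectionKeepAliveStrategy fallback =
                new DefaultConnectionKeepAliveStrategy();

        @Override
        public long getKeepAliveDuration(HttpResponse response, HttpContext context) {
            // Honour the server's Keep-Alive hint, if any, but never keep an
            // idle connection around for more than 30 seconds.
            long serverHintMillis = fallback.getKeepAliveDuration(response, context);
            return serverHintMillis > 0 ? Math.min(serverHintMillis, 30_000L) : 30_000L;
        }
    }

Such a strategy would be registered via
HttpClientBuilder.setKeepAliveStrategy(new CappedKeepAliveStrategy()).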
> >>
> >> Can anyone think of a way that I could implement this (cleanly) with
> >> HttpClient 4.x? Maybe I missed something?
> >>
> >> If not, I would be happy to open a JIRA for a possible HttpClient
> >> enhancement to enable such a feature. If people are open to this
> >> idea, I was generally thinking that adding a more generic callback
> >> might be the best approach (since my use case may not be other
> >> people's use case), but I could also be convinced to make the
> >> enhancement request specifically support a connection expiration
> >> "window" for the TTL feature instead of a specific "hard limit". Any
> >> thoughts on this?
> >>
> >> Thanks in advance (and sorry for the long email)!
> >>
> > Hi Aaron
> >
> > PoolingHttpClientConnectionManager does not pro-actively evict expired
> > connections by default. I think it is unlikely that connections with a
> > finite TTL would all get evicted at the same time. Having said that,
> > you are welcome to contribute a patch enabling a TTL setting on a
> > per-pool-entry basis.
> >
> > Oleg
>
> Hi Oleg,
>
> Thanks for your response. I would be happy to submit a patch, but
> before doing so, I'd want to confirm that my concerns about the current
> TTL implementation are actually real.
>
> Due to the lack of background threads, the connection pool would only
> remove an expired connection synchronously (i.e. after a request using
> that connection was completed), right?

Not quite. Connection validity is checked upon connection lease, not upon
connection release.

> When you say "PoolingHttpClientConnectionManager does not pro-actively
> evict expired connections ...", are you referring to this synchronous
> eviction behavior or something else?

Yes.

> If the former, my concern is that our clients perform multiple
> transactions per second. If a connection pool has, say, 50 connections
> (that were mostly created very close together, due to high throughput
> at startup), and my client is executing 25 requests/sec, then
> theoretically, once the TTL expires, all connections in the pool would
> likely have been used within a few seconds (and thus closed as
> expired). Am I off-base?

No, I think you are not. However, for such a condition to occur one would
need to exhaust the entire pool of connections virtually instantaneously,
which is not very likely (*), because the pool manager always tries to
reuse the newest connections first.

(*) The only problem might be the slow setup of TLS-encrypted connections.

Oleg

> Regards,
>
> Aaron Curley
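As a footnote to the point that expired connections are only detected at
lease time: the HttpClientBuilder in the 4.5.x line used here also offers
options that start a small background eviction thread. A minimal sketch,
with illustrative timeout values not taken from this thread:

    import java.util.concurrent.TimeUnit;

    import org.apache.http.impl.client.CloseableHttpClient;
    import org.apache.http.impl.client.HttpClients;

    public class EvictingClientFactory {

        // Illustrative values only.
        public static CloseableHttpClient newClient() {
            return HttpClients.custom()
                    .setConnectionTimeToLive(1, TimeUnit.HOURS)
                    // Start a background thread that closes expired connections
                    // instead of waiting for them to be detected at lease time.
                    .evictExpiredConnections()
                    // Optionally also close connections that have been idle
                    // for more than 30 seconds.
                    .evictIdleConnections(30L, TimeUnit.SECONDS)
                    .build();
        }
    }

This does not add any jitter to the TTL, but it does mean expired
connections are closed promptly rather than lingering in the pool until
they are next leased.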