Re: HttpClient with a web service: Consistent data center switch-over

2016-05-06 Thread Oleg Kalnichevski

Re: HttpClient with a web service: Consistent data center switch-over

2016-05-06 Thread Aaron Curley


Re: HttpClient with a web service: Consistent data center switch-over

2016-05-06 Thread Oleg Kalnichevski

Re: HttpClient with a web service: Consistent data center switch-over

2016-05-05 Thread Aaron Curley


Re: HttpClient with a web service: Consistent data center switch-over

2016-05-03 Thread Oleg Kalnichevski

Re: HttpClient with a web service: Consistent data center switch-over

2016-05-03 Thread Aaron Curley

On 5/3/2016 8:22 AM, Oleg Kalnichevski wrote:

On Mon, 2016-05-02 at 21:28 -0700, Aaron Curley wrote:

Hello all,

I'm using HttpClient version 4.5.1 to make high-volume calls to an internal highly-available 
web service that is hosted at multiple geographic locations.  On average, the web service 
calls need to be executed rather quickly; therefore, we are presently making heavy use of 
HttpClient's connection pooling & reuse behavior (via PoolingHttpClientConnectionManager) 
to alleviate the overhead (TCP and TLS) incurred when setting up a new connection.  Our DNS 
records direct clients of the service to the (geographically) nearest data center. Should a 
data center experience problems, we update our DNS records to redirect our clients to our 
remaining "good" site(s).
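
For reference, a minimal sketch of roughly this kind of client setup (the pool sizes, 
timeouts, and class name below are illustrative assumptions, not our real values):

    import org.apache.http.client.config.RequestConfig;
    import org.apache.http.impl.client.CloseableHttpClient;
    import org.apache.http.impl.client.HttpClients;
    import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;

    public class PooledClientFactory {
        public static CloseableHttpClient create() {
            // Shared pool so established (TCP/TLS) connections are re-used across requests.
            PoolingHttpClientConnectionManager cm = new PoolingHttpClientConnectionManager();
            cm.setMaxTotal(200);            // illustrative
            cm.setDefaultMaxPerRoute(200);  // all traffic targets a single logical route

            RequestConfig rc = RequestConfig.custom()
                    .setConnectTimeout(2000)            // ms, illustrative
                    .setSocketTimeout(5000)             // ms, illustrative
                    .setConnectionRequestTimeout(1000)  // ms, illustrative
                    .build();

            return HttpClients.custom()
                    .setConnectionManager(cm)
                    .setDefaultRequestConfig(rc)
                    .build();
        }
    }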

In times of an outage, HttpClient appears to have automatically failed over to 
our alternate sites (IPs) fairly promptly and consistently; however, upon service 
restoration at the primary data center for a particular client node, HttpClient 
does not appear to reliably switch back to calling our primary site.  (In such 
instances, I have been able to confirm that our DNS records ARE switching back 
in a timely manner.  The TTLs are set to an appropriately short value.)

My best guess is that the lack of a consistent "switch back" to the primary site is caused by 
HttpClient's connection pooling & reuse behavior.  Because of our relatively high (and consistent) 
request volume, creation of NEW connections is likely a rarity once a particular client node has been 
operating for a while; instead, as previously noted, I would generally expect connections in the pool 
to be re-used whenever possible.  Any re-used connections will still be established with the alternate 
site(s); therefore, the client nodes communicating with alternate sites would generally never (or only 
VERY gradually) switch back to communicating with the primary site.  This lines up with what I have 
observed: "switching back" seems to happen only once request throughput drops enough to allow most of 
our client nodes' pooled connections to "time out" and be closed due to inactivity (e.g. during 
overnight hours).


I believe a reasonably "standard" way to solve this problem would be to configure a 
maximum 'lifetime' for a connection in the connection pool (e.g. 1 hour).  This lifetime would be 
enforced regardless of whether the connection is idle or can otherwise be re-used.  At first 
glance, the HttpClientBuilder.setConnectionTimeToLive() method seemed ideal for this, but upon 
further investigation and review of the HttpClient code base, this method appears to configure the 
maximum TTL without introducing any element of randomness into each connection's TTL.  As a result, 
I'm concerned that if I enable the built-in TTL feature, my clients are likely to experience 
regular performance "spikes" at the configured TTL interval.  (These would be caused by most/all 
of the pooled connections expiring simultaneously, since they were mostly all created at once, at 
application start-up.)
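
For illustration, the built-in option under discussion looks roughly like this (the 
30-minute value is an arbitrary example); it is a fixed cap with no jitter:

    import java.util.concurrent.TimeUnit;
    import org.apache.http.impl.client.CloseableHttpClient;
    import org.apache.http.impl.client.HttpClients;

    public class FixedTtlClientFactory {
        public static CloseableHttpClient create() {
            // Every pooled connection is discarded once it is 30 minutes old,
            // no matter how recently it was used.  Connections created in a
            // burst at application start-up all reach this limit at roughly
            // the same time.
            return HttpClients.custom()
                    .setConnectionTimeToLive(30, TimeUnit.MINUTES)
                    .build();
        }
    }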


Instead of using the built-in TTL limit setting, I considered overriding the time to live 
using the available callbacks (ConnectionKeepAliveStrategy, ConnectionReuseStrategy).  
Unfortunately, this approach appears to be infeasible because the parameters to those 
callbacks do not provide access to the underlying connection object; there is no 
"key" that could be used to look up a connection's lifetime (e.g. in a 
ConcurrentHashMap) so that a decision could be made about whether to close or retain 
the connection.  I also took a look at the various places where I could try to override 
the default connection pooling behavior (e.g. MainClientExec, 
HttpClientConnectionManager).  HttpClientConnectionManager appears to be the best bet 
(specifically, HttpClientConnectionManager.releaseConnection() would have to be 
overridden), but this would require duplicating the existing releaseConnection() code 
with slight modifications in the overriding class.  This seems brittle.
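
To make the limitation concrete: the keep-alive callback only sees the response and the 
execution context, so the most it can do is cap how long an idle connection may be kept 
alive (with jitter, if desired); it cannot enforce a total lifetime.  A rough sketch, with 
an assumed 60-90 second cap:

    import java.util.concurrent.ThreadLocalRandom;
    import org.apache.http.HttpResponse;
    import org.apache.http.conn.ConnectionKeepAliveStrategy;
    import org.apache.http.impl.client.DefaultConnectionKeepAliveStrategy;
    import org.apache.http.protocol.HttpContext;

    public class JitteredKeepAliveStrategy implements ConnectionKeepAliveStrategy {
        private final ConnectionKeepAliveStrategy delegate = new DefaultConnectionKeepAliveStrategy();

        @Override
        public long getKeepAliveDuration(HttpResponse response, HttpContext context) {
            // Honour the server's Keep-Alive header if present, otherwise cap idle
            // keep-alive at 60-90 seconds chosen at random.  Note that this bounds
            // *idle* time only; a constantly busy connection can still live forever.
            long serverHint = delegate.getKeepAliveDuration(response, context);
            long cap = 60000 + ThreadLocalRandom.current().nextLong(30000);
            return serverHint > 0 ? Math.min(serverHint, cap) : cap;
        }
    }

It would be registered via HttpClients.custom().setKeepAliveStrategy(new JitteredKeepAliveStrategy()).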


Can anyone think of a way that I could implement this (cleanly) with HttpClient 
4.x?  Maybe I missed something?

If not, I would be happy to open a JIRA for a possible HttpClient enhancement to enable such a 
feature.  If people are open to this idea, I was generally thinking that adding a more generic 
callback might be the best approach (since my use case may not match other people's use cases), but 
I could also be convinced to make the enhancement request specifically support a connection 
expiration "window" for the TTL feature instead of a single hard limit.  
Any thoughts on this?
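
In the meantime, one rough approximation of such a "window" that stays within the public 
4.x API would be a background evictor whose idle threshold is re-randomized on each pass, 
so pooled connections are not all closed at the same instant.  A sketch (the sleep interval 
and the 5-15 minute threshold are made-up values):

    import java.util.concurrent.ThreadLocalRandom;
    import java.util.concurrent.TimeUnit;
    import org.apache.http.impl.conn.PoolingHttpClientConnectionManager;

    public class JitteredEvictorThread extends Thread {
        private final PoolingHttpClientConnectionManager cm;

        public JitteredEvictorThread(PoolingHttpClientConnectionManager cm) {
            this.cm = cm;
            setDaemon(true);
        }

        @Override
        public void run() {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    TimeUnit.SECONDS.sleep(30);
                    // Drop connections whose configured TTL (if any) has passed.
                    cm.closeExpiredConnections();
                    // Drop connections idle longer than a randomized threshold
                    // (5-15 minutes here) so closures are spread out over time.
                    long idleMinutes = 5 + ThreadLocalRandom.current().nextLong(11);
                    cm.closeIdleConnections(idleMinutes, TimeUnit.MINUTES);
                }
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
            }
        }
    }

It would be started once, right after the connection manager is created, with 
new JitteredEvictorThread(cm).start().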


Thanks in advance (and sorry for the long email)!



Re: HttpClient with a web service: Consistent data center switch-over

2016-05-03 Thread Oleg Kalnichevski
Hi Aaron

PoolingHttpClientConnectionManager does not pro-actively evict expired
connections by default. I think it is unlikely that connections with a
finite TTL would all get evicted at the same time. Having said that, you
are welcome to contribute a patch enabling TTL
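
For reference, a minimal sketch of enabling proactive eviction together with a finite TTL 
at the builder level (the 30-minute TTL and 1-minute idle limit are arbitrary example values):

    import java.util.concurrent.TimeUnit;
    import org.apache.http.impl.client.CloseableHttpClient;
    import org.apache.http.impl.client.HttpClients;

    public class EvictingClientFactory {
        public static CloseableHttpClient create() {
            return HttpClients.custom()
                    // Hard cap on total connection lifetime.
                    .setConnectionTimeToLive(30, TimeUnit.MINUTES)
                    // Background thread that closes expired connections instead
                    // of waiting for them to be leased and validated.
                    .evictExpiredConnections()
                    // Also close connections that have been idle for over a minute.
                    .evictIdleConnections(1L, TimeUnit.MINUTES)
                    .build();
        }
    }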