Re: SocketTimeoutExceptions When Communicating with a Load Balancer

2020-02-12 Thread Alexey Panchenko
Evan,

> I can do an ugly hack to look for a cause, and if it matches connection
> timeout, pull it from the immediate cause; otherwise, grab the root cause
> to handle others (there are other custom exceptions that are wrapped by
> RestClientException, hence the reason).
>
I think just using getCause() should be enough always.


> Was there a reason why the Component's implementation has connection
> exception wrapping socket exception?
>
as Oleg aleady explained "to help distinguish connect timeout from socket
(read operation) timeout"
In the first case the remote service is probably not running, and in the
second it's running but processing takes longer than client wants.


Re: SocketTimeoutExceptions When Communicating with a Load Balancer

2020-02-12 Thread Evan J
On Wed, Feb 12, 2020 at 9:33 AM Alexey Panchenko 
wrote:

> Ah, it's actually much simpler than I thought.
> ConnectTimeoutException wraps SocketTimeoutExceptions
> but you use ExceptionUtils.getRootCause which retrieves the inner most
> exception - which is SocketTimeoutException
>
> so
> 1. logging or posting a full stackrace would make the reason obvious from
> the beginning
> 2.getRootCause() is not always the right thing to do
>
>
> On Wed, Feb 12, 2020 at 5:22 AM Evan J 
> wrote:
>
> > On Tue, Feb 11, 2020 at 3:08 PM Alexey Panchenko <
> alex.panche...@gmail.com
> > >
> > wrote:
> >
> > > I guess, It can depends on which level load balancer operates.
> > > if load balancer is layer 4 (TCP) then for the operating system it
> looks
> > > like connection is made directly to the target application. In this
> case
> > > exception should happen during connect and be wrapped as
> > > ConnectTimeoutException
> > > but if load balancer is layer 7 (HTTP) then TCP connection is made to
> the
> > > load balancer (which possibly reads the whole request into memory) and
> > then
> > > it's trying to establish another TCP connection to the target
> application
> > > and forward request there.
> > > While that happens, the client thinks that it already has sent the
> > request
> > > to the target application and is already waiting for a response.
> > > In this situation SocketTimeoutException can happen.
> > >
> > > On Tue, Feb 11, 2020 at 6:23 PM Evan J 
> > > wrote:
> > >
> > > > Thank you for your response.
> > > >
> > > > I'm trying to understand your statement. Are you saying when
> connection
> > > > times out, SocketTimeoutException is thrown, and then the library
> > > captures
> > > > it and rethrows ConnectTimeoutException?
> > > >
> > > > Bottom line, first, is what I explained the expected behavior, and
> > > second,
> > > > how can we properly distinguish between these two types of timeouts?
> > > >
> > > > On Tue, Feb 11, 2020, 4:24 AM Oleg Kalnichevski 
> > > wrote:
> > > >
> > > > > On Mon, 2020-02-10 at 23:25 -0500, Evan J wrote:
> > > > > >  Hi,
> > > > > >
> > > > > > (looks like I'd sent this to a wrong user group originally)
> > > > > >
> > > > > > We deploy an application B (which is basically a backend
> > application
> > > > > > serving a web application) to a cluster of application servers
> > (JBoss
> > > > > > EAP
> > > > > > 7.2 -- 8 instances). These instances send HTTP requests to a set
> of
> > > > > > gateway
> > > > > > applications, call them application Gs (10 instances), where the
> > > > > > requests
> > > > > > go through an F5 loadbalancer that sits between them.
> > > > > >
> > > > > > Instances of application B are Spring Boot applications (2.1.x)
> > that
> > > > > > have
> > > > > > been configured with Apache HttpClient 4.5.5 and HttpCore 4.4.9.
> > The
> > > > > > configuration is almost identical to this github code:
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/spring-framework-guru/sfg-blog-posts/tree/master/resttemplate/src/main/java/guru/springframework/resttemplate/config
> > > > > >
> > > > > > The only exception is that RestTemplateConfig#restTemplate() is
> > > > > > configured
> > > > > > by creation of RestTemplate and passing
> > > > > > HttpComponentsClientHttpRequestFactory to its constructor rather
> > than
> > > > > > using
> > > > > > Spring Boot's RestTemplateBuilder().
> > > > > >
> > > > > > The keep alive configuration in application B is basically
> defined
> > > > > > something like:
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/spring-framework-guru/sfg-blog-posts/blob/0152fb0c4acf08d019128ca38c3dd2523871c43c/resttemplate/src/main/java/guru/springframework/resttemplate/config/ApacheHttpClientConfig.java#L52
> > > > > >
> > > > > > When a load test is executed in a single stack environment where
> > the
> > > > > > load
> > > > > > balancer is omitted (one application B -> one application G), the
> > > > > > requests
> > > > > > and responses are processed and validated accordingly (each
> request
> > > > > > should
> > > > > > correspond to a right response and be sent to the web application
> > > > > > front
> > > > > > end).
> > > > > >
> > > > > > However, when we run the same load test in a distributed
> > environment
> > > > > > with a
> > > > > > load balancer between application B's and application G's, we
> see a
> > > > > > lot of
> > > > > > SocketTimeoutExceptions being logged, and we notice them very
> > quickly
> > > > > > (about 5% of total of responses in application B throw that
> > > > > > exception).
> > > > > >
> > > > > > The code structure is very straightforward:
> > > > > >
> > > > > > try {
> > > > > >   // RestTemplate call
> > > > > > } catch (RestClientException exception) {
> > > > > >   Exception rootCause = ExceptionUtils.getRootCause(exception);
> //
> > > > > > Apache
> > > > > > Commons lib
> > > > > >   if (rootCause != null) {
> > > > > >   

Re: SocketTimeoutExceptions When Communicating with a Load Balancer

2020-02-12 Thread Alexey Panchenko
Ah, it's actually much simpler than I thought.
ConnectTimeoutException wraps SocketTimeoutExceptions
but you use ExceptionUtils.getRootCause which retrieves the inner most
exception - which is SocketTimeoutException

so
1. logging or posting a full stackrace would make the reason obvious from
the beginning
2.getRootCause() is not always the right thing to do


On Wed, Feb 12, 2020 at 5:22 AM Evan J  wrote:

> On Tue, Feb 11, 2020 at 3:08 PM Alexey Panchenko  >
> wrote:
>
> > I guess, It can depends on which level load balancer operates.
> > if load balancer is layer 4 (TCP) then for the operating system it looks
> > like connection is made directly to the target application. In this case
> > exception should happen during connect and be wrapped as
> > ConnectTimeoutException
> > but if load balancer is layer 7 (HTTP) then TCP connection is made to the
> > load balancer (which possibly reads the whole request into memory) and
> then
> > it's trying to establish another TCP connection to the target application
> > and forward request there.
> > While that happens, the client thinks that it already has sent the
> request
> > to the target application and is already waiting for a response.
> > In this situation SocketTimeoutException can happen.
> >
> > On Tue, Feb 11, 2020 at 6:23 PM Evan J 
> > wrote:
> >
> > > Thank you for your response.
> > >
> > > I'm trying to understand your statement. Are you saying when connection
> > > times out, SocketTimeoutException is thrown, and then the library
> > captures
> > > it and rethrows ConnectTimeoutException?
> > >
> > > Bottom line, first, is what I explained the expected behavior, and
> > second,
> > > how can we properly distinguish between these two types of timeouts?
> > >
> > > On Tue, Feb 11, 2020, 4:24 AM Oleg Kalnichevski 
> > wrote:
> > >
> > > > On Mon, 2020-02-10 at 23:25 -0500, Evan J wrote:
> > > > >  Hi,
> > > > >
> > > > > (looks like I'd sent this to a wrong user group originally)
> > > > >
> > > > > We deploy an application B (which is basically a backend
> application
> > > > > serving a web application) to a cluster of application servers
> (JBoss
> > > > > EAP
> > > > > 7.2 -- 8 instances). These instances send HTTP requests to a set of
> > > > > gateway
> > > > > applications, call them application Gs (10 instances), where the
> > > > > requests
> > > > > go through an F5 loadbalancer that sits between them.
> > > > >
> > > > > Instances of application B are Spring Boot applications (2.1.x)
> that
> > > > > have
> > > > > been configured with Apache HttpClient 4.5.5 and HttpCore 4.4.9.
> The
> > > > > configuration is almost identical to this github code:
> > > > >
> > > > >
> > > >
> > > >
> > >
> >
> https://github.com/spring-framework-guru/sfg-blog-posts/tree/master/resttemplate/src/main/java/guru/springframework/resttemplate/config
> > > > >
> > > > > The only exception is that RestTemplateConfig#restTemplate() is
> > > > > configured
> > > > > by creation of RestTemplate and passing
> > > > > HttpComponentsClientHttpRequestFactory to its constructor rather
> than
> > > > > using
> > > > > Spring Boot's RestTemplateBuilder().
> > > > >
> > > > > The keep alive configuration in application B is basically defined
> > > > > something like:
> > > > >
> > > > >
> > > >
> > > >
> > >
> >
> https://github.com/spring-framework-guru/sfg-blog-posts/blob/0152fb0c4acf08d019128ca38c3dd2523871c43c/resttemplate/src/main/java/guru/springframework/resttemplate/config/ApacheHttpClientConfig.java#L52
> > > > >
> > > > > When a load test is executed in a single stack environment where
> the
> > > > > load
> > > > > balancer is omitted (one application B -> one application G), the
> > > > > requests
> > > > > and responses are processed and validated accordingly (each request
> > > > > should
> > > > > correspond to a right response and be sent to the web application
> > > > > front
> > > > > end).
> > > > >
> > > > > However, when we run the same load test in a distributed
> environment
> > > > > with a
> > > > > load balancer between application B's and application G's, we see a
> > > > > lot of
> > > > > SocketTimeoutExceptions being logged, and we notice them very
> quickly
> > > > > (about 5% of total of responses in application B throw that
> > > > > exception).
> > > > >
> > > > > The code structure is very straightforward:
> > > > >
> > > > > try {
> > > > >   // RestTemplate call
> > > > > } catch (RestClientException exception) {
> > > > >   Exception rootCause = ExceptionUtils.getRootCause(exception); //
> > > > > Apache
> > > > > Commons lib
> > > > >   if (rootCause != null) {
> > > > > if
> > > > >
> (SocketTimeoutException.getClass().getName().equals(rootCause.getClas
> > > > > s().getName())
> > > > > {
> > > > > // or even if (rootCause instanceof SocketTimeoutException) {
> > > > >   // Log for socket timeout
> > > > > }
> > > > > if
> > > > >
> (ConnectTimeoutException.getClass().getName().equals(rootCause.getCla
> > > > 

Re: SocketTimeoutExceptions When Communicating with a Load Balancer

2020-02-11 Thread Evan J
On Tue, Feb 11, 2020 at 3:08 PM Alexey Panchenko 
wrote:

> I guess, It can depends on which level load balancer operates.
> if load balancer is layer 4 (TCP) then for the operating system it looks
> like connection is made directly to the target application. In this case
> exception should happen during connect and be wrapped as
> ConnectTimeoutException
> but if load balancer is layer 7 (HTTP) then TCP connection is made to the
> load balancer (which possibly reads the whole request into memory) and then
> it's trying to establish another TCP connection to the target application
> and forward request there.
> While that happens, the client thinks that it already has sent the request
> to the target application and is already waiting for a response.
> In this situation SocketTimeoutException can happen.
>
> On Tue, Feb 11, 2020 at 6:23 PM Evan J 
> wrote:
>
> > Thank you for your response.
> >
> > I'm trying to understand your statement. Are you saying when connection
> > times out, SocketTimeoutException is thrown, and then the library
> captures
> > it and rethrows ConnectTimeoutException?
> >
> > Bottom line, first, is what I explained the expected behavior, and
> second,
> > how can we properly distinguish between these two types of timeouts?
> >
> > On Tue, Feb 11, 2020, 4:24 AM Oleg Kalnichevski 
> wrote:
> >
> > > On Mon, 2020-02-10 at 23:25 -0500, Evan J wrote:
> > > >  Hi,
> > > >
> > > > (looks like I'd sent this to a wrong user group originally)
> > > >
> > > > We deploy an application B (which is basically a backend application
> > > > serving a web application) to a cluster of application servers (JBoss
> > > > EAP
> > > > 7.2 -- 8 instances). These instances send HTTP requests to a set of
> > > > gateway
> > > > applications, call them application Gs (10 instances), where the
> > > > requests
> > > > go through an F5 loadbalancer that sits between them.
> > > >
> > > > Instances of application B are Spring Boot applications (2.1.x) that
> > > > have
> > > > been configured with Apache HttpClient 4.5.5 and HttpCore 4.4.9. The
> > > > configuration is almost identical to this github code:
> > > >
> > > >
> > >
> > >
> >
> https://github.com/spring-framework-guru/sfg-blog-posts/tree/master/resttemplate/src/main/java/guru/springframework/resttemplate/config
> > > >
> > > > The only exception is that RestTemplateConfig#restTemplate() is
> > > > configured
> > > > by creation of RestTemplate and passing
> > > > HttpComponentsClientHttpRequestFactory to its constructor rather than
> > > > using
> > > > Spring Boot's RestTemplateBuilder().
> > > >
> > > > The keep alive configuration in application B is basically defined
> > > > something like:
> > > >
> > > >
> > >
> > >
> >
> https://github.com/spring-framework-guru/sfg-blog-posts/blob/0152fb0c4acf08d019128ca38c3dd2523871c43c/resttemplate/src/main/java/guru/springframework/resttemplate/config/ApacheHttpClientConfig.java#L52
> > > >
> > > > When a load test is executed in a single stack environment where the
> > > > load
> > > > balancer is omitted (one application B -> one application G), the
> > > > requests
> > > > and responses are processed and validated accordingly (each request
> > > > should
> > > > correspond to a right response and be sent to the web application
> > > > front
> > > > end).
> > > >
> > > > However, when we run the same load test in a distributed environment
> > > > with a
> > > > load balancer between application B's and application G's, we see a
> > > > lot of
> > > > SocketTimeoutExceptions being logged, and we notice them very quickly
> > > > (about 5% of total of responses in application B throw that
> > > > exception).
> > > >
> > > > The code structure is very straightforward:
> > > >
> > > > try {
> > > >   // RestTemplate call
> > > > } catch (RestClientException exception) {
> > > >   Exception rootCause = ExceptionUtils.getRootCause(exception); //
> > > > Apache
> > > > Commons lib
> > > >   if (rootCause != null) {
> > > > if
> > > > (SocketTimeoutException.getClass().getName().equals(rootCause.getClas
> > > > s().getName())
> > > > {
> > > > // or even if (rootCause instanceof SocketTimeoutException) {
> > > >   // Log for socket timeout
> > > > }
> > > > if
> > > > (ConnectTimeoutException.getClass().getName().equals(rootCause.getCla
> > > > ss().getName())
> > > > {
> > > > // or even if (rootCause instanceof  ConnectTimeoutException ) {
> > > >   // Log for connection timeout
> > > > }
> > > >   }
> > > > }
> > > >
> > > > Application B's keep alive has been set to 20 seconds while the
> > > > socket
> > > > timeout has been set to 10 seconds by default (connection timeout to
> > > > 1
> > > > second).
> > > >
> > > > After placing timer to log how long it takes for an exception is
> > > > thrown, we
> > > > saw, the time that it took from the moment RestTemplate is invoked
> > > > till the
> > > > exception is thrown was slightly above 1 second, e.g. 1030 ms, 1045
> > > > 

Re: SocketTimeoutExceptions When Communicating with a Load Balancer

2020-02-11 Thread Alexey Panchenko
I guess, It can depends on which level load balancer operates.
if load balancer is layer 4 (TCP) then for the operating system it looks
like connection is made directly to the target application. In this case
exception should happen during connect and be wrapped as
ConnectTimeoutException
but if load balancer is layer 7 (HTTP) then TCP connection is made to the
load balancer (which possibly reads the whole request into memory) and then
it's trying to establish another TCP connection to the target application
and forward request there.
While that happens, the client thinks that it already has sent the request
to the target application and is already waiting for a response.
In this situation SocketTimeoutException can happen.

On Tue, Feb 11, 2020 at 6:23 PM Evan J  wrote:

> Thank you for your response.
>
> I'm trying to understand your statement. Are you saying when connection
> times out, SocketTimeoutException is thrown, and then the library captures
> it and rethrows ConnectTimeoutException?
>
> Bottom line, first, is what I explained the expected behavior, and second,
> how can we properly distinguish between these two types of timeouts?
>
> On Tue, Feb 11, 2020, 4:24 AM Oleg Kalnichevski  wrote:
>
> > On Mon, 2020-02-10 at 23:25 -0500, Evan J wrote:
> > >  Hi,
> > >
> > > (looks like I'd sent this to a wrong user group originally)
> > >
> > > We deploy an application B (which is basically a backend application
> > > serving a web application) to a cluster of application servers (JBoss
> > > EAP
> > > 7.2 -- 8 instances). These instances send HTTP requests to a set of
> > > gateway
> > > applications, call them application Gs (10 instances), where the
> > > requests
> > > go through an F5 loadbalancer that sits between them.
> > >
> > > Instances of application B are Spring Boot applications (2.1.x) that
> > > have
> > > been configured with Apache HttpClient 4.5.5 and HttpCore 4.4.9. The
> > > configuration is almost identical to this github code:
> > >
> > >
> >
> >
> https://github.com/spring-framework-guru/sfg-blog-posts/tree/master/resttemplate/src/main/java/guru/springframework/resttemplate/config
> > >
> > > The only exception is that RestTemplateConfig#restTemplate() is
> > > configured
> > > by creation of RestTemplate and passing
> > > HttpComponentsClientHttpRequestFactory to its constructor rather than
> > > using
> > > Spring Boot's RestTemplateBuilder().
> > >
> > > The keep alive configuration in application B is basically defined
> > > something like:
> > >
> > >
> >
> >
> https://github.com/spring-framework-guru/sfg-blog-posts/blob/0152fb0c4acf08d019128ca38c3dd2523871c43c/resttemplate/src/main/java/guru/springframework/resttemplate/config/ApacheHttpClientConfig.java#L52
> > >
> > > When a load test is executed in a single stack environment where the
> > > load
> > > balancer is omitted (one application B -> one application G), the
> > > requests
> > > and responses are processed and validated accordingly (each request
> > > should
> > > correspond to a right response and be sent to the web application
> > > front
> > > end).
> > >
> > > However, when we run the same load test in a distributed environment
> > > with a
> > > load balancer between application B's and application G's, we see a
> > > lot of
> > > SocketTimeoutExceptions being logged, and we notice them very quickly
> > > (about 5% of total of responses in application B throw that
> > > exception).
> > >
> > > The code structure is very straightforward:
> > >
> > > try {
> > >   // RestTemplate call
> > > } catch (RestClientException exception) {
> > >   Exception rootCause = ExceptionUtils.getRootCause(exception); //
> > > Apache
> > > Commons lib
> > >   if (rootCause != null) {
> > > if
> > > (SocketTimeoutException.getClass().getName().equals(rootCause.getClas
> > > s().getName())
> > > {
> > > // or even if (rootCause instanceof SocketTimeoutException) {
> > >   // Log for socket timeout
> > > }
> > > if
> > > (ConnectTimeoutException.getClass().getName().equals(rootCause.getCla
> > > ss().getName())
> > > {
> > > // or even if (rootCause instanceof  ConnectTimeoutException ) {
> > >   // Log for connection timeout
> > > }
> > >   }
> > > }
> > >
> > > Application B's keep alive has been set to 20 seconds while the
> > > socket
> > > timeout has been set to 10 seconds by default (connection timeout to
> > > 1
> > > second).
> > >
> > > After placing timer to log how long it takes for an exception is
> > > thrown, we
> > > saw, the time that it took from the moment RestTemplate is invoked
> > > till the
> > > exception is thrown was slightly above 1 second, e.g. 1030 ms, 1045
> > > ms,
> > > 1020 ms, etc.. This led us to increase the connection timeouts from 1
> > > second to 2 seconds, and afterward, we didn't get any timeout
> > > exception of
> > > any sort under the similar load.
> > >
> > > My question is, why is that the majority of exceptions that are being
> > > thrown have 

Re: SocketTimeoutExceptions When Communicating with a Load Balancer

2020-02-11 Thread Evan J
Thank you for your response.

I'm trying to understand your statement. Are you saying when connection
times out, SocketTimeoutException is thrown, and then the library captures
it and rethrows ConnectTimeoutException?

Bottom line, first, is what I explained the expected behavior, and second,
how can we properly distinguish between these two types of timeouts?

On Tue, Feb 11, 2020, 4:24 AM Oleg Kalnichevski  wrote:

> On Mon, 2020-02-10 at 23:25 -0500, Evan J wrote:
> >  Hi,
> >
> > (looks like I'd sent this to a wrong user group originally)
> >
> > We deploy an application B (which is basically a backend application
> > serving a web application) to a cluster of application servers (JBoss
> > EAP
> > 7.2 -- 8 instances). These instances send HTTP requests to a set of
> > gateway
> > applications, call them application Gs (10 instances), where the
> > requests
> > go through an F5 loadbalancer that sits between them.
> >
> > Instances of application B are Spring Boot applications (2.1.x) that
> > have
> > been configured with Apache HttpClient 4.5.5 and HttpCore 4.4.9. The
> > configuration is almost identical to this github code:
> >
> >
>
> https://github.com/spring-framework-guru/sfg-blog-posts/tree/master/resttemplate/src/main/java/guru/springframework/resttemplate/config
> >
> > The only exception is that RestTemplateConfig#restTemplate() is
> > configured
> > by creation of RestTemplate and passing
> > HttpComponentsClientHttpRequestFactory to its constructor rather than
> > using
> > Spring Boot's RestTemplateBuilder().
> >
> > The keep alive configuration in application B is basically defined
> > something like:
> >
> >
>
> https://github.com/spring-framework-guru/sfg-blog-posts/blob/0152fb0c4acf08d019128ca38c3dd2523871c43c/resttemplate/src/main/java/guru/springframework/resttemplate/config/ApacheHttpClientConfig.java#L52
> >
> > When a load test is executed in a single stack environment where the
> > load
> > balancer is omitted (one application B -> one application G), the
> > requests
> > and responses are processed and validated accordingly (each request
> > should
> > correspond to a right response and be sent to the web application
> > front
> > end).
> >
> > However, when we run the same load test in a distributed environment
> > with a
> > load balancer between application B's and application G's, we see a
> > lot of
> > SocketTimeoutExceptions being logged, and we notice them very quickly
> > (about 5% of total of responses in application B throw that
> > exception).
> >
> > The code structure is very straightforward:
> >
> > try {
> >   // RestTemplate call
> > } catch (RestClientException exception) {
> >   Exception rootCause = ExceptionUtils.getRootCause(exception); //
> > Apache
> > Commons lib
> >   if (rootCause != null) {
> > if
> > (SocketTimeoutException.getClass().getName().equals(rootCause.getClas
> > s().getName())
> > {
> > // or even if (rootCause instanceof SocketTimeoutException) {
> >   // Log for socket timeout
> > }
> > if
> > (ConnectTimeoutException.getClass().getName().equals(rootCause.getCla
> > ss().getName())
> > {
> > // or even if (rootCause instanceof  ConnectTimeoutException ) {
> >   // Log for connection timeout
> > }
> >   }
> > }
> >
> > Application B's keep alive has been set to 20 seconds while the
> > socket
> > timeout has been set to 10 seconds by default (connection timeout to
> > 1
> > second).
> >
> > After placing timer to log how long it takes for an exception is
> > thrown, we
> > saw, the time that it took from the moment RestTemplate is invoked
> > till the
> > exception is thrown was slightly above 1 second, e.g. 1030 ms, 1045
> > ms,
> > 1020 ms, etc.. This led us to increase the connection timeouts from 1
> > second to 2 seconds, and afterward, we didn't get any timeout
> > exception of
> > any sort under the similar load.
> >
> > My question is, why is that the majority of exceptions that are being
> > thrown have SocketTimeoutExceptions type as opposed to
> > ConnectTimeoutExceptions which, based on the timeout adjustment
> > mentioned
> > above, appears to be the latter (Connect) vs. socket (read) timeout?
> > Note
> > that I said the majority of time, as I've seen a few
> > ConnectTimeoutExceptions as well, but almost 99% of the failed ones
> > are
> > SocketTimeoutExceptions.
> >
> > Also, in our logs, we log the "rootCause's" class name to avoid
> > ambiguity,
> > but as I mentioned, they are being logged as SocketTimeoutExceptions
> > class
> > names.
> >
> > What is Apache Components library doing under the hood that signals
> > the
> > underlying JDK code to throw SocketTimeoutExceptions rather than
> > ConnectTimeoutException
>
> Connection management code in HttpClient re-throws
> SocketTimeoutExceptions thrown while connecting a socket to its remote
> endpoint as ConnectTimeoutException to help distinguish connect timeout
> from socket (read operation) timeout. SocketTimeoutExceptions are
> 

Re: SocketTimeoutExceptions When Communicating with a Load Balancer

2020-02-11 Thread Oleg Kalnichevski
On Mon, 2020-02-10 at 23:25 -0500, Evan J wrote:
>  Hi,
> 
> (looks like I'd sent this to a wrong user group originally)
> 
> We deploy an application B (which is basically a backend application
> serving a web application) to a cluster of application servers (JBoss
> EAP
> 7.2 -- 8 instances). These instances send HTTP requests to a set of
> gateway
> applications, call them application Gs (10 instances), where the
> requests
> go through an F5 loadbalancer that sits between them.
> 
> Instances of application B are Spring Boot applications (2.1.x) that
> have
> been configured with Apache HttpClient 4.5.5 and HttpCore 4.4.9. The
> configuration is almost identical to this github code:
> 
> 
https://github.com/spring-framework-guru/sfg-blog-posts/tree/master/resttemplate/src/main/java/guru/springframework/resttemplate/config
> 
> The only exception is that RestTemplateConfig#restTemplate() is
> configured
> by creation of RestTemplate and passing
> HttpComponentsClientHttpRequestFactory to its constructor rather than
> using
> Spring Boot's RestTemplateBuilder().
> 
> The keep alive configuration in application B is basically defined
> something like:
> 
> 
https://github.com/spring-framework-guru/sfg-blog-posts/blob/0152fb0c4acf08d019128ca38c3dd2523871c43c/resttemplate/src/main/java/guru/springframework/resttemplate/config/ApacheHttpClientConfig.java#L52
> 
> When a load test is executed in a single stack environment where the
> load
> balancer is omitted (one application B -> one application G), the
> requests
> and responses are processed and validated accordingly (each request
> should
> correspond to a right response and be sent to the web application
> front
> end).
> 
> However, when we run the same load test in a distributed environment
> with a
> load balancer between application B's and application G's, we see a
> lot of
> SocketTimeoutExceptions being logged, and we notice them very quickly
> (about 5% of total of responses in application B throw that
> exception).
> 
> The code structure is very straightforward:
> 
> try {
>   // RestTemplate call
> } catch (RestClientException exception) {
>   Exception rootCause = ExceptionUtils.getRootCause(exception); //
> Apache
> Commons lib
>   if (rootCause != null) {
> if
> (SocketTimeoutException.getClass().getName().equals(rootCause.getClas
> s().getName())
> {
> // or even if (rootCause instanceof SocketTimeoutException) {
>   // Log for socket timeout
> }
> if
> (ConnectTimeoutException.getClass().getName().equals(rootCause.getCla
> ss().getName())
> {
> // or even if (rootCause instanceof  ConnectTimeoutException ) {
>   // Log for connection timeout
> }
>   }
> }
> 
> Application B's keep alive has been set to 20 seconds while the
> socket
> timeout has been set to 10 seconds by default (connection timeout to
> 1
> second).
> 
> After placing timer to log how long it takes for an exception is
> thrown, we
> saw, the time that it took from the moment RestTemplate is invoked
> till the
> exception is thrown was slightly above 1 second, e.g. 1030 ms, 1045
> ms,
> 1020 ms, etc.. This led us to increase the connection timeouts from 1
> second to 2 seconds, and afterward, we didn't get any timeout
> exception of
> any sort under the similar load.
> 
> My question is, why is that the majority of exceptions that are being
> thrown have SocketTimeoutExceptions type as opposed to
> ConnectTimeoutExceptions which, based on the timeout adjustment
> mentioned
> above, appears to be the latter (Connect) vs. socket (read) timeout?
> Note
> that I said the majority of time, as I've seen a few
> ConnectTimeoutExceptions as well, but almost 99% of the failed ones
> are
> SocketTimeoutExceptions.
> 
> Also, in our logs, we log the "rootCause's" class name to avoid
> ambiguity,
> but as I mentioned, they are being logged as SocketTimeoutExceptions
> class
> names.
> 
> What is Apache Components library doing under the hood that signals
> the
> underlying JDK code to throw SocketTimeoutExceptions rather than
> ConnectTimeoutException

Connection management code in HttpClient re-throws
SocketTimeoutExceptions thrown while connecting a socket to its remote
endpoint as ConnectTimeoutException to help distinguish connect timeout
from socket (read operation) timeout. SocketTimeoutExceptions are
always thrown by the JRE, not by HttpClient.

Oleg



-
To unsubscribe, e-mail: httpclient-users-unsubscr...@hc.apache.org
For additional commands, e-mail: httpclient-users-h...@hc.apache.org



SocketTimeoutExceptions When Communicating with a Load Balancer

2020-02-10 Thread Evan J
 Hi,

(looks like I'd sent this to a wrong user group originally)

We deploy an application B (which is basically a backend application
serving a web application) to a cluster of application servers (JBoss EAP
7.2 -- 8 instances). These instances send HTTP requests to a set of gateway
applications, call them application Gs (10 instances), where the requests
go through an F5 loadbalancer that sits between them.

Instances of application B are Spring Boot applications (2.1.x) that have
been configured with Apache HttpClient 4.5.5 and HttpCore 4.4.9. The
configuration is almost identical to this github code:

https://github.com/spring-framework-guru/sfg-blog-posts/tree/master/resttemplate/src/main/java/guru/springframework/resttemplate/config

The only exception is that RestTemplateConfig#restTemplate() is configured
by creation of RestTemplate and passing
HttpComponentsClientHttpRequestFactory to its constructor rather than using
Spring Boot's RestTemplateBuilder().

The keep alive configuration in application B is basically defined
something like:

https://github.com/spring-framework-guru/sfg-blog-posts/blob/0152fb0c4acf08d019128ca38c3dd2523871c43c/resttemplate/src/main/java/guru/springframework/resttemplate/config/ApacheHttpClientConfig.java#L52

When a load test is executed in a single stack environment where the load
balancer is omitted (one application B -> one application G), the requests
and responses are processed and validated accordingly (each request should
correspond to a right response and be sent to the web application front
end).

However, when we run the same load test in a distributed environment with a
load balancer between application B's and application G's, we see a lot of
SocketTimeoutExceptions being logged, and we notice them very quickly
(about 5% of total of responses in application B throw that exception).

The code structure is very straightforward:

try {
  // RestTemplate call
} catch (RestClientException exception) {
  Exception rootCause = ExceptionUtils.getRootCause(exception); // Apache
Commons lib
  if (rootCause != null) {
if
(SocketTimeoutException.getClass().getName().equals(rootCause.getClass().getName())
{
// or even if (rootCause instanceof SocketTimeoutException) {
  // Log for socket timeout
}
if
(ConnectTimeoutException.getClass().getName().equals(rootCause.getClass().getName())
{
// or even if (rootCause instanceof  ConnectTimeoutException ) {
  // Log for connection timeout
}
  }
}

Application B's keep alive has been set to 20 seconds while the socket
timeout has been set to 10 seconds by default (connection timeout to 1
second).

After placing timer to log how long it takes for an exception is thrown, we
saw, the time that it took from the moment RestTemplate is invoked till the
exception is thrown was slightly above 1 second, e.g. 1030 ms, 1045 ms,
1020 ms, etc.. This led us to increase the connection timeouts from 1
second to 2 seconds, and afterward, we didn't get any timeout exception of
any sort under the similar load.

My question is, why is that the majority of exceptions that are being
thrown have SocketTimeoutExceptions type as opposed to
ConnectTimeoutExceptions which, based on the timeout adjustment mentioned
above, appears to be the latter (Connect) vs. socket (read) timeout? Note
that I said the majority of time, as I've seen a few
ConnectTimeoutExceptions as well, but almost 99% of the failed ones are
SocketTimeoutExceptions.

Also, in our logs, we log the "rootCause's" class name to avoid ambiguity,
but as I mentioned, they are being logged as SocketTimeoutExceptions class
names.

What is Apache Components library doing under the hood that signals the
underlying JDK code to throw SocketTimeoutExceptions rather than
ConnectTimeoutException?