Re: NGINX + tomcat 8.0.35 (110: Connection timed out)

2020-11-12 Thread Christopher Schultz

Ayub,

On 11/12/20 11:20, Ayub Khan wrote:

Chris,

That's correct, it's just a plain static hello world page I created to
verify tomcat. It is served by tomcat. I have bundled this page in the same
context where the service is running. When I create load on the service and
then try to access the static hello world page browser keeps busy and does
not return the page.

I checked the database dashboard and the monitoring charts are normal, no
spikes on cpu or any other resources of the database. The delay is
noticeable when there are more than 1000 concurrent requests from each of 4
different JMeter test instances


That's 4000 concurrent requests. Your <Connector> only has 2000 threads,
so only 2000 requests can be processed simultaneously.


You have a keepalive timeout of 6 seconds (6000ms) and I'm guessing your 
load test doesn't actually use KeepAlive.



Why does tomcat not even serve the html page


I think the keepalive timeout explains what you are seeing.

Are you instructing JMeter to re-use connections and also use KeepAlive?

What happens if you set the KeepAlive timeout to 1 second instead of 6? 
Does that improve things?

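For reference, the Connector attributes involved would look something like
this (a sketch only; the port and the other values are illustrative, not
taken from your actual server.xml):

    <Connector port="8080"
               protocol="org.apache.coyote.http11.Http11NioProtocol"
               maxThreads="2000"
               connectionTimeout="20000"
               keepAliveTimeout="1000"
               maxKeepAliveRequests="100"
               URIEncoding="UTF-8"
               redirectPort="8443" />

Note that keepAliveTimeout is in milliseconds and, when it is not set, it
defaults to whatever connectionTimeout is, so a very large connectionTimeout
also keeps idle persistent connections (and their file descriptors) open
longer.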

-chris


On Thu, Nov 12, 2020 at 7:01 PM Christopher Schultz <
ch...@christopherschultz.net> wrote:


Ayub,

On 11/12/20 10:47, Ayub Khan wrote:

Chris,

I am using hikaricp connection pooling and the maximum pool size is set to
100, without specifying minimum idle connections. Even during high load I
see there are more than 80 connections in idle state.

I have setup debug statements to print the total time taken to complete the
request. The response time of completed call during load is around 5
seconds, the response time without load is around 400 to 500 milliseconds


That's a significant difference. Is your database server showing high
CPU usage or more I/O usage during those high-load times?


During the load I cannot even access static html page


Now *that* is an interesting data point.

You are sure that the "static" request doesn't hit any other resources?
No filter is doing anything? No logging to an external service or
double-checking any security constraints in the db before serving the page?

(And the static page is being returned by Tomcat, not nginx, right?)

-chris


On Thu, Nov 12, 2020 at 4:59 PM Christopher Schultz <
ch...@christopherschultz.net> wrote:


Ayub,

On 11/11/20 16:16, Ayub Khan wrote:

I was load testing using the ec2 load balancer dns. I have increased the
connector timeout to 6000 and also gave 32gig to the JVM of tomcat. I am
not seeing connection timeout in nginx logs now. No errors in kernel.log I
am not seeing any errors in tomcat catalina.out.


The timeouts are most likely related to the connection timeout (and
therefore keepalive) setting. If you are proxying connections from nginx
and they should be staying open, you should really never be experiencing
a timeout between nginx and Tomcat.


During regular operations when the request count is between 4 to 6k
requests per minute the open files count for the tomcat process is between
200 to 350. Responses from tomcat are within 5 seconds.


Good.


If the requests count goes beyond 6.5 k open files slowly move up to 2300
to 3000 and the request responses from tomcat become slow.


This is pretty important, here. You are measuring two things:

1. Rise in file descriptor count
2. Application slowness

You are assuming that #1 is causing #2. It's entirely possible that #2
is causing #1.

The real question is "why is the application slowing down". Do you see
CPU spikes? If not, check your db connections.

If your db connection pool is fully-utilized (no more available), then
you may have lots of request processing threads sitting there waiting on
db connections. You'd see a rise in incoming connections (waiting) which
aren't making any progress, and the application seems to "slow down",
and there is a snowball effect where more requests means more waiting,
and therefore more slowness. This would manifest as slow response times
without any CPU spike.

You could also have a slow database and/or some other resource such as a
downstream web service.

I would investigate those options before trying to prove that fds don't
scale on JVM or Linux (because they likely DO scale quite well).


I am not concerned about high open files as I do not see any errors related
to open files. Only side effect of open files going above 700 is the
response from tomcat is slow. I checked if this is caused from elastic
search, aws cloud watch shows elastic search response is within 5
milliseconds.

what might be the reason that when the open files goes beyond 600, it slows
down the response time for tomcat. I tried with tomcat 9 and it's the same
behavior


You might want to add some debug logging to your application when
getting ready to contact e.g. a database or remote service. Something like:


[timestamp] [thread-id] DEBUG Making call to X
[timestamp] [thread-id] DEBUG Completed call to X

or


Re: NGINX + tomcat 8.0.35 (110: Connection timed out)

2020-11-12 Thread Ayub Khan
Chris,

That's correct, it's just a plain static hello world page I created to
verify tomcat. It is served by tomcat. I have bundled this page in the same
context where the service is running. When I create load on the service and
then try to access the static hello world page browser keeps busy and does
not return the page.

I checked the database dashboard and the monitoring charts are normal, no
spikes on cpu or any other resources of the database. The delay is
noticeable when there are more than 1000 concurrent requests from each of 4
different JMeter test instances

Why does tomcat not even serve the html page




On Thu, Nov 12, 2020 at 7:01 PM Christopher Schultz <
ch...@christopherschultz.net> wrote:

> Ayub,
>
> On 11/12/20 10:47, Ayub Khan wrote:
> > Chris,
> >
> > I am using hikaricp connection pooling and the maximum pool size is set
> to
> > 100, without specifying minimum idle connections. Even during high load I
> > see there are more than 80 connections in idle state.
> >
> > I have setup debug statements to print the total time taken to complete
> the
> > request. The response time of completed call during load is around 5
> > seconds, the response time without load is around 400 to 500 milliseconds
>
> That's a significant difference. Is your database server showing high
> CPU usage or more I/O usage during those high-load times?
>
> > During the load I cannot even access static html page
>
> Now *that* is an interesting data point.
>
> You are sure that the "static" request doesn't hit any other resources?
> No filter is doing anything? No logging to an external service or
> double-checking any security constraints in the db before serving the page?
>
> (And the static page is being returned by Tomcat, not nginx, right?)
>
> -chris
>
> > On Thu, Nov 12, 2020 at 4:59 PM Christopher Schultz <
> > ch...@christopherschultz.net> wrote:
> >
> >> Ayub,
> >>
> >> On 11/11/20 16:16, Ayub Khan wrote:
> >>> I was load testing using the ec2 load balancer dns. I have increased
> the
> >>> connector timeout to 6000 and also gave 32gig to the JVM of tomcat. I
> am
> >>> not seeing connection timeout in nginx logs now. No errors in
> kernel.log
> >> I
> >>> am not seeing any errors in tomcat catalina.out.
> >>
> >> The timeouts are most likely related to the connection timeout (and
> >> therefore keepalive) setting. If you are proxying connections from nginx
> >> and they should be staying open, you should really never be experiencing
> >> a timeout between nginx and Tomcat.
> >>
> >>> During regular operations when the request count is between 4 to 6k
> >>> requests per minute the open files count for the tomcat process is
> >> between
> >>> 200 to 350. Responses from tomcat are within 5 seconds.
> >>
> >> Good.
> >>
> >>> If the requests count goes beyond 6.5 k open files slowly move up  to
> >> 2300
> >>> to 3000 and the request responses from tomcat become slow.
> >>
> >> This is pretty important, here. You are measuring two things:
> >>
> >> 1. Rise in file descriptor count
> >> 2. Application slowness
> >>
> >> You are assuming that #1 is causing #2. It's entirely possible that #2
> >> is causing #1.
> >>
> >> The real question is "why is the application slowing down". Do you see
> >> CPU spikes? If not, check your db connections.
> >>
> >> If your db connection pool is fully-utilized (no more available), then
> >> you may have lots of request processing threads sitting there waiting on
> >> db connections. You'd see a rise in incoming connections (waiting) which
> >> aren't making any progress, and the application seems to "slow down",
> >> and there is a snowball effect where more requests means more waiting,
> >> and therefore more slowness. This would manifest as slow response times
> >> without any CPU spike.
> >>
> >> You could also have a slow database and/or some other resource such as a
> >> downstream web service.
> >>
> >> I would investigate those options before trying to prove that fds don't
> >> scale on JVM or Linux (because they likely DO scale quite well).
> >>
> >>> I am not concerned about high open files as I do not see any errors
> >> related
> >>> to open files. Only side effect of  open files going above 700 is the
> >>> response from tomcat is slow. I checked if this is caused from elastic
> >>> search, aws cloud watch shows elastic search response is within 5
> >>> milliseconds.
> >>>
> >>> what might be the reason that when the open files goes beyond 600, it
> >> slows
> >>> down the response time for tomcat. I tried with tomcat 9 and it's the
> >> same
> >>> behavior
> >>
> >> You might want to add some debug logging to your application when
> >> getting ready to contact e.g. a database or remote service. Something
> like:
> >>
> >> [timestamp] [thread-id] DEBUG Making call to X
> >> [timestamp] [thread-id] DEBUG Completed call to X
> >>
> >> or
> >>
> >> [timestamp] [thread-id] DEBUG Call to X took [duration]ms
> >>
> >> Then have a look at all those logs when the appli

Re: NGINX + tomcat 8.0.35 (110: Connection timed out)

2020-11-12 Thread Christopher Schultz

Ayub,

On 11/12/20 10:47, Ayub Khan wrote:

Chris,

I am using hikaricp connection pooling and the maximum pool size is set to
100, without specifying minimum idle connections. Even during high load I
see there are more than 80 connections in idle state.

I have setup debug statements to print the total time taken to complete the
request. The response time of completed call during load is around 5
seconds, the response time without load is around 400 to 500 milliseconds


That's a significant difference. Is your database server showing high 
CPU usage or more I/O usage during those high-load times?
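
Another data point worth capturing while the load is running: a thread dump,
to see where the request threads are actually parked. A rough sketch (the
pid file path is borrowed from the commands used elsewhere in this thread,
so adjust it if yours differs):

    jstack $(cat /var/run/tomcat8.pid) | grep -c 'com.zaxxer.hikari'

If that count climbs toward maxThreads during the slow period, the threads
are waiting in the pool (or on the queries behind it); if it stays near
zero, the wait is somewhere else.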



During the load I cannot even access static html page


Now *that* is an interesting data point.

You are sure that the "static" request doesn't hit any other resources? 
No filter is doing anything? No logging to an external service or 
double-checking any security constraints in the db before serving the page?


(And the static page is being returned by Tomcat, not nginx, right?)

-chris


On Thu, Nov 12, 2020 at 4:59 PM Christopher Schultz <
ch...@christopherschultz.net> wrote:


Ayub,

On 11/11/20 16:16, Ayub Khan wrote:

I was load testing using the ec2 load balancer dns. I have increased the
connector timeout to 6000 and also gave 32gig to the JVM of tomcat. I am
not seeing connection timeout in nginx logs now. No errors in kernel.log I
am not seeing any errors in tomcat catalina.out.


The timeouts are most likely related to the connection timeout (and
therefore keepalive) setting. If you are proxying connections from nginx
and they should be staying open, you should really never be experiencing
a timeout between nginx and Tomcat.


During regular operations when the request count is between 4 to 6k
requests per minute the open files count for the tomcat process is between
200 to 350. Responses from tomcat are within 5 seconds.


Good.


If the requests count goes beyond 6.5 k open files slowly move up to 2300
to 3000 and the request responses from tomcat become slow.


This is pretty important, here. You are measuring two things:

1. Rise in file descriptor count
2. Application slowness

You are assuming that #1 is causing #2. It's entirely possible that #2
is causing #1.

The real question is "why is the application slowing down". Do you see
CPU spikes? If not, check your db connections.

If your db connection pool is fully-utilized (no more available), then
you may have lots of request processing threads sitting there waiting on
db connections. You'd see a rise in incoming connections (waiting) which
aren't making any progress, and the application seems to "slow down",
and there is a snowball effect where more requests means more waiting,
and therefore more slowness. This would manifest as slow response times
without any CPU spike.

You could also have a slow database and/or some other resource such as a
downstream web service.

I would investigate those options before trying to prove that fds don't
scale on JVM or Linux (because they likely DO scale quite well).


I am not concerned about high open files as I do not see any errors related
to open files. Only side effect of open files going above 700 is the
response from tomcat is slow. I checked if this is caused from elastic
search, aws cloud watch shows elastic search response is within 5
milliseconds.

what might be the reason that when the open files goes beyond 600, it slows
down the response time for tomcat. I tried with tomcat 9 and it's the same
behavior


You might want to add some debug logging to your application when
getting ready to contact e.g. a database or remote service. Something like:

[timestamp] [thread-id] DEBUG Making call to X
[timestamp] [thread-id] DEBUG Completed call to X

or

[timestamp] [thread-id] DEBUG Call to X took [duration]ms

Then have a look at all those logs when the applications slows down and
see if you can observe a significant jump in the time-to-complete those
operations.

Hope that helps,
-chris

-
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org






-
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org



Re: NGINX + tomcat 8.0.35 (110: Connection timed out)

2020-11-12 Thread Ayub Khan
Chris,

I am using hikaricp connection pooling and the maximum pool size is set to
100, without specifying minimum idle connections. Even during high load I
see there are more than 80 connections in idle state.

I have setup debug statements to print the total time taken to complete the
request. The response time of completed call during load is around 5
seconds, the response time without load is around 400 to 500 milliseconds

During the load I cannot even access static html page

Using Jmeter, I executed 1500 requests to AWS elastic load balancer which
had only one VM instance of nginx --> tomcat on the same VM and tomcat
consumed total memory of 30Gig and CPU was at 28% t

On Thu, Nov 12, 2020 at 6:47 PM Ayub Khan  wrote:

> Chris,
>
> I am using hikaricp connection pooling and the maximum pool size is set to
> 100, without specifying minimum idle connections. Even during high load I
> see there are more than 80 connections in idle state.
>
> I have setup debug statements to print the total time taken to complete
> the request. The response time of completed call during load is around 5
> seconds, the response time without load is around 400 to 500 milliseconds
>
> During the load I cannot even access static html page
>
>
>
>
>
>
> On Thu, Nov 12, 2020 at 4:59 PM Christopher Schultz <
> ch...@christopherschultz.net> wrote:
>
>> Ayub,
>>
>> On 11/11/20 16:16, Ayub Khan wrote:
>> > I was load testing using the ec2 load balancer dns. I have increased the
>> > connector timeout to 6000 and also gave 32gig to the JVM of tomcat. I am
>> > not seeing connection timeout in nginx logs now. No errors in
>> kernel.log I
>> > am not seeing any errors in tomcat catalina.out.
>>
>> The timeouts are most likely related to the connection timeout (and
>> therefore keepalive) setting. If you are proxying connections from nginx
>> and they should be staying open, you should really never be experiencing
>> a timeout between nginx and Tomcat.
>>
>> > During regular operations when the request count is between 4 to 6k
>> > requests per minute the open files count for the tomcat process is
>> between
>> > 200 to 350. Responses from tomcat are within 5 seconds.
>>
>> Good.
>>
>> > If the requests count goes beyond 6.5 k open files slowly move up  to
>> 2300
>> > to 3000 and the request responses from tomcat become slow.
>>
>> This is pretty important, here. You are measuring two things:
>>
>> 1. Rise in file descriptor count
>> 2. Application slowness
>>
>> You are assuming that #1 is causing #2. It's entirely possible that #2
>> is causing #1.
>>
>> The real question is "why is the application slowing down". Do you see
>> CPU spikes? If not, check your db connections.
>>
>> If your db connection pool is fully-utilized (no more available), then
>> you may have lots of request processing threads sitting there waiting on
>> db connections. You'd see a rise in incoming connections (waiting) which
>> aren't making any progress, and the application seems to "slow down",
>> and there is a snowball effect where more requests means more waiting,
>> and therefore more slowness. This would manifest as slow response times
>> without any CPU spike.
>>
>> You could also have a slow database and/or some other resource such as a
>> downstream web service.
>>
>> I would investigate those options before trying to prove that fds don't
>> scale on JVM or Linux (because they likely DO scale quite well).
>>
>> > I am not concerned about high open files as I do not see any errors
>> related
>> > to open files. Only side effect of  open files going above 700 is the
>> > response from tomcat is slow. I checked if this is caused from elastic
>> > search, aws cloud watch shows elastic search response is within 5
>> > milliseconds.
>> >
>> > what might be the reason that when the open files goes beyond 600, it
>> slows
>> > down the response time for tomcat. I tried with tomcat 9 and it's the
>> same
>> > behavior
>>
>> You might want to add some debug logging to your application when
>> getting ready to contact e.g. a database or remote service. Something
>> like:
>>
>> [timestamp] [thread-id] DEBUG Making call to X
>> [timestamp] [thread-id] DEBUG Completed call to X
>>
>> or
>>
>> [timestamp] [thread-id] DEBUG Call to X took [duration]ms
>>
>> Then have a look at all those logs when the applications slows down and
>> see if you can observe a significant jump in the time-to-complete those
>> operations.
>>
>> Hope that helps,
>> -chris
>>
>> -
>> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
>> For additional commands, e-mail: users-h...@tomcat.apache.org
>>
>>
>
> --
> 
> Sun Certified Enterprise Architect 1.5
> Sun Certified Java Programmer 1.4
> Microsoft Certified Systems Engineer 2000
> http://in.linkedin.com/pub/ayub-khan/a/811/b81
> mobile:+966-502674604
> --

Re: NGINX + tomcat 8.0.35 (110: Connection timed out)

2020-11-12 Thread Ayub Khan
Chris,

I am using hikaricp connection pooling and the maximum pool size is set to
100, without specifying minimum idle connections. Even during high load I
see there are more than 80 connections in idle state.

I have setup debug statements to print the total time taken to complete the
request. The response time of completed call during load is around 5
seconds, the response time without load is around 400 to 500 milliseconds

During the load I cannot even access static html page






On Thu, Nov 12, 2020 at 4:59 PM Christopher Schultz <
ch...@christopherschultz.net> wrote:

> Ayub,
>
> On 11/11/20 16:16, Ayub Khan wrote:
> > I was load testing using the ec2 load balancer dns. I have increased the
> > connector timeout to 6000 and also gave 32gig to the JVM of tomcat. I am
> > not seeing connection timeout in nginx logs now. No errors in kernel.log
> I
> > am not seeing any errors in tomcat catalina.out.
>
> The timeouts are most likely related to the connection timeout (and
> therefore keepalive) setting. If you are proxying connections from nginx
> and they should be staying open, you should really never be experiencing
> a timeout between nginx and Tomcat.
>
> > During regular operations when the request count is between 4 to 6k
> > requests per minute the open files count for the tomcat process is
> between
> > 200 to 350. Responses from tomcat are within 5 seconds.
>
> Good.
>
> > If the requests count goes beyond 6.5 k open files slowly move up  to
> 2300
> > to 3000 and the request responses from tomcat become slow.
>
> This is pretty important, here. You are measuring two things:
>
> 1. Rise in file descriptor count
> 2. Application slowness
>
> You are assuming that #1 is causing #2. It's entirely possible that #2
> is causing #1.
>
> The real question is "why is the application slowing down". Do you see
> CPU spikes? If not, check your db connections.
>
> If your db connection pool is fully-utilized (no more available), then
> you may have lots of request processing threads sitting there waiting on
> db connections. You'd see a rise in incoming connections (waiting) which
> aren't making any progress, and the application seems to "slow down",
> and there is a snowball effect where more requests means more waiting,
> and therefore more slowness. This would manifest as slow response times
> without any CPU spike.
>
> You could also have a slow database and/or some other resource such as a
> downstream web service.
>
> I would investigate those options before trying to prove that fds don't
> scale on JVM or Linux (because they likely DO scale quite well).
>
> > I am not concerned about high open files as I do not see any errors
> related
> > to open files. Only side effect of  open files going above 700 is the
> > response from tomcat is slow. I checked if this is caused from elastic
> > search, aws cloud watch shows elastic search response is within 5
> > milliseconds.
> >
> > what might be the reason that when the open files goes beyond 600, it
> slows
> > down the response time for tomcat. I tried with tomcat 9 and it's the
> same
> > behavior
>
> You might want to add some debug logging to your application when
> getting ready to contact e.g. a database or remote service. Something like:
>
> [timestamp] [thread-id] DEBUG Making call to X
> [timestamp] [thread-id] DEBUG Completed call to X
>
> or
>
> [timestamp] [thread-id] DEBUG Call to X took [duration]ms
>
> Then have a look at all those logs when the applications slows down and
> see if you can observe a significant jump in the time-to-complete those
> operations.
>
> Hope that helps,
> -chris
>
> -
> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> For additional commands, e-mail: users-h...@tomcat.apache.org
>
>

-- 

Sun Certified Enterprise Architect 1.5
Sun Certified Java Programmer 1.4
Microsoft Certified Systems Engineer 2000
http://in.linkedin.com/pub/ayub-khan/a/811/b81
mobile:+966-502674604
--
It is proved that Hard Work and knowledge will get you close but attitude
will get you there. However, it's the Love
of God that will put you over the top!!


Re: only for remote access

2020-11-12 Thread Christopher Schultz

Jürgen,

On 11/12/20 09:50, Jürgen Weber wrote:

Chris,

it is just authentication basic.

I definitely want authentication for remote access, but I had hoped I
could override this with a Valve for local access.

>

Anyway, I'll spare the two apps and do two Servlet mappings

/local
/remote

protect /remote with <security-constraint>
and check in the servlet code if Servlet Path == local && remote IP in
local network


You can definitely do that with the RemoteIPValve and/or RemoteIPFilter. 
No need to write any new code.



And I'll try to mod_rewrite /remote to /local if in local network.


That would work, but be aware of playing games with URL spaces. It can 
be a real pain in the neck to hit every case.


What's wrong with local users authenticating? I don't trust my network 
that much.


-chris


Am Do., 12. Nov. 2020 um 14:43 Uhr schrieb Christopher Schultz
:


Jürgen,

On 11/12/20 06:30, Jürgen Weber wrote:

I'd like to have web app security if accessed from outside the local network.

if (!local)
check <security-constraint>


Is this possible? with RemoteHostValve ?


You can simulate it, but you can't use <security-constraint> in web.xml
and also get a "local" carve-out for it.

What kind of <security-constraint> are you trying to remove?

Here are some options:

1. Review why you want to do this in the first place. What makes "local"
so special?

2. Deploy two instances of your application, one of which only allows
"local" access and does NOT have the  in web.xml.

3. Remove the <security-constraint> from web.xml completely, and use a
Filter/Valve to enforce your security policy.

-chris

-
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org



-
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org



-
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org



Re: only for remote access

2020-11-12 Thread Jürgen Weber
Chris,

it is just authentication basic.

I definitely want authentication for remote access, but I had hoped I
could override this with a Valve for local access.

Anyway, I'll spare the two apps and do two Servlet mappings

/local
/remote

protect /remote with <security-constraint>
and check in the servlet code if Servlet Path == local && remote IP in
local network

And I'll try to mod_rewrite /remote to /local if in local network.


Juergen

Am Do., 12. Nov. 2020 um 14:43 Uhr schrieb Christopher Schultz
:
>
> Jürgen,
>
> On 11/12/20 06:30, Jürgen Weber wrote:
> > I'd like to have web app security if accessed from outside the local 
> > network.
> >
> > if (!local)
> > check <security-constraint>
> >
> >
> > Is this possible? with RemoteHostValve ?
>
> You can simulate it, but you can't use <security-constraint> in web.xml
> and also get a "local" carve-out for it.
>
> What kind of <security-constraint> are you trying to remove?
>
> Here are some options:
>
> 1. Review why you want to do this in the first place. What makes "local"
> so special?
>
> 2. Deploy two instances of your application, one of which only allows
> "local" access and does NOT have the  in web.xml.
>
> 3. Remove the <security-constraint> from web.xml completely, and use a
> Filter/Valve to enforce your security policy.
>
> -chris
>
> -
> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> For additional commands, e-mail: users-h...@tomcat.apache.org
>

-
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org



RE: Weirdest Tomcat Behavior Ever?

2020-11-12 Thread Eric Robinson
> -Original Message-
> From: Mark Thomas 
> Sent: Thursday, November 12, 2020 4:08 AM
> To: Tomcat Users List ; Eric Robinson
> 
> Subject: Re: Weirdest Tomcat Behavior Ever?
>
> On 11/11/2020 22:48, Eric Robinson wrote:
> >> -Original Message-
> >> From: Mark Thomas 
> >> Sent: Monday, November 9, 2020 5:59 AM
> >> To: users@tomcat.apache.org
> >> Subject: Re: Weirdest Tomcat Behavior Ever?
> >>
> >> Eric,
> >>
> >> Time to prune the history and provide another summary I think. This
> >> summary isn't complete. There is more information in the history of
> >> the thread. I'm trying to focus on what seems to be the key information.
> >>
> >
> > Hi Mark -- So sorry for going silent for a couple of days. Our organization 
> > is
> neck-deep in a huge compliance project. Combine that with this issue we're
> working on together, and it's Perfect Storm time around here. We have a big
> meeting with the client and vendor tomorrow about all this and I'm working
> like heck to prevent this important customer from jumping ship.
>
> Understood. Let me know if there is anything I can do to help.
>
> > Now back to it!
> >
> >>
> >> Overview:
> >> A small number of requests are receiving a completely empty (no
> >> headers, no body) response.
> >>
> >
> > Just a FIN packet and that's all.
>
> Agreed.
>
> >> Environment
> >> Tomcat 7.0.72
> >>  - BIO HTTP (issue also observed with NIO)
> >>  - Source unknown (probably ASF)
> >> Java 1.8.0_221, Oracle
> >> CentOS 7.5, Azure
> >> Nginx reverse proxy
> >>  - Using HTTP/1.0
> >>  - No keep-alive
> >>  - No compression
> >> No (known) environment changes in the time period where this issue
> >> started
>
> I keep coming back to this. Something triggered this problem (note that
> trigger not necessarily the same as root cause). Given that the app, Tomcat
> and JVM versions didn't change that again points to some other component.
>

Perfectly understandable. It's the oldest question in the diagnostic playbook. 
What changed? I wish I had an answer. Whatever it was, it impacted both
upstream servers.

> Picking just one of the wild ideas I've had is there some sort of firewall, 
> IDS,
> IPS etc. that might be doing connection tracking and is, for some reason,
> getting it wrong and closing the connection in error?
>

There is no firewall or IDS software running on the upstreams. The only thing
that comes to mind that may have been installed during that timeframe is Sophos 
antivirus and Solar Winds RMM. Sophos was the first thing I disabled when I saw 
the packet issues.
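
If it would help to rule the network path in or out, a capture of just the
nginx -> Tomcat leg on one upstream would show at the packet level which end
closes first on an affected request (a sketch; port 8080 for the Tomcat HTTP
connector is an assumption):

    sudo tcpdump -i any -nn -s0 -w tomcat-leg.pcap 'tcp port 8080'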

> As an aside, I mentioned earlier in this thread a similar issue we have been
> observing in the CI system. I tracked that down yesterday and I am certain
> the issues are unrelated. The CI issue was NIO specific (we see this issue 
> with
> BIO and NIO) and introduced by refactoring in 8.5.x (we see this issue in
> 7.0.x). Sorry this doesn't help.
>
> >> Results from debug logging
> >> - The request is read without error
> >> - The connection close is initiated from the Tomcat/Java side
> >> - The socket is closed before Tomcat tries to write the response
> >> - The application is not triggering the close of the socket
> >> - Tomcat is not triggering the close of the socket
> >> - When Tomcat does try and write we see the following exception
> >> java.net.SocketException: Bad file descriptor (Write failed)
> >>
> >> We have confirmed that the Java process is not hitting the limit for
> >> file descriptors.
> >>
> >> The file descriptor must have been valid when the request was read
> >> from the socket.
> >>
> >> The first debug log shows 2 other active connections from Nginx to
> >> Tomcat at the point the connection is closed unexpectedly.
> >>
> >> The second debug log shows 1 other active connection from Nginx to
> >> Tomcat at the point the connection is closed unexpectedly.
> >>
> >> The third debug log shows 1 other active connection from Nginx to
> >> Tomcat at the point the connection is closed unexpectedly.
> >>
> >> The fourth debug log shows no other active connection from Nginx to
> >> Tomcat at the point the connection is closed unexpectedly.
> >>
> >>
> >> Analysis
> >>
> >> We know the connection close isn't coming from Tomcat or the
> application.
> >> That leaves:
> >> - the JVM
> >> - the OS
> >> - the virtualisation layer (since this is Azure I am assuming there is
> >>   one)
> >>
> >> We are approaching the limit of what we can debug via Tomcat (and my
> >> area of expertise). The evidence so far is pointing to an issue lower
> >> down the network stack (JVM, OS or virtualisation layer).
> >>
> >
> > Can't disagree with you there.
> >
> >> I think the next, and possibly last, thing we can do from Tomcat is
> >> log some information on the file descriptor associated with the
> >> socket. That is going to require some reflection to read JVM internals.
> >>
> >> Patch files here:
> >> http://home.apache.org/~markt/dev/v7.0.72-custom-patch-v4/
> >>
> >> Source code her

Re: Something I still don't quite understand, Re: Let's Encrypt with Tomcat behind httpd

2020-11-12 Thread Christopher Schultz

James,

On 11/5/20 12:07, James H. H. Lampert wrote:

I'm intrigued by Mr. Schultz's suggestion of


Maybe you just want RedirectPermanent instead of
Rewrite(Cond|Rule)?


Would that make a difference? Or is it just a matter of altering the 
RewriteCond clause to specifically ignore anything that looks like a 
Let's Encrypt challenge? Or is there something I can put on the default 
landing page for the subdomain, rather than in the VirtualHost, to cause 
the redirection?


I'm just thinking that Redirect[*] is a simpler configuration than 
Rewrite(Cond|Rule).
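
For example, the simpler form would be something along these lines (a
sketch; the hostname is a placeholder, not taken from your actual config):

    RedirectPermanent / https://app.example.com/

with the same caveat as the RewriteRule: the /.well-known/acme-challenge/
path still has to be left un-redirected so the HTTP-01 challenge can be
answered locally.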


As I recall (unless there's a way to force-expire the cached challenge 
result on a certbot call), I have to wait until December to run another 
test.


You can delete all your stuff, but LE will get upset if you make 
requests too frequently. There is a way to ask LE to let you "test" 
stuff and they will lower the frequency limits. I have forgotten how to 
do that, but it might be a good idea to look into it since you really 
are testing things at this point.


-chris

-
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org



Re: NGINX + tomcat 8.0.35 (110: Connection timed out)

2020-11-12 Thread Christopher Schultz

Ayub,

On 11/11/20 16:16, Ayub Khan wrote:

I was load testing using the ec2 load balancer dns. I have increased the
connector timeout to 6000 and also gave 32gig to the JVM of tomcat. I am
not seeing connection timeout in nginx logs now. No errors in kernel.log I
am not seeing any errors in tomcat catalina.out.


The timeouts are most likely related to the connection timeout (and 
therefore keepalive) setting. If you are proxying connections from nginx 
and they should be staying open, you should really never be experiencing 
a timeout between nginx and Tomcat.



During regular operations when the request count is between 4 to 6k
requests per minute the open files count for the tomcat process is between
200 to 350. Responses from tomcat are within 5 seconds.


Good.


If the requests count goes beyond 6.5 k open files slowly move up  to 2300
to 3000 and the request responses from tomcat become slow.


This is pretty important, here. You are measuring two things:

1. Rise in file descriptor count
2. Application slowness

You are assuming that #1 is causing #2. It's entirely possible that #2 
is causing #1.


The real question is "why is the application slowing down". Do you see 
CPU spikes? If not, check your db connections.


If your db connection pool is fully-utilized (no more available), then 
you may have lots of request processing threads sitting there waiting on 
db connections. You'd see a rise in incoming connections (waiting) which 
aren't making any progress, and the application seems to "slow down", 
and there is a snowball effect where more requests means more waiting, 
and therefore more slowness. This would manifest as slow response times 
without any CPU spike.


You could also have a slow database and/or some other resource such as a 
downstream web service.


I would investigate those options before trying to prove that fds don't 
scale on JVM or Linux (because they likely DO scale quite well).



I am not concerned about high open files as I do not see any errors related
to open files. Only side effect of  open files going above 700 is the
response from tomcat is slow. I checked if this is caused from elastic
search, aws cloud watch shows elastic search response is within 5
milliseconds.

what might be the reason that when the open files goes beyond 600, it slows
down the response time for tomcat. I tried with tomcat 9 and it's the same
behavior


You might want to add some debug logging to your application when 
getting ready to contact e.g. a database or remote service. Something like:


[timestamp] [thread-id] DEBUG Making call to X
[timestamp] [thread-id] DEBUG Completed call to X

or

[timestamp] [thread-id] DEBUG Call to X took [duration]ms

Then have a look at all those logs when the applications slows down and 
see if you can observe a significant jump in the time-to-complete those 
operations.
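
Once those lines are being written, something like this can pull out the
slowest calls during the load window (a sketch; it assumes the messages land
in catalina.out and that the duration is the last token on the line):

    grep 'Call to X took' catalina.out | awk '{print $NF}' | sort -n | tail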


Hope that helps,
-chris

-
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org



Re: NGINX + tomcat 8.0.35 (110: Connection timed out)

2020-11-12 Thread Ayub Khan
Mark,

The difference between after_start and after_load is the below sockets
which is just a sample from the repeated list, the ports are random. How to
know what these connections are related to ?

java5021 tomcat8 3162u IPv6  98361   0t0 TCP
localhost:http-alt->localhost:51746 (ESTABLISHED)
java5021 tomcat8 3163u IPv6  98362   0t0 TCP
localhost:http-alt->localhost:51748 (ESTABLISHED)
java5021 tomcat8 3164u IPv6  98363   0t0 TCP
localhost:http-alt->localhost:51750 (ESTABLISHED)
java5021 tomcat8 3165u IPv6  98364   0t0 TCP
localhost:http-alt->localhost:51752 (ESTABLISHED)
java5021 tomcat8 3166u IPv6  25334   0t0 TCP
localhost:http-alt->localhost:51754 (ESTABLISHED)
java5021 tomcat8 3167u IPv6  25335   0t0 TCP
localhost:http-alt->localhost:51756 (ESTABLISHED)
java5021 tomcat8 3168u IPv6  25336   0t0 TCP
localhost:http-alt->localhost:51758 (ESTABLISHED)
java5021 tomcat8 3169u IPv6  25337   0t0 TCP
localhost:http-alt->localhost:51760 (ESTABLISHED)
java5021 tomcat8 3170u IPv6  25338   0t0 TCP
localhost:http-alt->localhost:51762 (ESTABLISHED)
java5021 tomcat8 3171u IPv6  25339   0t0 TCP
localhost:http-alt->localhost:51764 (ESTABLISHED)
java5021 tomcat8 3172u IPv6  25340   0t0 TCP
localhost:http-alt->localhost:51766 (ESTABLISHED)
java5021 tomcat8 3173u IPv6  25341   0t0 TCP
localhost:http-alt->localhost:51768 (ESTABLISHED)
java5021 tomcat8 3174u IPv6  25342   0t0 TCP
localhost:http-alt->localhost:51770 (ESTABLISHED)
java5021 tomcat8 3175u IPv6  25343   0t0 TCP
localhost:http-alt->localhost:51772 (ESTABLISHED)
java5021 tomcat8 3176u IPv6  25344   0t0 TCP
localhost:http-alt->localhost:51774 (ESTABLISHED)
java5021 tomcat8 3177u IPv6  25345   0t0 TCP
localhost:http-alt->localhost:51776 (ESTABLISHED)
java5021 tomcat8 3178u IPv6  25346   0t0 TCP
localhost:http-alt->localhost:51778 (ESTABLISHED)
java5021 tomcat8 3179u IPv6  25347   0t0 TCP
localhost:http-alt->localhost:51780 (ESTABLISHED)
java5021 tomcat8 3180u IPv6  25348   0t0 TCP
localhost:http-alt->localhost:51782 (ESTABLISHED)
java5021 tomcat8 3181u IPv6  25349   0t0 TCP
localhost:http-alt->localhost:51784 (ESTABLISHED)
java5021 tomcat8 3182u IPv6  25350   0t0 TCP
localhost:http-alt->localhost:51786 (ESTABLISHED)
java5021 tomcat8 3183u IPv6  25351   0t0 TCP
localhost:http-alt->localhost:51788 (ESTABLISHED)
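
For context, http-alt is port 8080, so these are sockets on the Tomcat side
of connections coming from a local client, almost certainly the nginx proxy
running on the same VM. A sketch of one way to confirm who holds the other
end, using the first port from the listing above as an example:

    sudo ss -tnp '( sport = :51746 or dport = :51746 )'

If the peer turns out to be an nginx worker, the FD growth under load is
simply nginx opening more upstream connections to Tomcat than it keeps open
when idle.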

On Thu, Nov 12, 2020 at 4:05 PM Martin Grigorov 
wrote:

> On Thu, Nov 12, 2020 at 2:40 PM Ayub Khan  wrote:
>
> > Martin,
> >
> > Could you provide me a command which you want me to run and provide you
> the
> > results which might help you to debug this issue ?
> >
>
> 1) start your app and click around to load the usual FDs
> 2) lsof -p `cat /var/run/tomcat8.pid` > after_start.txt
> 3) load your app
> 4) lsof -p `cat /var/run/tomcat8.pid` > after_load.txt
>
> you can analyze the differences in the files yourself before sending them
> to us :-)
>
>
> >
> >
> > On Thu, Nov 12, 2020 at 1:36 PM Martin Grigorov 
> > wrote:
> >
> > > On Thu, Nov 12, 2020 at 10:37 AM Ayub Khan  wrote:
> > >
> > > > Martin,
> > > >
> > > > These are file descriptors, some are related to the jar files which
> are
> > > > included in the web application and some are related to the sockets
> > from
> > > > nginx to tomcat and some are related to database connections. I use
> the
> > > > below command to count the open file descriptors
> > > >
> > >
> > > which type of connections increase ?
> > > the sockets ? the DB ones ?
> > >
> > >
> > > >
> > > > watch "sudo ls /proc/`cat /var/run/tomcat8.pid`/fd/ | wc -l"
> > > >
> > >
> > > you can also use lsof command
> > >
> > >
> > > >
> > > >
> > > >
> > > > On Thu, Nov 12, 2020 at 10:56 AM Martin Grigorov <
> mgrigo...@apache.org
> > >
> > > > wrote:
> > > >
> > > > > On Wed, Nov 11, 2020 at 11:17 PM Ayub Khan 
> > wrote:
> > > > >
> > > > > > Chris,
> > > > > >
> > > > > > I was load testing using the ec2 load balancer dns. I have
> > increased
> > > > the
> > > > > > connector timeout to 6000 and also gave 32gig to the JVM of
> > tomcat. I
> > > > am
> > > > > > not seeing connection timeout in nginx logs now. No errors in
> > > > kernel.log
> > > > > I
> > > > > > am not seeing any errors in tomcat catalina.out.
> > > > > > During regular operations when the request count is between 4 to
> 6k
> > > > > > requests per minute the open files count for the tomcat process
> is
> > > > > between
> > > > > > 200 to 350. Responses from tom

Re: only for remote access

2020-11-12 Thread Christopher Schultz

Jürgen,

On 11/12/20 06:30, Jürgen Weber wrote:

I'd like to have web app security if accessed from outside the local network.

if (!local)
check <security-constraint>


Is this possible? with RemoteHostValve ?


You can simulate it, but you can't use <security-constraint> in web.xml 
and also get a "local" carve-out for it.


What kind of <security-constraint> are you trying to remove?

Here are some options:

1. Review why you want to do this in the first place. What makes "local" 
so special?


2. Deploy two instances of your application, one of which only allows 
"local" access and does NOT have the  in web.xml.


3. Remove the <security-constraint> from web.xml completely, and use a 
Filter/Valve to enforce your security policy.
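
For option 3, a minimal sketch of the Valve route would be something like
this in the application's META-INF/context.xml (the allow pattern is only an
example for a typical private network, not your actual addresses):

    <Context>
      <!-- reject any request whose client address does not match -->
      <Valve className="org.apache.catalina.valves.RemoteAddrValve"
             allow="127\.\d+\.\d+\.\d+|::1|192\.168\.\d+\.\d+" />
    </Context>

That rejects non-local clients outright, though; if remote users should be
asked to authenticate instead of being refused, that decision has to live in
a Filter (or a custom authenticator) rather than in a simple address valve.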


-chris

-
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org



Re: NGINX + tomcat 8.0.35 (110: Connection timed out)

2020-11-12 Thread Martin Grigorov
On Thu, Nov 12, 2020 at 2:40 PM Ayub Khan  wrote:

> Martin,
>
> Could you provide me a command which you want me to run and provide you the
> results which might help you to debug this issue ?
>

1) start your app and click around to load the usual FDs
2) lsof -p `cat /var/run/tomcat8.pid` > after_start.txt
3) load your app
4) lsof -p `cat /var/run/tomcat8.pid` > after_load.txt

you can analyze the differences in the files yourself before sending them
to us :-)
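
A rough way to summarize what changed between the two snapshots (a sketch
that assumes the file names from the steps above):

    # count the file descriptors of each TYPE (REG, IPv6, ...) in each snapshot
    awk '{print $5}' after_start.txt | sort | uniq -c | sort -rn
    awk '{print $5}' after_load.txt  | sort | uniq -c | sort -rn

    # for the TCP entries, how many are in each connection state
    grep ' TCP ' after_load.txt | grep -oE '\(.*\)' | sort | uniq -c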


>
>
> On Thu, Nov 12, 2020 at 1:36 PM Martin Grigorov 
> wrote:
>
> > On Thu, Nov 12, 2020 at 10:37 AM Ayub Khan  wrote:
> >
> > > Martin,
> > >
> > > These are file descriptors, some are related to the jar files which are
> > > included in the web application and some are related to the sockets
> from
> > > nginx to tomcat and some are related to database connections. I use the
> > > below command to count the open file descriptors
> > >
> >
> > which type of connections increase ?
> > the sockets ? the DB ones ?
> >
> >
> > >
> > > watch "sudo ls /proc/`cat /var/run/tomcat8.pid`/fd/ | wc -l"
> > >
> >
> > you can also use lsof command
> >
> >
> > >
> > >
> > >
> > > On Thu, Nov 12, 2020 at 10:56 AM Martin Grigorov  >
> > > wrote:
> > >
> > > > On Wed, Nov 11, 2020 at 11:17 PM Ayub Khan 
> wrote:
> > > >
> > > > > Chris,
> > > > >
> > > > > I was load testing using the ec2 load balancer dns. I have
> increased
> > > the
> > > > > connector timeout to 6000 and also gave 32gig to the JVM of
> tomcat. I
> > > am
> > > > > not seeing connection timeout in nginx logs now. No errors in
> > > kernel.log
> > > > I
> > > > > am not seeing any errors in tomcat catalina.out.
> > > > > During regular operations when the request count is between 4 to 6k
> > > > > requests per minute the open files count for the tomcat process is
> > > > between
> > > > > 200 to 350. Responses from tomcat are within 5 seconds.
> > > > > If the requests count goes beyond 6.5 k open files slowly move up
> to
> > > > 2300
> > > > > to 3000 and the request responses from tomcat become slow.
> > > > >
> > > > > I am not concerned about high open files as I do not see any errors
> > > > related
> > > > > to open files. Only side effect of  open files going above 700 is
> the
> > > > > response from tomcat is slow. I checked if this is caused from
> > elastic
> > > > > search, aws cloud watch shows elastic search response is within 5
> > > > > milliseconds.
> > > > >
> > > > > what might be the reason that when the open files goes beyond 600,
> it
> > > > slows
> > > > > down the response time for tomcat. I tried with tomcat 9 and it's
> the
> > > > same
> > > > > behavior
> > > > >
> > > >
> > > > Do you know what kind of files are being opened ?
> > > >
> > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > On Tue, Nov 3, 2020 at 9:40 PM Christopher Schultz <
> > > > > ch...@christopherschultz.net> wrote:
> > > > >
> > > > > > Ayub,
> > > > > >
> > > > > > On 11/3/20 10:56, Ayub Khan wrote:
> > > > > > > *I'm curious about why you are using all of cloudflare and ALB
> > and
> > > > > > > nginx.Seems like any one of those could provide what you are
> > > getting
> > > > > from
> > > > > > > all3 of them. *
> > > > > > >
> > > > > > > Cloudflare is doing just the DNS and nginx is doing ssl
> > termination
> > > > > >
> > > > > > What do you mean "Cloudflare is doing just the DNS?"
> > > > > >
> > > > > > So what is ALB doing, then?
> > > > > >
> > > > > > > *What is the maximum number of simultaneous requests that one
> > > > > > nginxinstance
> > > > > > > will accept? What is the maximum number of simultaneous
> > > > proxiedrequests
> > > > > > one
> > > > > > > nginx instance will make to a back-end Tomcat node? Howmany
> nginx
> > > > nodes
> > > > > > do
> > > > > > > you have? How many Tomcat nodes?  *
> > > > > > >
> > > > > > > We have 4 vms each having nginx and tomcat running on them and
> > each
> > > > > > tomcat
> > > > > > > has nginx in front of them to proxy the requests. So it's one
> > Nginx
> > > > > > > proxying to a dedicated tomcat on the same VM.
> > > > > >
> > > > > > Okay.
> > > > > >
> > > > > > > below is the tomcat connector configuration
> > > > > > >
> > > > > > >  > > > > > > connectionTimeout="6" maxThreads="2000"
> > > > > > >
> > >  protocol="org.apache.coyote.http11.Http11NioProtocol"
> > > > > > > URIEncoding="UTF-8"
> > > > > > > redirectPort="8443" />
> > > > > >
> > > > > > 60 seconds is a *long* time for a connection timeout.
> > > > > >
> > > > > > Do you actually need 2000 threads? That's a lot, though not
> insane.
> > > > 2000
> > > > > > threads means you expect to handle 2000 concurrent (non-async,
> > > > > > non-Wewbsocket) requests. Do you need that (per node)? Are you
> > > > expecting
> > > > > > 8000 concurrent requests? Does your load-balancer understand the
> > > > > > topography and current-load on any given node?
> > > > > >
> > > > > > > When I am doi

Re: NGINX + tomcat 8.0.35 (110: Connection timed out)

2020-11-12 Thread Ayub Khan
Martin,

Could you provide me a command which you want me to run and provide you the
results which might help you to debug this issue ?


On Thu, Nov 12, 2020 at 1:36 PM Martin Grigorov 
wrote:

> On Thu, Nov 12, 2020 at 10:37 AM Ayub Khan  wrote:
>
> > Martin,
> >
> > These are file descriptors, some are related to the jar files which are
> > included in the web application and some are related to the sockets from
> > nginx to tomcat and some are related to database connections. I use the
> > below command to count the open file descriptors
> >
>
> which type of connections increase ?
> the sockets ? the DB ones ?
>
>
> >
> > watch "sudo ls /proc/`cat /var/run/tomcat8.pid`/fd/ | wc -l"
> >
>
> you can also use lsof command
>
>
> >
> >
> >
> > On Thu, Nov 12, 2020 at 10:56 AM Martin Grigorov 
> > wrote:
> >
> > > On Wed, Nov 11, 2020 at 11:17 PM Ayub Khan  wrote:
> > >
> > > > Chris,
> > > >
> > > > I was load testing using the ec2 load balancer dns. I have increased
> > the
> > > > connector timeout to 6000 and also gave 32gig to the JVM of tomcat. I
> > am
> > > > not seeing connection timeout in nginx logs now. No errors in
> > kernel.log
> > > I
> > > > am not seeing any errors in tomcat catalina.out.
> > > > During regular operations when the request count is between 4 to 6k
> > > > requests per minute the open files count for the tomcat process is
> > > between
> > > > 200 to 350. Responses from tomcat are within 5 seconds.
> > > > If the requests count goes beyond 6.5 k open files slowly move up  to
> > > 2300
> > > > to 3000 and the request responses from tomcat become slow.
> > > >
> > > > I am not concerned about high open files as I do not see any errors
> > > related
> > > > to open files. Only side effect of  open files going above 700 is the
> > > > response from tomcat is slow. I checked if this is caused from
> elastic
> > > > search, aws cloud watch shows elastic search response is within 5
> > > > milliseconds.
> > > >
> > > > what might be the reason that when the open files goes beyond 600, it
> > > slows
> > > > down the response time for tomcat. I tried with tomcat 9 and it's the
> > > same
> > > > behavior
> > > >
> > >
> > > Do you know what kind of files are being opened ?
> > >
> > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > On Tue, Nov 3, 2020 at 9:40 PM Christopher Schultz <
> > > > ch...@christopherschultz.net> wrote:
> > > >
> > > > > Ayub,
> > > > >
> > > > > On 11/3/20 10:56, Ayub Khan wrote:
> > > > > > *I'm curious about why you are using all of cloudflare and ALB
> and
> > > > > > nginx.Seems like any one of those could provide what you are
> > getting
> > > > from
> > > > > > all3 of them. *
> > > > > >
> > > > > > Cloudflare is doing just the DNS and nginx is doing ssl
> termination
> > > > >
> > > > > What do you mean "Cloudflare is doing just the DNS?"
> > > > >
> > > > > So what is ALB doing, then?
> > > > >
> > > > > > *What is the maximum number of simultaneous requests that one
> > > > > nginxinstance
> > > > > > will accept? What is the maximum number of simultaneous
> > > proxiedrequests
> > > > > one
> > > > > > nginx instance will make to a back-end Tomcat node? Howmany nginx
> > > nodes
> > > > > do
> > > > > > you have? How many Tomcat nodes?  *
> > > > > >
> > > > > > We have 4 vms each having nginx and tomcat running on them and
> each
> > > > > tomcat
> > > > > > has nginx in front of them to proxy the requests. So it's one
> Nginx
> > > > > > proxying to a dedicated tomcat on the same VM.
> > > > >
> > > > > Okay.
> > > > >
> > > > > > below is the tomcat connector configuration
> > > > > >
> > > > > >  > > > > > connectionTimeout="6" maxThreads="2000"
> > > > > >
> >  protocol="org.apache.coyote.http11.Http11NioProtocol"
> > > > > > URIEncoding="UTF-8"
> > > > > > redirectPort="8443" />
> > > > >
> > > > > 60 seconds is a *long* time for a connection timeout.
> > > > >
> > > > > Do you actually need 2000 threads? That's a lot, though not insane.
> > > 2000
> > > > > threads means you expect to handle 2000 concurrent (non-async,
> > > > > non-Wewbsocket) requests. Do you need that (per node)? Are you
> > > expecting
> > > > > 8000 concurrent requests? Does your load-balancer understand the
> > > > > topography and current-load on any given node?
> > > > >
> > > > > > When I am doing a load test of 2000 concurrent users I see the
> open
> > > > files
> > > > > > increase to 10,320 and when I take thread dump I see the threads
> > are
> > > > in a
> > > > > > waiting state.Slowly as the requests are completed I see the open
> > > files
> > > > > > come down to normal levels.
> > > > >
> > > > > Are you performing your load-test against the CF/ALB/nginx/Tomcat
> > > stack,
> > > > > or just hitting Tomcat (or nginx) directly?
> > > > >
> > > > > Are you using HTTP keepalive in your load-test (from the client to
> > > > > whichever server is being contacted)?
> > > > >
> > > > > > The output o

only for remote access

2020-11-12 Thread Jürgen Weber
Hi,

I'd like to have web app security if accessed from outside the local network.

if (!local)
   check <security-constraint>


Is this possible? with RemoteHostValve ?

Thx,
Juergen

-
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org



Re: NGINX + tomcat 8.0.35 (110: Connection timed out)

2020-11-12 Thread Martin Grigorov
On Thu, Nov 12, 2020 at 10:37 AM Ayub Khan  wrote:

> Martin,
>
> These are file descriptors, some are related to the jar files which are
> included in the web application and some are related to the sockets from
> nginx to tomcat and some are related to database connections. I use the
> below command to count the open file descriptors
>

which type of connections increase ?
the sockets ? the DB ones ?


>
> watch "sudo ls /proc/`cat /var/run/tomcat8.pid`/fd/ | wc -l"
>

you can also use lsof command


>
>
>
> On Thu, Nov 12, 2020 at 10:56 AM Martin Grigorov 
> wrote:
>
> > On Wed, Nov 11, 2020 at 11:17 PM Ayub Khan  wrote:
> >
> > > Chris,
> > >
> > > I was load testing using the ec2 load balancer dns. I have increased
> the
> > > connector timeout to 6000 and also gave 32gig to the JVM of tomcat. I
> am
> > > not seeing connection timeout in nginx logs now. No errors in
> kernel.log
> > I
> > > am not seeing any errors in tomcat catalina.out.
> > > During regular operations when the request count is between 4 to 6k
> > > requests per minute the open files count for the tomcat process is
> > between
> > > 200 to 350. Responses from tomcat are within 5 seconds.
> > > If the requests count goes beyond 6.5 k open files slowly move up  to
> > 2300
> > > to 3000 and the request responses from tomcat become slow.
> > >
> > > I am not concerned about high open files as I do not see any errors
> > related
> > > to open files. Only side effect of  open files going above 700 is the
> > > response from tomcat is slow. I checked if this is caused from elastic
> > > search, aws cloud watch shows elastic search response is within 5
> > > milliseconds.
> > >
> > > what might be the reason that when the open files goes beyond 600, it
> > slows
> > > down the response time for tomcat. I tried with tomcat 9 and it's the
> > same
> > > behavior
> > >
> >
> > Do you know what kind of files are being opened ?
> >
> >
> > >
> > >
> > >
> > >
> > >
> > >
> > > On Tue, Nov 3, 2020 at 9:40 PM Christopher Schultz <
> > > ch...@christopherschultz.net> wrote:
> > >
> > > > Ayub,
> > > >
> > > > On 11/3/20 10:56, Ayub Khan wrote:
> > > > > *I'm curious about why you are using all of cloudflare and ALB and
> > > > > nginx.Seems like any one of those could provide what you are
> getting
> > > from
> > > > > all3 of them. *
> > > > >
> > > > > Cloudflare is doing just the DNS and nginx is doing ssl termination
> > > >
> > > > What do you mean "Cloudflare is doing just the DNS?"
> > > >
> > > > So what is ALB doing, then?
> > > >
> > > > > *What is the maximum number of simultaneous requests that one
> > > > nginxinstance
> > > > > will accept? What is the maximum number of simultaneous
> > proxiedrequests
> > > > one
> > > > > nginx instance will make to a back-end Tomcat node? Howmany nginx
> > nodes
> > > > do
> > > > > you have? How many Tomcat nodes?  *
> > > > >
> > > > > We have 4 vms each having nginx and tomcat running on them and each
> > > > tomcat
> > > > > has nginx in front of them to proxy the requests. So it's one Nginx
> > > > > proxying to a dedicated tomcat on the same VM.
> > > >
> > > > Okay.
> > > >
> > > > > below is the tomcat connector configuration
> > > > >
> > > > >  > > > > connectionTimeout="6" maxThreads="2000"
> > > > >
>  protocol="org.apache.coyote.http11.Http11NioProtocol"
> > > > > URIEncoding="UTF-8"
> > > > > redirectPort="8443" />
> > > >
> > > > 60 seconds is a *long* time for a connection timeout.
> > > >
> > > > Do you actually need 2000 threads? That's a lot, though not insane.
> > 2000
> > > > threads means you expect to handle 2000 concurrent (non-async,
> > > > non-Wewbsocket) requests. Do you need that (per node)? Are you
> > expecting
> > > > 8000 concurrent requests? Does your load-balancer understand the
> > > > topography and current-load on any given node?
> > > >
> > > > > When I am doing a load test of 2000 concurrent users I see the open
> > > files
> > > > > increase to 10,320 and when I take thread dump I see the threads
> are
> > > in a
> > > > > waiting state.Slowly as the requests are completed I see the open
> > files
> > > > > come down to normal levels.
> > > >
> > > > Are you performing your load-test against the CF/ALB/nginx/Tomcat
> > stack,
> > > > or just hitting Tomcat (or nginx) directly?
> > > >
> > > > Are you using HTTP keepalive in your load-test (from the client to
> > > > whichever server is being contacted)?
> > > >
> > > > > The output of the below command is
> > > > > sudo cat /proc/sys/kernel/pid_max
> > > > > 131072
> > > > >
> > > > > I am testing this on a c4.8xlarge VM in AWS.
> > > > >
> > > > > below is the config I changed in nginx.conf file
> > > > >
> > > > > events {
> > > > >  worker_connections 50000;
> > > > >  # multi_accept on;
> > > > > }
> > > >
> > > > This will allow 50k incoming connections, and Tomcat will accept an
> > > > unbounded number of connections (for NIO connect

Re: Weirdest Tomcat Behavior Ever?

2020-11-12 Thread Mark Thomas
On 11/11/2020 22:48, Eric Robinson wrote:
>> -Original Message-
>> From: Mark Thomas 
>> Sent: Monday, November 9, 2020 5:59 AM
>> To: users@tomcat.apache.org
>> Subject: Re: Weirdest Tomcat Behavior Ever?
>>
>> Eric,
>>
>> Time to prune the history and provide another summary I think. This
>> summary isn't complete. There is more information in the history of the
>> thread. I'm trying to focus on what seems to be the key information.
>>
> 
> Hi Mark -- So sorry for going silent for a couple of days. Our organization 
> is neck-deep in a huge compliance project. Combine that with this issue we're 
> working on together, and it's Perfect Storm time around here. We have a big 
> meeting with the client and vendor tomorrow about all this and I'm working 
> like heck to prevent this important customer from jumping ship.

Understood. Let me know if there is anything I can do to help.

> Now back to it!
> 
>>
>> Overview:
>> A small number of requests are receiving a completely empty (no headers,
>> no body) response.
>>
> 
> Just a FIN packet and that's all.

Agreed.

>> Environment
>> Tomcat 7.0.72
>>  - BIO HTTP (issue also observed with NIO)
>>  - Source unknown (probably ASF)
>> Java 1.8.0_221, Oracle
>> CentOS 7.5, Azure
>> Nginx reverse proxy
>>  - Using HTTP/1.0
>>  - No keep-alive
>>  - No compression
>> No (known) environment changes in the time period where this issue started

I keep coming back to this. Something triggered this problem (note that
trigger not necessarily the same as root cause). Given that the app,
Tomcat and JVM versions didn't change, that again points to some other
component.

Picking just one of the wild ideas I've had: is there some sort of
firewall, IDS, IPS etc. that might be doing connection tracking and is,
for some reason, getting it wrong and closing the connection in error?

As an aside, I mentioned earlier in this thread a similar issue we have
been observing in the CI system. I tracked that down yesterday and I am
certain the issues are unrelated. The CI issue was NIO specific (we see
this issue with BIO and NIO) and introduced by refactoring in 8.5.x (we
see this issue in 7.0.x). Sorry this doesn't help.

>> Results from debug logging
>> - The request is read without error
>> - The connection close is initiated from the Tomcat/Java side
>> - The socket is closed before Tomcat tries to write the response
>> - The application is not triggering the close of the socket
>> - Tomcat is not triggering the close of the socket
>> - When Tomcat does try and write we see the following exception
>> java.net.SocketException: Bad file descriptor (Write failed)
>>
>> We have confirmed that the Java process is not hitting the limit for file
>> descriptors.
>>
>> The file descriptor must have been valid when the request was read from
>> the socket.
>>
>> The first debug log shows 2 other active connections from Nginx to Tomcat at
>> the point the connection is closed unexpectedly.
>>
>> The second debug log shows 1 other active connection from Nginx to Tomcat
>> at the point the connection is closed unexpectedly.
>>
>> The third debug log shows 1 other active connection from Nginx to Tomcat at
>> the point the connection is closed unexpectedly.
>>
>> The fourth debug log shows no other active connection from Nginx to
>> Tomcat at the point the connection is closed unexpectedly.
>>
>>
>> Analysis
>>
>> We know the connection close isn't coming from Tomcat or the application.
>> That leaves:
>> - the JVM
>> - the OS
>> - the virtualisation layer (since this is Azure I am assuming there is
>>   one)
>>
>> We are approaching the limit of what we can debug via Tomcat (and my area
>> of expertise). The evidence so far is pointing to an issue lower down the
>> network stack (JVM, OS or virtualisation layer).
>>
> 
> Can't disagree with you there.
> 
>> I think the next, and possibly last, thing we can do from Tomcat is log some
>> information on the file descriptor associated with the socket. That is going 
>> to
>> require some reflection to read JVM internals.
>>
>> Patch files here:
>> http://home.apache.org/~markt/dev/v7.0.72-custom-patch-v4/
>>
>> Source code here:
>> https://github.com/markt-asf/tomcat/tree/debug-7.0.72
>>
> 
> I will apply these tonight.
>
>> The file descriptor usage count is guarded by a lock object so this patch 
>> adds
>> quite a few syncs. For the load you are seeing that shouldn't be an issue but
>> there is a chance it will impact performance.
>>
> 
> Based on observation of load, I'm not too concerned about that. Maybe a 
> little. I'll keep an eye on it.
> 
>> The aim with this logging is to provide evidence of whether or not there is a
>> file descriptor handling problem in the JRE. My expectation is that with 
>> these
>> logs we will have reached the limit of what we can do with Tomcat but will be
>> able to point you in the right direction for further investigation.
>>
> 
> I'll get this done right away.

Thanks.

Mark

--

Re: Timeout waiting to read data from client

2020-11-12 Thread Mark Thomas
On 11/11/2020 22:32, Jerry Malcolm wrote:
> On 11/9/2020 11:05 AM, Jerry Malcolm wrote:
>>
>> On 11/9/2020 3:10 AM, Mark Thomas wrote:
>>> On 08/11/2020 01:33, Jerry Malcolm wrote:
 On 11/7/2020 6:56 PM, Christopher Schultz wrote:
> Jerry,
>
> On 11/6/20 19:49, Jerry Malcolm wrote:
>> I have a relatively new environment with a standalone tomcat (8.5)
>> running on an AWS Linux2 EC2.  I'm not using HTTPD/AJP. It's a direct
>> connection to port 443.  (Well technically, I have firewallD in the
>> flow in order to route the protected port 80 to port 8080 and 443 to
>> 8443 for TC).
>>
>> I am doing some stress testing on the server and failing miserably.
>> I am sending around 130 ajax calls in rapid succession using HTTP/2.
>> These are all very simple small page (JSP) requests.  Not a lot of
>> processing required. The first ~75 requests process normally.  Then
>> everything hangs up.  In the tomcat logs I'm getting a bunch of
>> "Timeout waiting to read data from client" exceptions. And in the
>> stacktrace for these exceptions, they are all occurring when I'm
>> trying to access a parameter from the request.  Looking at the
>> request network timing in the browser console, I see a bunch of the
>> requests returning in typical time of a few milliseconds. Then
>> another large block of requests that all start returning around 4
>> seconds, then another block that wait until 8 seconds to return.
>> I've tried firefox and chrome with the same results.
>>
>> I've been using httpd in front of TC for years.  So this is the first
>> time I'm running TC standalone.  It is very likely I've got some
>> parameters set horribly wrong.  But I have no clue where to start.
>> This is not a tiny EC2, and my internet connection is not showing any
>> signs of problems.  So I really don't think this is a
>> performance-related problem.  The problem is very consistent and
>> reproducible with the same counts of success/failure calls. What
>> could be causing the "Timeout waiting to read data from client" after
>> 75 calls, and then cause blocks of calls to wait 4 seconds, 8
>> seconds, etc before responding?  I really need to handle more
>> simultaneous load that this is currently allowing.
>>
>> Thanks in advance for the education.
> Are you using HTTP Keepalives on your connections? Are you actually
> re-using those connections in your test? What is your keepalive
> timeout on your <Connector>? Actually, what is your whole <Connector>
> configuration?
>
> -chris
>
 Hi Chris, here are my two connector definitions from server.xml:

 <Connector port="8080"
            protocol="HTTP/1.1"
            connectionTimeout="20000"
            redirectPort="443" />

 <Connector port="8443"
            maxThreads="150"
            connectionTimeout="20000"
            SSLEnabled="true"
            scheme="https"
            secure="true"
            clientAuth="false"
            SSLCertificateFile="ssl/a.com/cert.pem"
            SSLCertificateChainFile="ssl/a.com/chain.pem"
            SSLCertificateKeyFile="ssl/a.com/privkey.pem">
     <UpgradeProtocol className="org.apache.coyote.http2.Http2Protocol" />
 </Connector>
>>> How are you stress testing this? All on a single HTTP/2 connection or
>>> multiple connections? With which tool?
>>>
>>> You might want to test HTTP/1.1 requests (with and without TLS) to see
>>> if the problem is specific to HTTP/2 or TLS as that should help narrow
>>> down the root cause.
>>>
>>> Mark
>>
>> Hi Mark, technically it's not a 'designed' stress test.  It's real
>> production code that just happens to stress the server more than
>> usual.  It's just a page that makes a bunch of ajax calls, and the
>> responses to each of those issue a second ajax call.
>>
>> If you don't see anything obvious in my configuration, we will
>> definitely pursue the http/1.1 options, etc.  I just wanted to
>> eliminate the chance of obvious 'pilot error' before digging deeper.
>>
>> Specifically, where is that error detected in the TC flow? In my logs
>> it fails on getting request parameters.  It sounds like the input
>> reader for the request is getting blocked.    But the first part of
>> the request is getting in since it does route to the appropriate JSP. 
>> Just seems strange that the http/2 or ssl layers would let half of the
>> request in and then block the rest of the request.  The browser
>> appears to be sending everything.  And it fails the same using firefox
>> or chrome.  Any ideas?
>>
>> Thx
>>
>>
> Update on this.  One of our clients got ERR_HTTP2_SERVER_REFUSED_STREAM
> after things locked up.  I removed the http2 'upgrade protocol' line
> from my connector, and everything works.  So it's apparently something
> wrong with my http2 setup.  Ideas? (See my connector config above in
> this thread).
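
For reference, stream refusals of this kind are normally governed by the limits
on the HTTP/2 UpgradeProtocol element. A minimal sketch of those knobs, with
illustrative values rather than anything taken from the configuration above:

    <UpgradeProtocol className="org.apache.coyote.http2.Http2Protocol"
                     maxConcurrentStreams="200"
                     maxConcurrentStreamExecution="200" />

If the client opens more streams on one connection than the server has
advertised (or is willing to execute), the extra streams are refused.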

Tomcat only issues that e

Re: NGINX + tomcat 8.0.35 (110: Connection timed out)

2020-11-12 Thread Ayub Khan
Martin,

These are file descriptors: some are related to the jar files included in the
web application, some to the sockets from nginx to tomcat, and some to database
connections. I use the command below to count the open file descriptors:

watch "sudo ls /proc/`cat /var/run/tomcat8.pid`/fd/ | wc -l"



On Thu, Nov 12, 2020 at 10:56 AM Martin Grigorov 
wrote:

> On Wed, Nov 11, 2020 at 11:17 PM Ayub Khan  wrote:
>
> > Chris,
> >
> > I was load testing using the ec2 load balancer dns. I have increased the
> > connector timeout to 6000 and also gave 32gig to the JVM of tomcat. I am
> > not seeing connection timeout in nginx logs now. No errors in kernel.log
> I
> > am not seeing any errors in tomcat catalina.out.
> > During regular operations when the request count is between 4 to 6k
> > requests per minute the open files count for the tomcat process is
> between
> > 200 to 350. Responses from tomcat are within 5 seconds.
> > If the requests count goes beyond 6.5 k open files slowly move up  to
> 2300
> > to 3000 and the request responses from tomcat become slow.
> >
> > I am not concerned about the high open files count as I do not see any
> > errors related to open files. The only side effect of open files going
> > above 700 is that the response from tomcat is slow. I checked whether this
> > is caused by elastic search; aws cloud watch shows the elastic search
> > response is within 5 milliseconds.
> >
> > What might be the reason that, when the open files go beyond 600, the
> > response time of tomcat slows down? I tried with tomcat 9 and it's the
> > same behavior.
> >
>
> Do you know what kind of files are being opened ?
>
>
> >
> >
> >
> >
> >
> >
> > On Tue, Nov 3, 2020 at 9:40 PM Christopher Schultz <
> > ch...@christopherschultz.net> wrote:
> >
> > > Ayub,
> > >
> > > On 11/3/20 10:56, Ayub Khan wrote:
> > > > *I'm curious about why you are using all of cloudflare and ALB and
> > > > nginx. Seems like any one of those could provide what you are getting
> > > > from all 3 of them.*
> > > >
> > > > Cloudflare is doing just the DNS and nginx is doing ssl termination
> > >
> > > What do you mean "Cloudflare is doing just the DNS?"
> > >
> > > So what is ALB doing, then?
> > >
> > > > *What is the maximum number of simultaneous requests that one nginx
> > > > instance will accept? What is the maximum number of simultaneous
> > > > proxied requests one nginx instance will make to a back-end Tomcat
> > > > node? How many nginx nodes do you have? How many Tomcat nodes?*
> > > >
> > > > We have 4 vms each having nginx and tomcat running on them and each
> > > tomcat
> > > > has nginx in front of them to proxy the requests. So it's one Nginx
> > > > proxying to a dedicated tomcat on the same VM.
> > >
> > > Okay.
> > >
> > > > below is the tomcat connector configuration
> > > >
> > > > <Connector connectionTimeout="60000" maxThreads="2000"
> > > >            protocol="org.apache.coyote.http11.Http11NioProtocol"
> > > >            URIEncoding="UTF-8"
> > > >            redirectPort="8443" />
> > >
> > > 60 seconds is a *long* time for a connection timeout.
> > >
> > > Do you actually need 2000 threads? That's a lot, though not insane.
> 2000
> > > threads means you expect to handle 2000 concurrent (non-async,
> > > non-WebSocket) requests. Do you need that (per node)? Are you
> expecting
> > > 8000 concurrent requests? Does your load-balancer understand the
> > > topography and current-load on any given node?
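
For comparison, a connector tuned along the lines being discussed might look
roughly like this (a sketch only; the values are illustrative, and note that
keepAliveTimeout falls back to connectionTimeout when it is not set
explicitly):

    <Connector port="8080" protocol="org.apache.coyote.http11.Http11NioProtocol"
               connectionTimeout="5000"
               keepAliveTimeout="1000"
               maxKeepAliveRequests="100"
               maxThreads="400"
               maxConnections="10000"
               URIEncoding="UTF-8"
               redirectPort="8443" />

The timeouts control how long idle sockets (and their file descriptors) stay
open, while maxThreads only bounds how many requests are processed at once.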
> > >
> > > > When I am doing a load test of 2000 concurrent users I see the open
> > files
> > > > increase to 10,320 and when I take thread dump I see the threads are
> > in a
> > > > waiting state. Slowly as the requests are completed I see the open
> files
> > > > come down to normal levels.
> > >
> > > Are you performing your load-test against the CF/ALB/nginx/Tomcat
> stack,
> > > or just hitting Tomcat (or nginx) directly?
> > >
> > > Are you using HTTP keepalive in your load-test (from the client to
> > > whichever server is being contacted)?
> > >
> > > > The output of the below command is
> > > > sudo cat /proc/sys/kernel/pid_max
> > > > 131072
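
It can also be worth checking the per-process open file limit for the Tomcat
process itself, rather than the system-wide pid limit, for example:

    grep -i 'open files' /proc/`cat /var/run/tomcat8.pid`/limits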
> > > >
> > > > I am testing this on a c4.8xlarge VM in AWS.
> > > >
> > > > below is the config I changed in nginx.conf file
> > > >
> > > > events {
> > > >  worker_connections 50000;
> > > >  # multi_accept on;
> > > > }
> > >
> > > This will allow 50k incoming connections, and Tomcat will accept an
> > > unbounded number of connections (for NIO connector). So limiting your
> > > threads to 2000 only means that the work of each request will be done
> in
> > > groups of 2000.
> > >
> > > > worker_rlimit_nofile 30000;
> > >
> > > I'm not sure how many connections are handled by a single nginx worker.
> > > If you accept 50k connections and only allow 30k file handles, you may
> > > have a problem if that's all being done by a single worker.
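
One way to keep the number of nginx-to-Tomcat sockets (and therefore open file
descriptors) down is to pool and reuse upstream connections instead of opening
a new one per request. A minimal sketch, with illustrative names and numbers:

    upstream tomcat {
        server 127.0.0.1:8080;
        keepalive 200;                       # idle connections kept open to Tomcat
    }

    server {
        location / {
            proxy_pass http://tomcat;
            proxy_http_version 1.1;          # required for upstream keepalive
            proxy_set_header Connection "";  # don't forward "Connection: close"
        }
    }

For this to help, the keepAliveTimeout on the Tomcat connector has to be long
enough that the pooled connections are not closed between requests.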
> > >
> > > > What would b