Re: Performance issue 8.5.20 (metaspace related?)

2017-09-01 Thread Suvendu Sekhar Mondal
> Profiling the code, I have been able to find the cause of the big metaspace 
> garbage…
> Due to a bug, we were not caching remote interfaces when connecting to jboss 
> from the web sites. The other client kinds were fine.
> Fixed this problem today; the system has been running fine for a few hours now. 
> It’s fast (faster than before, even compared with a fresh tomcat); metaspace 
> stays low and everything is fine.
>

That's very good news!




Re: Performance issue 8.5.20 (metaspace related?)

2017-08-31 Thread Ing. Andrea Vettori
> On 31 Aug 2017, at 17:55, Christopher Schultz  
> wrote:
> 
> 
> Andrea,
> 
> On 8/30/17 4:13 PM, Ing. Andrea Vettori wrote:
>>> On 30 Aug 2017, at 00:16, Christopher Schultz
>>>  wrote: RMI is known for flagrantly
>>> wasting permgen/metaspace because of all the Proxy objects and
>>> special throw-away Class objects it uses, etc. But I'm not sure
>>> why the metaspace filling up would have such a dramatic effect on
>>> performance. If you were stalling during a metaspace collection,
>>> it might make sense, but that's not what you are reporting.
>> 
>> That’s true; in fact, what I’d like to understand is whether a big
>> metaspace can affect performance… one thing I want to try is to run
>> the app with metaspace capped at 400/500MB. On Java 7 I had PermGen
>> at 512MB and, as far as I remember, it was working fine.
> 
> Yes, I think limiting the metaspace might help. It may result in more
> frequent, smaller GC cycles.
> 
> I actually don't know much about the (relatively) new metaspace
> region, but it's possible that GC in the metaspace is a "stop the
> world" operation where the JVM can't do anything else during that time.

I tried that this morning. I set one Tomcat with max metaspace set to 512MB and 
left the other one with 2GB. The one with 512MB was faster at calling JBoss on 
average, but it triggered the GC a lot more often, so some calls were very slow 
(the ones during which the GC ran).
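
For reference, the cap being discussed is the standard HotSpot flag already shown 
in the startup options elsewhere in this thread, just with a smaller value, e.g.:

    -XX:MetaspaceSize=256m -XX:MaxMetaspaceSize=512m

where MaxMetaspaceSize is the hard cap used for this 512MB test, and the 
MetaspaceSize value is only an illustrative choice for the initial high-water 
mark that triggers the first metaspace-induced GC.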

If you’ve seen my other reply, the performance problem has been fixed, but the 
curiosity to understand why a big metaspace is slow remains…

Thanks





Re: Performance issue 8.5.20 (metaspace related?)

2017-08-31 Thread Christopher Schultz

Andrea,

On 8/30/17 4:13 PM, Ing. Andrea Vettori wrote:
>> On 30 Aug 2017, at 00:16, Christopher Schultz
>>  wrote: RMI is known for flagrantly
>> wasting permgen/metaspace because of all the Proxy objects and
>> special throw-away Class objects it uses, etc. But I'm not sure
>> why the metaspace filling up would have such a dramatic effect on
>> performance. If you were stalling during a metaspace collection,
>> it might make sense, but that's not what you are reporting.
> 
> That’s true; in fact, what I’d like to understand is whether a big
> metaspace can affect performance… one thing I want to try is to run
> the app with metaspace capped at 400/500MB. On Java 7 I had PermGen
> at 512MB and, as far as I remember, it was working fine.

Yes, I think limiting the metaspace might help. It may result in more
frequent, smaller GC cycles.

I actually don't know much about the (relatively) new metaspace
region, but it's possible that GC in the metaspace is a "stop the
world" operation where the JVM can't do anything else during that time.

-chris




Re: Performance issue 8.5.20 (metaspace related?)

2017-08-31 Thread Ing. Andrea Vettori
> On 29 Aug 2017, at 14:24, Mark Thomas  wrote:
> 
> On 29/08/17 13:09, Ing. Andrea Vettori wrote:
>>> On 29 Aug 2017, at 12:29, Suvendu Sekhar Mondal  wrote:
>>> 
>>> On Tue, Aug 29, 2017 at 2:54 PM, Ing. Andrea Vettori
>>>  wrote:
 - with a freshly started tomcat instance, the time it takes is around 0,8 
 seconds. Most of the time is spent on the two RMI calls the task does.
 - with an instance that has been running for some time, the time can reach 
 2-3 seconds; occasionally 5-6 seconds. Most time is still spent on RMI calls, 
 i.e. what slows down are the RMI calls.
>>> 
>>> Silly question: do you mean the RMI calls originating from Tomcat are getting
>>> slower with time? Or is JBoss taking longer to return the response?
>> 
>> Thanks for your help.
>> What I see is that the HTTP requests are slower. Looking at the details of 
>> that specific request (putting a product in the cart) using System.nanoTime() 
>> I can see that most of the time is spent during the RMI calls. I’m pretty 
>> sure it’s not jboss that is slower, because other calls made at the same 
>> time from fresh tomcats or from a java client are not slow.
> 
> I'd try profiling the problematic code. Given the magnitude of the
> times, sampling should be sufficient which means you could do this on
> production when the problem happens. You'll probably need to configure
> it to log JVM internal calls.


Profiling the code, I have been able to find the cause of the big metaspace 
garbage…
Due to a bug, we were not caching remote interfaces when connecting to jboss 
from the web sites. The other client kinds were fine.
Fixed this problem today; the system has been running fine for a few hours now. 
It’s fast (faster than before, even compared with a fresh tomcat); metaspace 
stays low and everything is fine.
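
As an illustration of the kind of fix described above, here is a minimal sketch 
of caching a remote interface after the first JNDI lookup instead of looking it 
up on every request; the class name, JNDI name and generic signature are made up 
for this example and are not the actual application code:

    import java.util.concurrent.ConcurrentHashMap;
    import javax.naming.InitialContext;
    import javax.naming.NamingException;

    // Hypothetical sketch: keep one remote proxy per JNDI name instead of doing
    // a fresh lookup per request (which, per this thread, is what was filling
    // metaspace with proxy/stub classes).
    public final class RemoteInterfaceCache {

        private static final ConcurrentHashMap<String, Object> CACHE = new ConcurrentHashMap<>();

        @SuppressWarnings("unchecked")
        public static <T> T lookup(String jndiName) throws NamingException {
            Object proxy = CACHE.get(jndiName);
            if (proxy == null) {
                proxy = new InitialContext().lookup(jndiName); // the expensive part
                Object previous = CACHE.putIfAbsent(jndiName, proxy);
                if (previous != null) {
                    proxy = previous; // another thread won the race; reuse its proxy
                }
            }
            return (T) proxy;
        }
    }

Usage would be along the lines of CartService cart = 
RemoteInterfaceCache.lookup("CartServiceBean/remote"), where both names are 
hypothetical.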

This solves my problem. Thanks to everybody that helped with comments and 
suggestions.

Still, I’m very curious to know why a big metaspace causes such increasing 
slowness… looking at the GC logs it does not seem to be a GC problem… maybe 
metaspace handling by the JVM is not optimal when it’s big?






Re: Performance issue 8.5.20 (metaspace related?)

2017-08-31 Thread Ing. Andrea Vettori
> On 31 Aug 2017, at 15:24, Suvendu Sekhar Mondal  wrote:
> 
> I will suggest that if you have some budget, please get a decent APM
> like AppDynamics, New Relic, Dynatrace to monitor your prod system
> 24x7x365. Trust me, you will be able to identify and solve this type
> of sporadic slowness issue very quickly.
> 


Thank you, I’ve been using ‘perfino’ for a couple of days and it seems very good.





Re: Performance issue 8.5.20 (metaspace related?)

2017-08-31 Thread Suvendu Sekhar Mondal
Andrea,

>> Sometimes Full GC were able to clean up some space from Metaspace but
>> only as part of a final last ditch collection effort:
>>
>> 43618.504: [Full GC (Last ditch collection)  1386M->250M(20G), 1.6455823 
>> secs]
>>   [Eden: 0.0B(6408.0M)->0.0B(6408.0M) Survivors: 0.0B->0.0B Heap:
>> 1386.7M(20.0G)->250.5M(20.0G)], [Metaspace:
>> 1646471K->163253K(1843200K)]
>> [Times: user=2.23 sys=0.10, real=1.65 secs]
>> 49034.392: [Full GC (Last ditch collection)  1491M->347M(20G), 1.9965534 
>> secs]
>>   [Eden: 0.0B(5600.0M)->0.0B(5600.0M) Survivors: 0.0B->0.0B Heap:
>> 1491.1M(20.0G)->347.9M(20.0G)], [Metaspace:
>> 1660804K->156199K(2031616K)]
>> [Times: user=2.78 sys=0.10, real=2.00 secs]
>
> This is interesting because I see this pattern for metaspace gc…
>
> https://ibb.co/b2B9HQ
>
> there’s something that clears it, but it seems it’s not the full GC. I don’t 
> think anything happens in the app that would cause that much 
> metaspace to become instantly dead. So I think the full GC is not 
> doing the metaspace cleaning the way it could. Maybe it’s a light clean…
>

I believe MinMetaspaceFreeRatio and MaxMetaspaceFreeRatio are playing a
role there. Their default values in JDK 1.8 are 40 and 70
respectively. You can read more about them here:
https://docs.oracle.com/javase/8/docs/technotes/guides/vm/gctuning/considerations.html
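
For reference, those are ordinary HotSpot flags, so an experiment with them would 
look something like (the values shown are just the JDK 8 defaults from the linked 
page):

    -XX:MinMetaspaceFreeRatio=40 -XX:MaxMetaspaceFreeRatio=70

After a collection that touches metaspace, if less than MinMetaspaceFreeRatio 
percent of the committed metaspace is free the high-water mark is raised, and if 
more than MaxMetaspaceFreeRatio percent is free it is lowered; that high-water 
mark is what decides when the next metaspace-induced GC happens.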

>> That is the default value. Please see:
>> http://docs.oracle.com/javase/8/docs/technotes/guides/rmi/sunrmiproperties.html
>>  
>> 
>
> They must have changed it recently; it was 180k when I first checked that 
> option. Do you think it’s worth increasing it? I don’t see a full GC every 
> hour…
>

I have seen that if you use G1 GC, the JVM completely ignores those
RMI DGC parameter values. If you use Parallel GC, you will see
periodic Full GCs (System.gc()) being triggered based on those
parameter values. I have not had a chance to go in depth into this
behavior of G1. As G1 is newer and much more efficient than the other
collectors (I know this statement can draw some arguments), it is
probably cleaning those RMI objects more efficiently.

I will suggest that if you have some budget, please get a decent APM
like AppDynamics, New Relic, Dynatrace to monitor your prod system
24x7x365. Trust me, you will be able to identify and solve this type
of sporadic slowness issue very quickly.

Thanks!
Suvendu




Re: Performance issue 8.5.20 (metaspace related?)

2017-08-30 Thread Ing. Andrea Vettori

> On 30 Aug 2017, at 16:38, Suvendu Sekhar Mondal  wrote:
> 
> 
> In the HotSpot VM, a Full GC will collect "dead" objects/classes from all
> the regions (Young, Old, Metaspace/PermGen). In the provided GC log, I
> can see that you had triggered Full GCs but Metaspace did not shrink
> much. That clearly indicates you had a good number of classes that were
> "in use":
> 
> 48318.001: [Full GC (Heap Dump Initiated GC)  5232M->1298M(20G), 4.9784080 
> secs]
>   [Eden: 2800.0M(3136.0M)->0.0B(9616.0M) Survivors: 120.0M->0.0B
> Heap: 5232.7M(20.0G)->1298.9M(20.0G)], [Metaspace:
> 1419958K->1398648K(2621440K)]
> [Times: user=9.55 sys=0.09, real=4.98 secs]
> 48404.270: [Full GC (System.gc())  2825M->1301M(20G), 4.3863751 secs]
>   [Eden: 1528.0M(9616.0M)->0.0B(9616.0M) Survivors: 0.0B->0.0B Heap:
> 2825.2M(20.0G)->1301.6M(20.0G)], [Metaspace:
> 1425628K->1425623K(2627584K)]
> [Times: user=9.09 sys=0.15, real=4.38 secs]
> 
> Sometimes Full GC were able to clean up some space from Metaspace but
> only as part of a final last ditch collection effort:
> 
> 43618.504: [Full GC (Last ditch collection)  1386M->250M(20G), 1.6455823 secs]
>   [Eden: 0.0B(6408.0M)->0.0B(6408.0M) Survivors: 0.0B->0.0B Heap:
> 1386.7M(20.0G)->250.5M(20.0G)], [Metaspace:
> 1646471K->163253K(1843200K)]
> [Times: user=2.23 sys=0.10, real=1.65 secs]
> 49034.392: [Full GC (Last ditch collection)  1491M->347M(20G), 1.9965534 secs]
>   [Eden: 0.0B(5600.0M)->0.0B(5600.0M) Survivors: 0.0B->0.0B Heap:
> 1491.1M(20.0G)->347.9M(20.0G)], [Metaspace:
> 1660804K->156199K(2031616K)]
> [Times: user=2.78 sys=0.10, real=2.00 secs]

This is interesting because I see this pattern for metaspace gc… 

https://ibb.co/b2B9HQ

there’s something that clears it, but it seems it’s not the full GC. I don’t 
think anything happens in the app that would cause that much metaspace 
to become instantly dead. So I think the full GC is not doing the 
metaspace cleaning the way it could. Maybe it’s a light clean…


 Tomcat is running with these VM options:
 
 -Xms20480m -Xmx20480m -Dsun.rmi.dgc.client.gcInterval=360 
 -Dsun.rmi.dgc.server.gcInterval=360 -Xloggc:/tmp/gc.txt 
 -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=9012 
 -Dcom.sun.management.jmxremote.ssl=false 
 -Dcom.sun.management.jmxremote.authenticate=false -XX:+PrintGCDetails 
 -XX:+PrintGCTimeStamps -XX:+UseG1GC -XX:ReservedCodeCacheSize=1g 
 -XX:InitialCodeCacheSize=256m -XX:+UseHugeTLBFS -XX:MetaspaceSize=1g 
 -XX:MaxMetaspaceSize=2g
 
 Some of the options have been recently added (for example the increase in 
 code cache  size) but it seems they had no impact on the issue.
 
>>> 
>>> Is there any specific reason to trigger Full GC every
>>> 60min(-Dsun.rmi.dgc.client.gcInterval=360)? I think that's
>>> overkill.
>> 
>> That’s a value I saw suggested “everywhere” (the default is too 
>> short). Note that in the GC logs I don’t see full GCs every hour, so maybe 
>> that option is not triggering a full GC?
> 
> That is the default value. Please see:
> http://docs.oracle.com/javase/8/docs/technotes/guides/rmi/sunrmiproperties.html
>  
> 

They must have changed it recently; it was 180k when I first checked that 
option. Do you think it’s worth increasing it? I don’t see a full GC every 
hour…

Thanks



Re: Performance issue 8.5.20 (metaspace related?)

2017-08-30 Thread Ing. Andrea Vettori

> On 30 Aug 2017, at 01:13, Caldarale, Charles R  
> wrote:
> 
> Is it possible that the system is getting into swapping?  The heap has been
> set to 20 GiB, but I didn't see any mention of how much actual memory the
> system has.  Do you really need a 20 GiB heap?  Sometimes smaller is better.
> Might also want to try turning off UseHugeTLBFS.

No swap is used on the server; it has 32GB of RAM. I think the heap usage is at 
most 15-16GB… in the past I tried various heap sizes and I think this size 
works well.

> 
> I wonder if there are heap objects tied up due to dead but not collected
> metaspace items; when metaspace is GC'd, the heap usage might well shrink
> also.  (This might be a G1GC bug, but that seems unlikely.)  Perhaps the RMI
> mechanism (which I haven't looked at in many years) is searching an
> ever-growing set of soft/weak references that are removed when metaspace is
> collected.  A smaller heap might help there.

Thanks for your suggestions



Re: Performance issue 8.5.20 (metaspace related?)

2017-08-30 Thread Ing. Andrea Vettori

> On 29 Aug 2017, at 14:24, Mark Thomas  wrote:
> 
> On 29/08/17 13:09, Ing. Andrea Vettori wrote:
>>> On 29 Aug 2017, at 12:29, Suvendu Sekhar Mondal  wrote:
>>> 
>>> On Tue, Aug 29, 2017 at 2:54 PM, Ing. Andrea Vettori
>>>  wrote:
 - with a freshly started tomcat instance, the time it takes is around 0,8 
 seconds. Most of the time is spent on the two RMI calls the task does.
 - with an instance that has been running for some time, the time can reach 
 2-3 seconds; occasionally 5-6 seconds. Most time is still spent on RMI calls, 
 i.e. what slows down are the RMI calls.
>>> 
>>> Silly question: do you mean the RMI calls originating from Tomcat are getting
>>> slower with time? Or is JBoss taking longer to return the response?
>> 
>> Thanks for your help.
>> What I see is that the HTTP requests are slower. Looking at the details of 
>> that specific request (putting a product in the cart) using System.nanoTime() 
>> I can see that most of the time is spent during the RMI calls. I’m pretty 
>> sure it’s not jboss that is slower, because other calls made at the same 
>> time from fresh tomcats or from a java client are not slow.
> 
> I'd try profiling the problematic code. Given the magnitude of the
> times, sampling should be sufficient which means you could do this on
> production when the problem happens. You'll probably need to configure
> it to log JVM internal calls.
> 
> I use YourKit (because they give me a free license to use to develop
> Tomcat). Other profilers are available.
> 
> Mark


Thank you, I’m going to try this soon and report back what I find.



Re: Performance issue 8.5.20 (metaspace related?)

2017-08-30 Thread Ing. Andrea Vettori
> On 30 Aug 2017, at 00:16, Christopher Schultz  
> wrote:
> 
>>> 
>>> Another thing, is there any way you can measure the processing
>>> time in JBoss for each RMI calls?
>>> 
>> 
>> At the moment I’m not measuring the time on the jboss side but I’m
>> considering using a telemetry product in the future. I’m measuring
>> the time on the client side (tomcat or fat client).
> 
> I think it's critical to get metrics on the JBoss side, otherwise you
> don't even know where the problem might be. It's good information to
> have that the application running within Tomcat performs well
> immediately following a metaspace collection

This, and the fact that the other “clients” are not slow.

> , but it's not enough
> information to inform a good decision.
> 
> RMI is known for flagrantly wasting permgen/metaspace because of all
> the Proxy objects and special throw-away Class objects it uses, etc.
> But I'm not sure why the metaspace filling up would have such a
> dramatic effect on performance. If you were stalling during a
> metaspace collection, it might make sense, but that's not what you are
> reporting.

That’s true; in fact, what I’d like to understand is whether a big metaspace can 
affect performance… one thing I want to try is to run the app with metaspace 
capped at 400/500MB. On Java 7 I had PermGen at 512MB and, as far as I remember, 
it was working fine.
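
As a small aside on the “Proxy objects and special throw-away Class objects” 
remark quoted above, the snippet below (illustrative only, not the RMI internals) 
shows that a dynamic proxy is backed by a class generated at runtime, which is 
the kind of class data that ends up in metaspace:

    import java.lang.reflect.InvocationHandler;
    import java.lang.reflect.Proxy;

    public class ProxyDemo {
        public static void main(String[] args) {
            // A do-nothing handler; a real RMI/EJB proxy would forward the call over the wire.
            InvocationHandler handler = (proxy, method, methodArgs) -> null;
            Runnable r = (Runnable) Proxy.newProxyInstance(
                    ProxyDemo.class.getClassLoader(),
                    new Class<?>[] { Runnable.class },
                    handler);
            // Prints a generated class name such as com.sun.proxy.$Proxy0: a class
            // defined at runtime, whose metadata lives in metaspace.
            System.out.println(r.getClass().getName());
        }
    }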

> It's like saying that HTTP traffic slows down on your PC
> until you empty your Recycle Bin.
> 
> At any rate, this is not a problem with Tomcat itself: this is likely
> entirely JVM-related. I'm sure everyone is happy to discuss it, here
> (on a Tomcat-specific mailing list), but there aren't any
> Tomcat-related settings that you will be able to tweak that affect
> your environment in any way.


Thank you for your suggestions!



Re: Performance issue 8.5.20 (metaspace related?)

2017-08-30 Thread Suvendu Sekhar Mondal
Andrea,

On Tue, Aug 29, 2017 at 5:39 PM, Ing. Andrea Vettori
 wrote:
>> On 29 Aug 2017, at 12:29, Suvendu Sekhar Mondal  wrote:
>>
>> On Tue, Aug 29, 2017 at 2:54 PM, Ing. Andrea Vettori
>>  wrote:
>>> - with a freshly started tomcat instance, the time it takes is around 0,8 
>>> seconds. Most of the time is spent on the two RMI calls the task does.
>>> - with an instance that has been running for some time, the time can reach 
>>> 2-3 seconds; occasionally 5-6 seconds. Most time is still spent on RMI calls, 
>>> i.e. what slows down are the RMI calls.
>>
>> Silly question: do you mean the RMI calls originating from Tomcat are getting
>> slower with time? Or is JBoss taking longer to return the response?
>
> Thanks for your help.
> What I see is that the HTTP requests are slower. Looking at the details of 
> that specific request (putting a product in the cart) using System.nanoTime() 
> I can see that most of the time is spent during the RMI calls. I’m pretty sure 
> it’s not jboss that is slower, because other calls made at the same time from 
> fresh tomcats or from a java client are not slow.
>

I will still suggest monitoring JBoss closely because it is an important
part of your workflow. Please check its GC stats also.

>>
>>> - restarting the jvm fixes the issue
>>> - ***it seems*** (I’m still testing this, since there is no ‘metaspace GC 
>>> trigger’ command available) that when Metaspace is garbage 
>>> collected, tomcat then performs like a fresh instance.
>>>
>>
>> Yes, no command to only garbage collect Metaspace area. Without
>> recycling Tomcat, you can trigger a Full GC either using VisualVM or
>> jcmd's GC.run command.
>
I think that an explicit GC is not doing a metaspace GC… that’s what I observed 
when triggering the GC from the java console.
>

In the HotSpot VM, a Full GC will collect "dead" objects/classes from all
the regions (Young, Old, Metaspace/PermGen). In the provided GC log, I
can see that you had triggered Full GCs but Metaspace did not shrink
much. That clearly indicates you had a good number of classes that were
"in use":

48318.001: [Full GC (Heap Dump Initiated GC)  5232M->1298M(20G), 4.9784080 secs]
   [Eden: 2800.0M(3136.0M)->0.0B(9616.0M) Survivors: 120.0M->0.0B
Heap: 5232.7M(20.0G)->1298.9M(20.0G)], [Metaspace:
1419958K->1398648K(2621440K)]
 [Times: user=9.55 sys=0.09, real=4.98 secs]
48404.270: [Full GC (System.gc())  2825M->1301M(20G), 4.3863751 secs]
   [Eden: 1528.0M(9616.0M)->0.0B(9616.0M) Survivors: 0.0B->0.0B Heap:
2825.2M(20.0G)->1301.6M(20.0G)], [Metaspace:
1425628K->1425623K(2627584K)]
 [Times: user=9.09 sys=0.15, real=4.38 secs]

Sometimes Full GC were able to clean up some space from Metaspace but
only as part of a final last ditch collection effort:

43618.504: [Full GC (Last ditch collection)  1386M->250M(20G), 1.6455823 secs]
   [Eden: 0.0B(6408.0M)->0.0B(6408.0M) Survivors: 0.0B->0.0B Heap:
1386.7M(20.0G)->250.5M(20.0G)], [Metaspace:
1646471K->163253K(1843200K)]
 [Times: user=2.23 sys=0.10, real=1.65 secs]
49034.392: [Full GC (Last ditch collection)  1491M->347M(20G), 1.9965534 secs]
   [Eden: 0.0B(5600.0M)->0.0B(5600.0M) Survivors: 0.0B->0.0B Heap:
1491.1M(20.0G)->347.9M(20.0G)], [Metaspace:
1660804K->156199K(2031616K)]
 [Times: user=2.78 sys=0.10, real=2.00 secs]

>>
>>> Since we’re using more than one tomcat instance (2 in production for this 
>>> website, 1 for development) I can see that the issue is isolated to Tomcat 
>>> or the JVM/Host where it runs because other Tomcat instances behave well at 
>>> the same time one is slow. The same JBoss/Postgres backend is used by java 
>>> batches and fat clients and it does work well with consistent times.
>>>
>>> To clarify: the moment one production tomcat that has been running for some 
>>> time finishes the task in 3 seconds, the development tomcat or a freshly 
>>> started production tomcat instance does the same task in less than one second. 
>>> Note that repeating the task always gives consistent results, i.e. the instance 
>>> that has been running for some time is always slow, the freshly started 
>>> instance is always fast.
>>>
>>> Tomcat is running with these VM options:
>>>
>>> -Xms20480m -Xmx20480m -Dsun.rmi.dgc.client.gcInterval=360 
>>> -Dsun.rmi.dgc.server.gcInterval=360 -Xloggc:/tmp/gc.txt 
>>> -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=9012 
>>> -Dcom.sun.management.jmxremote.ssl=false 
>>> -Dcom.sun.management.jmxremote.authenticate=false -XX:+PrintGCDetails 
>>> -XX:+PrintGCTimeStamps -XX:+UseG1GC -XX:ReservedCodeCacheSize=1g 
>>> -XX:InitialCodeCacheSize=256m -XX:+UseHugeTLBFS -XX:MetaspaceSize=1g 
>>> -XX:MaxMetaspaceSize=2g
>>>
>>> Some of the options have been recently added (for example the increase in 
>>> code cache  size) but it seems they had no impact on the issue.
>>>
>>
>> Is there any specific reason to trigger Full GC every
>> 60min(-Dsun.rmi.dgc.client.gcInterval=360)? I think that's
>> overkill.
>
> That’s a value I saw suggested “everywhere” (the default is too short)…

RE: Performance issue 8.5.20 (metaspace related?)

2017-08-29 Thread Caldarale, Charles R
> From: Christopher Schultz [mailto:ch...@christopherschultz.net] 
> Subject: Re: Performance issue 8.5.20 (metaspace related?)
 
> > > That sounds like a healthy behavior to me. That means GC is
> > > occurring and Metaspace is getting cleaned up. Why do you think
> > > Metaspace is the problem?

> > Because I can observe that when the metaspace is collected the
> > requests become fast. I observed that a few hours ago, looking at
> > the metaspace graph of the java console and doing requests just
> > after the collect.

> RMI is known for flagrantly wasting permgen/metaspace because of all
> the Proxy objects and special throw-away Class objects it uses, etc.
> But I'm not sure why the metaspace filling up would have such a
> dramatic effect on performance.

> At any rate, this is not a problem with Tomcat itself: this is likely
> entirely JVM-related.

Is it possible that the system is getting into swapping?  The heap has been
set to 20 GiB, but I didn't see any mention of how much actual memory the
system has.  Do you really need a 20 GiB heap?  Sometimes smaller is better.
Might also want to try turning off UseHugeTLBFS.

I wonder if there are heap objects tied up due to dead but not collected
metaspace items; when metaspace is GC'd, the heap usage might well shrink
also.  (This might be a G1GC bug, but that seems unlikely.)  Perhaps the RMI
mechanism (which I haven't looked at in many years) is searching an
ever-growing set of soft/weak references that are removed when metaspace is
collected.  A smaller heap might help there.

 - Chuck







Re: Performance issue 8.5.20 (metaspace related?)

2017-08-29 Thread Christopher Schultz

Andrea,

On 8/29/17 8:09 AM, Ing. Andrea Vettori wrote:
>> On 29 Aug 2017, at 12:29, Suvendu Sekhar Mondal
>>  wrote:
>> 
>> On Tue, Aug 29, 2017 at 2:54 PM, Ing. Andrea Vettori 
>>  wrote:
>>> - with a freshly started tomcat instance, the time it takes is
>>> around 0,8 seconds. Most of the time is spent on the two RMI
>>> calls the task does. - with an instance that has been running for
>>> some time, the time can reach 2-3 seconds; occasionally 5-6
>>> seconds. Most time is still spent on RMI calls, i.e. what slows
>>> down are the RMI calls.
>> 
>> Silly question: do you mean the RMI calls originating from Tomcat are
>> getting slower with time? Or is JBoss taking longer to return the
>> response?
> 
> Thanks for your help. What I see is that the HTTP requests are
> slower. Looking at the details of that specific request (putting a
> product in the cart) using System.nanoTime() I can see that most of
> the time is spent during the RMI calls. I’m pretty sure it’s not
> jboss that is slower, because other calls made at the same time
> from fresh tomcats or from a java client are not slow.
> 
>> 
>>> - restarting the jvm fixes the issue - ***it seems*** (I’m
>>> still testing this, since there is no ‘metaspace GC
>>> trigger’ command available) that when Metaspace is garbage
>>> collected, tomcat then performs like a fresh instance.
>>> 
>> 
>> Yes, no command to only garbage collect Metaspace area. Without 
>> recycling Tomcat, you can trigger a Full GC either using VisualVM
>> or jcmd's GC.run command.
> 
> I think that an explicit GC is not doing a metaspace GC… that’s what I
> observed when triggering the GC from the java console.
> 
>> 
>>> Since we’re using more than one tomcat instance (2 in
>>> production for this website, 1 for development) I can see that
>>> the issue is isolated to Tomcat or the JVM/Host where it runs
>>> because other Tomcat instances behave well at the same time one
>>> is slow. The same JBoss/Postgres backend is used by java
>>> batches and fat clients and it does work well with consistent
>>> times.
>>> 
>>> To clarify: the moment one production tomcat that has been running
>>> for some time finishes the task in 3 seconds, the development
>>> tomcat or a freshly started production tomcat instance does the
>>> same task in less than one second. Note that repeating the task
>>> always gives consistent results, i.e. the instance that has been
>>> running for some time is always slow, the freshly started instance
>>> is always fast.
>>> 
>>> Tomcat is running with these VM options:
>>> 
>>> -Xms20480m -Xmx20480m -Dsun.rmi.dgc.client.gcInterval=360
>>> -Dsun.rmi.dgc.server.gcInterval=360 -Xloggc:/tmp/gc.txt
>>> -Dcom.sun.management.jmxremote
>>> -Dcom.sun.management.jmxremote.port=9012
>>> -Dcom.sun.management.jmxremote.ssl=false
>>> -Dcom.sun.management.jmxremote.authenticate=false
>>> -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+UseG1GC
>>> -XX:ReservedCodeCacheSize=1g -XX:InitialCodeCacheSize=256m
>>> -XX:+UseHugeTLBFS -XX:MetaspaceSize=1g -XX:MaxMetaspaceSize=2g
>>> 
>>> Some of the options have been recently added (for example the
>>> increase in code cache  size) but it seems they had no impact
>>> on the issue.
>>> 
>> 
>> Is there any specific reason to trigger Full GC every 
>> 60min(-Dsun.rmi.dgc.client.gcInterval=360)? I think that's 
>> overkill.
> 
> That’s a value I saw suggested “everywhere” (the default
> is too short). Note that in the GC logs I don’t see full GCs every
> hour, so maybe that option is not triggering a full GC?
> 
> 
>>> Metaspace goes up to 1,6GB before being collected. Value after
>>> garbage collect is around 200MB. Heap usage is variable, it
>>> usually stays under 10G and is around 1G after garbage
>>> collect. CPU usage rarely goes over 10%. Loaded classes between
>>> 20k and 40k. Active sessions around 100/120 for each instance.
>>> 
>> 
>> That sounds like a healthy behavior to me. That means GC is
>> occurring and Metaspace is getting cleaned up. Why do you think
>> Metaspace is the problem?
> 
> Because I can observe that when the metaspace is collected the
> requests become fast. I observed that a few hours ago, looking at
> the metaspace graph of the java console and doing requests just
> after the collect.
> 
>> From the given data, my first impression is that the JVM is not under
>> any memory pressure. If you can post your gc.txt file to this
>> group, I would be happy to check it.
> 
> Thanks, here’s a link for gceasy report:
> http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTcvMDgvMjkvLS1nY3N0YXRzLnR4dC5nei0tMTEtNDktMzQ=
> 
> and the txt log file: http://ge.tt/1Un0DLm2
> Please consider that I triggered a Full GC manually two times on
> this server.
> 
> 
>> 
>> Another thing, is there any way you can measure the processing
>> time in JBoss for each RMI calls?
>> 
> 
> At the moment I’m not measuring the time on the jboss side…

Re: Performance issue 8.5.20 (metaspace related?)

2017-08-29 Thread Mark Thomas
On 29/08/17 13:09, Ing. Andrea Vettori wrote:
>> On 29 Aug 2017, at 12:29, Suvendu Sekhar Mondal  wrote:
>>
>> On Tue, Aug 29, 2017 at 2:54 PM, Ing. Andrea Vettori
>>  wrote:
>>> - with a freshly started tomcat instance, the time it takes is around 0,8 
>>> seconds. Most of the time is spent on the two RMI calls the task does.
>>> - with an instance that has been running for some time, the time can reach 
>>> 2-3 seconds; occasionally 5-6 seconds. Most time is still spent on RMI calls, 
>>> i.e. what slows down are the RMI calls.
>>
>> Silly question: do you mean the RMI calls originating from Tomcat are getting
>> slower with time? Or is JBoss taking longer to return the response?
> 
> Thanks for your help.
> What I see is that the HTTP requests are slower. Looking at the details of 
> that specific request (putting a product in the cart) using System.nanoTime() 
> I can see that most of the time is spent during the RMI calls. I’m pretty sure 
> it’s not jboss that is slower, because other calls made at the same time from 
> fresh tomcats or from a java client are not slow.

I'd try profiling the problematic code. Given the magnitude of the
times, sampling should be sufficient which means you could do this on
production when the problem happens. You'll probably need to configure
it to log JVM internal calls.

I use YourKit (because they give me a free license to use to develop
Tomcat). Other profilers are available.

Mark





Re: Performance issue 8.5.20 (metaspace related?)

2017-08-29 Thread Ing. Andrea Vettori
> On 29 Aug 2017, at 12:29, Suvendu Sekhar Mondal  wrote:
> 
> On Tue, Aug 29, 2017 at 2:54 PM, Ing. Andrea Vettori
>  wrote:
>> - with a freshly started tomcat instance, the time it takes is around 0,8 
>> seconds. Most of the time is spent on the two RMI calls the task does.
>> - with an instance that has been running for some time, the time can reach 
>> 2-3 seconds; occasionally 5-6 seconds. Most time is still spent on RMI calls, 
>> i.e. what slows down are the RMI calls.
> 
> Silly question: do you mean the RMI calls originating from Tomcat are getting
> slower with time? Or is JBoss taking longer to return the response?

Thanks for your help.
What I see is that the HTTP requests are slower. Looking at the details of that 
specific request (putting a product in the cart) using System.nanoTime() I can 
see that most of the time is spent during the RMI calls. I’m pretty sure it’s 
not jboss that is slower, because other calls made at the same time from fresh 
tomcats or from a java client are not slow.
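
For what it’s worth, the client-side measurement described here can be as simple 
as the sketch below; CallTimer and the cartBean call in the usage comment are 
made-up names for illustration, not the actual application code:

    import java.util.concurrent.Callable;

    public final class CallTimer {
        // Wrap a single remote call and report how long it took, using System.nanoTime().
        public static <T> T timed(String label, Callable<T> call) throws Exception {
            long start = System.nanoTime();
            try {
                return call.call(); // e.g. the RMI call into JBoss
            } finally {
                long ms = (System.nanoTime() - start) / 1_000_000L;
                System.out.println(label + " took " + ms + " ms");
            }
        }
    }

    // hypothetical usage:
    // Cart cart = CallTimer.timed("addToCart", () -> cartBean.addProduct(id, qty));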

> 
>> - restarting the jvm fixes the issue
>> - ***it seems*** (I’m still testing this, since there is no ‘metaspace GC 
>> trigger’ command available) that when Metaspace is garbage 
>> collected, tomcat then performs like a fresh instance.
>> 
> 
> Yes, no command to only garbage collect Metaspace area. Without
> recycling Tomcat, you can trigger a Full GC either using VisualVM or
> jcmd's GC.run command.

I think that an explicit GC is not doing a metaspace GC… that’s what I observed 
when triggering the GC from the java console.

> 
>> Since we’re using more than one tomcat instance (2 in production for this 
>> website, 1 for development) I can see that the issue is isolated to Tomcat 
>> or the JVM/Host where it runs because other Tomcat instances behave well at 
>> the same time one is slow. The same JBoss/Postgres backend is used by java 
>> batches and fat clients and it does work well with consistent times.
>> 
>> To clarify: the moment one production tomcat that has been running for some 
>> time finishes the task in 3 seconds, the development tomcat or a freshly 
>> started production tomcat instance does the same task in less than one second. 
>> Note that repeating the task always gives consistent results, i.e. the instance 
>> that has been running for some time is always slow, the freshly started 
>> instance is always fast.
>> 
>> Tomcat is running with these VM options:
>> 
>> -Xms20480m -Xmx20480m -Dsun.rmi.dgc.client.gcInterval=360 
>> -Dsun.rmi.dgc.server.gcInterval=360 -Xloggc:/tmp/gc.txt 
>> -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=9012 
>> -Dcom.sun.management.jmxremote.ssl=false 
>> -Dcom.sun.management.jmxremote.authenticate=false -XX:+PrintGCDetails 
>> -XX:+PrintGCTimeStamps -XX:+UseG1GC -XX:ReservedCodeCacheSize=1g 
>> -XX:InitialCodeCacheSize=256m -XX:+UseHugeTLBFS -XX:MetaspaceSize=1g 
>> -XX:MaxMetaspaceSize=2g
>> 
>> Some of the options have been recently added (for example the increase in 
>> code cache  size) but it seems they had no impact on the issue.
>> 
> 
> Is there any specific reason to trigger Full GC every
> 60min(-Dsun.rmi.dgc.client.gcInterval=360)? I think that's
> overkill.

That’s a value I saw suggested “everywhere” (the default is too 
short). Note that in the GC logs I don’t see full GCs every hour, so maybe that 
option is not triggering a full GC?


>> Metaspace goes up to 1,6GB before being collected. Value after garbage 
>> collect is around 200MB. Heap usage is variable, it usually stays under 10G 
>> and is around 1G after garbage collect.
>> CPU usage rarely goes over 10%. Loaded classes between 20k and 40k. Active 
>> sessions around 100/120 for each instance.
>> 
> 
> That sounds like a healthy behavior to me. That means GC is occurring
> and Metaspace is getting cleaned up. Why do you think Metaspace is the
> problem?

Because I can observe that when the metaspace is collected the requests become 
fast. I observed that a few hours ago, looking at the metaspace graph of the 
java console and doing requests just after the collect.
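
(For anyone who wants to watch the same thing without the graphical console: on 
JDK 8 something like

    jstat -gc <tomcat-pid> 5000

prints GC statistics every five seconds, including the MC/MU columns, i.e. 
metaspace capacity and usage in KB, so the drop after a collection is also 
visible from a shell.)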

> From the given data, my first impression is that the JVM is not under any
> memory pressure. If you can post your gc.txt file to this group, I
> would be happy to check it.

Thanks, here’s a link for gceasy report: 
http://gceasy.io/my-gc-report.jsp?p=c2hhcmVkLzIwMTcvMDgvMjkvLS1nY3N0YXRzLnR4dC5nei0tMTEtNDktMzQ=
 

and the txt log file: http://ge.tt/1Un0DLm2
Please consider that I triggered a Full GC manually two times on this server.


> 
> Another thing, is there any way you can measure the processing time in
> JBoss for each RMI calls?
> 

At the moment I’m not measuring the time on the jboss side but I’m considering 
using a telemetry product in the future. I’m measuring the time on the client 
side (tomcat or fat client).

Thank you!
—
Andrea Vettori



Re: Performance issue 8.5.20 (metaspace related?)

2017-08-29 Thread Suvendu Sekhar Mondal
Hello,

On Tue, Aug 29, 2017 at 2:54 PM, Ing. Andrea Vettori
 wrote:
> Hello,
> W're running several instances of Tomcat 8.5.20 / JDK8.144 / CentOs7 on our 
> company for various web sites in many hosts. Recently I’m trying to 
> understand a performance problem we’re having on our e-commerce web site.
> The configuration is the following
>
> HAProxy   <—> 2x Tomcat 8.5.20 <—>  JBoss 5.1 EJB <—> Postgres 9.6
>
> Tomcat runs a web site built with Struts / Freemarker that does call JBoss 
> EJBs with RMI.
> Monitoring a specific task (putting a product on the cart) I see the 
> following :
>
> - with a fresh started tomcat instance, the time it takes is around 0,8 
> seconds. Most of the time is spent on the two RMI calls the task does.
> - with an instance that is running from some time, the time can reach 2/3 
> seconds; occasionally 5/6 seconds. Most time is still spent on RMI calls. 
> I.e. what slows down are the RMI calls.

Silly question: do you mean the RMI calls originating from Tomcat are getting
slower with time? Or is JBoss taking longer to return the response?

> - restarting the jvm fixes the issue
> - ***it seems*** (I’m still testing this, since there is no ‘metaspace GC 
> trigger’ command available) that when Metaspace is garbage 
> collected, tomcat then performs like a fresh instance.
>

Yes, there is no command to garbage collect only the Metaspace area. Without
recycling Tomcat, you can trigger a Full GC either using VisualVM or
jcmd's GC.run command.
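
(For reference, the jcmd invocation meant here looks like

    jcmd <tomcat-pid> GC.run

where the pid is whatever "jcmd -l" lists for the Tomcat JVM; it asks the target 
JVM to run System.gc().)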

> Since we’re using more than one tomcat instance (2 in production for this 
> website, 1 for development) I can see that the issue is isolated to Tomcat or 
> the JVM/Host where it runs because other Tomcat instances behave well at the 
> same time one is slow. The same JBoss/Postgres backend is used by java 
> batches and fat clients and it does work well with consistent times.
>
> To clarify: the moment one production tomcat that has been running for some 
> time finishes the task in 3 seconds, the development tomcat or a freshly 
> started production tomcat instance does the same task in less than one second. 
> Note that repeating the task always gives consistent results, i.e. the instance 
> that has been running for some time is always slow, the freshly started 
> instance is always fast.
>
> Tomcat is running with these VM options:
>
> -Xms20480m -Xmx20480m -Dsun.rmi.dgc.client.gcInterval=360 
> -Dsun.rmi.dgc.server.gcInterval=360 -Xloggc:/tmp/gc.txt 
> -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=9012 
> -Dcom.sun.management.jmxremote.ssl=false 
> -Dcom.sun.management.jmxremote.authenticate=false -XX:+PrintGCDetails 
> -XX:+PrintGCTimeStamps -XX:+UseG1GC -XX:ReservedCodeCacheSize=1g 
> -XX:InitialCodeCacheSize=256m -XX:+UseHugeTLBFS -XX:MetaspaceSize=1g 
> -XX:MaxMetaspaceSize=2g
>
> Some of the options have been recently added (for example the increase in 
> code cache  size) but it seems they had no impact on the issue.
>

Is there any specific reason to trigger Full GC every
60min(-Dsun.rmi.dgc.client.gcInterval=360)? I think that's
overkill.


> Metaspace goes up to 1,6GB before being collected. Value after garbage 
> collect is around 200MB. Heap usage is variable, it usually stays under 10G 
> and is around 1G after garbage collect.
> CPU usage rarely goes over 10%. Loaded classes between 20k and 40k. Active 
> sessions around 100/120 for each instance.
>

That sounds like a healthy behavior to me. That means GC is occurring
and Metaspace is getting cleaned up. Why do you think Metaspace is the
problem?
From the given data, my first impression is that the JVM is not under any
memory pressure. If you can post your gc.txt file to this group, I
would be happy to check it.

Another thing, is there any way you can measure the processing time in
JBoss for each RMI call?

> Any help or direction to understand what’s causing this is greatly 
> appreciated.
> Thank you
> —
> Ing. Andrea Vettori
>

Thanks!
Suvendu




Performance issue 8.5.20 (metaspace related?)

2017-08-29 Thread Ing. Andrea Vettori
Hello,
We're running several instances of Tomcat 8.5.20 / JDK 8u144 / CentOS 7 at our 
company for various web sites on many hosts. Recently I’ve been trying to 
understand a performance problem we’re having on our e-commerce web site.
The configuration is the following:

HAProxy   <—> 2x Tomcat 8.5.20 <—>  JBoss 5.1 EJB <—> Postgres 9.6

Tomcat runs a web site built with Struts / Freemarker that calls JBoss EJBs 
over RMI.
Monitoring a specific task (putting a product in the cart) I see the following:

- with a freshly started tomcat instance, the time it takes is around 0,8 
seconds. Most of the time is spent on the two RMI calls the task does.
- with an instance that has been running for some time, the time can reach 
2-3 seconds; occasionally 5-6 seconds. Most time is still spent on RMI calls, 
i.e. what slows down are the RMI calls.
- restarting the jvm fixes the issue
- ***it seems*** (I’m still testing this, since there is no ‘metaspace GC 
trigger’ command available) that when Metaspace is garbage 
collected, tomcat then performs like a fresh instance.

Since we’re using more than one tomcat instance (2 in production for this 
website, 1 for development) I can see that the issue is isolated to Tomcat or 
the JVM/Host where it runs, because the other Tomcat instances behave well at 
the same time that one is slow. The same JBoss/Postgres backend is used by java 
batches and fat clients and it works well with consistent times.

To clarify: the moment one production tomcat that has been running for some 
time finishes the task in 3 seconds, the development tomcat or a freshly 
started production tomcat instance does the same task in less than one second. 
Note that repeating the task always gives consistent results, i.e. the instance 
that has been running for some time is always slow, the freshly started 
instance is always fast.

Tomcat is running with these VM options: 

-Xms20480m -Xmx20480m -Dsun.rmi.dgc.client.gcInterval=360 
-Dsun.rmi.dgc.server.gcInterval=360 -Xloggc:/tmp/gc.txt 
-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=9012 
-Dcom.sun.management.jmxremote.ssl=false 
-Dcom.sun.management.jmxremote.authenticate=false -XX:+PrintGCDetails 
-XX:+PrintGCTimeStamps -XX:+UseG1GC -XX:ReservedCodeCacheSize=1g 
-XX:InitialCodeCacheSize=256m -XX:+UseHugeTLBFS -XX:MetaspaceSize=1g 
-XX:MaxMetaspaceSize=2g

Some of the options have been recently added (for example the increase in code 
cache  size) but it seems they had no impact on the issue.

Metaspace goes up to 1,6GB before being collected. Value after garbage collect 
is around 200MB. Heap usage is variable, it usually stays under 10G and is 
around 1G after garbage collect.
CPU usage rarely goes over 10%. Loaded classes between 20k and 40k. Active 
sessions around 100/120 for each instance.

Any help or direction to understand what’s causing this is greatly appreciated.
Thank you
—
Ing. Andrea Vettori