Hi all,

a few of you also replied by private mail, thanks. Supplying the -Xloggc option 
worked, although I don't understand where the info went before, and the log 
rotation option doesn't rotate anything but just truncates the log file on 
every JVM restart, which sucks. But ok, I have some data.

Below you find the log of one session. Users reported bad response times that 
correspond with the Full GC runs, but also very bad response times from 17:00 
onward where there is often no GC activity logged at the times that users 
experienced waiting time. So I'm not entirely sure if this is all caused by GC.

Anyway, heap sizes of up to 60 GByte are apparently not so common, at least not 
with WebObjects. I found a helpful article [1] about very large JVM heaps, 
including the hint to the Zing Azul JVM, which features a stopless garbage 
collector [3], but costs $8000 per server and year. In the end, we may actually 
be trying that if other options fail, but first I need to understand more about 
what's in memory. I'm currently approaching that question via JProfiler, 
especially to find out if that's all snapshot cache (how would I limit that in 
size?) or some other homemade memory leak.

Does anyone have experience with the combination of Azul and WO?

I also found one strange event today where the application appeared stuck, but 
continued after a minute or two, while the gc log file was truncated at that 
time, without the JVM restarting. I can't see how that's supposed to happen. 
Here's how that looked in my shell running "tail -F gc.log":


[PSYoungGen: 15256082K->3431947K(16962048K)] 25580475K->13801928K(27895808K), 
1.1505415 secs] [Times: user=6.00 sys=0.03, real=1.15 secs]
2016-01-26T12:11:38.640+0100: 26136.407: [GC (Allocation Failure)
Desired survivor size 4679794688 bytes, new threshold 15 (max 15)
[PSYoungGen: 15791627K->3414349K(16958976K)] 26161608K->13935770K(27892736K), 
1.0761646 secs] [Times: user=6.10 sys=0.17, real=1.08 secs]
2016-01-26T12:16:10.835+0100: 26408.603: [GC (Allocation Failure)
Desired survivor size 4657250304 bytes, new threshold 15 (max 15)
[PSYoungGen: 15774029K->3449938K(17397760K)] 26295450K->14128338K(28331520K), 
1.3285101 secs] [Times: user=6.24 sys=1.39, real=1.33 secs]
2016-01-26T12:16:12.164+0100: 26409.931: [Full GC (Ergonomics) [PSYoungGen: 
3449938K->2355719K(17397760K)] [ParOldGen: 10678400K->10933508K(13649408K)] 
14128338K->13289227K(31047168K), [Metaspace: 84345K->84345K(94208K)], 
12.1170523 secs] [Times: user=67.98 sys=1.03, real=12.11 secs]
2016-01-26T12:21:18.990+0100: 26716.757: [GC (Allocation Failure)
Desired survivor size 4644667392 bytes, new threshold 15 (max 15)
[PSYoungGen: 15218183K->2398851K(17410560K)] 26151691K->13458637K(31059968K), 
1.3389149 secs] [Times: user=8.72 sys=0.17, real=1.34 secs]
tail: gc.log: file truncated
Java HotSpot(TM) 64-Bit Server VM (25.66-b17) for linux-amd64 JRE 
(1.8.0_66-b17), built on Oct  6 2015 17:28:34 by "java_re" with gcc 4.3.0 
20080428 (Red Hat 4.3.0-8)
Memory: 4k page, physical 99008820k(13655516k free), swap 67108860k(66990156k 
free)
CommandLine flags: -XX:InitialHeapSize=1584141120 -XX:+ManagementServer 
-XX:MaxHeapSize=90194313216 -XX:+PrintGC -XX:+PrintGCDateStamps 
-XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintTenuringDistribution 
-XX:-UseGCLogFileRotation -XX:+UseParallelGC
2016-01-26T12:33:36.029+0100: 27453.796: [GC (Allocation Failure)
Desired survivor size 4588568576 bytes, new threshold 15 (max 15)
[PSYoungGen: 15261315K->2538940K(17720832K)] 26321101K->13670347K(31370240K), 
0.7936427 secs] [Times: user=4.62 sys=0.03, real=0.80 secs]
2016-01-26T12:39:29.156+0100: 27806.923: [GC (Allocation Failure)


And here's the above-mentioned full gc log of 4 hours during the most critical 
usage time where users experienced waits. Notable wait times according to 
user's notes occurred at least at the following times: 15:07, 16:05, 16:57, 
17:00-17:35, 17:59.


2016-01-25T15:06:54.344+0100: 36634.642: [GC (Allocation Failure) [PSYoungGen: 
18701268K->2058545K(23079424K)] 33613120K->17079754K(41686016K), 0.8635088 
secs] [Times: user=4.22 sys=0.50, real=0.86 secs] 
2016-01-25T15:15:48.438+0100: 37168.736: [GC (Allocation Failure) [PSYoungGen: 
19683633K->2231668K(23157248K)] 34704842K->17370085K(41763840K), 0.9898285 
secs] [Times: user=4.98 sys=0.05, real=0.99 secs] 
2016-01-25T15:24:54.891+0100: 37715.189: [GC (Allocation Failure) [PSYoungGen: 
19856756K->2254238K(23623168K)] 34995173K->17626274K(42229760K), 1.0030974 
secs] [Times: user=4.91 sys=0.30, real=1.00 secs] 
2016-01-25T15:33:58.784+0100: 38259.082: [GC (Allocation Failure) [PSYoungGen: 
20552606K->2502793K(23740416K)] 35924642K->17914660K(42347008K), 1.0203825 
secs] [Times: user=5.32 sys=0.24, real=1.02 secs] 
2016-01-25T15:43:12.959+0100: 38813.256: [GC (Allocation Failure) [PSYoungGen: 
20801161K->2677482K(23608320K)] 36213028K->18138909K(42214912K), 1.2300001 
secs] [Times: user=5.53 sys=0.22, real=1.23 secs] 
2016-01-25T15:51:16.686+0100: 39296.983: [GC (Allocation Failure) [PSYoungGen: 
21276394K->2800542K(23760384K)] 36737821K->18411609K(42366976K), 1.1514720 
secs] [Times: user=6.19 sys=0.15, real=1.15 secs] 
2016-01-25T15:58:35.696+0100: 39735.994: [GC (Allocation Failure) [PSYoungGen: 
21399454K->2938402K(23849472K)] 37010521K->18700024K(42456064K), 2.8563665 
secs] [Times: user=6.23 sys=8.99, real=2.86 secs] 
2016-01-25T16:05:33.310+0100: 40153.607: [GC (System.gc()) [PSYoungGen: 
20526288K->2986997K(22197760K)] 36287910K->18889778K(40804352K), 1.2626979 
secs] [Times: user=6.22 sys=1.00, real=1.27 secs] 
2016-01-25T16:05:34.572+0100: 40154.870: [Full GC (System.gc()) [PSYoungGen: 
2986997K->0K(22197760K)] [ParOldGen: 15902781K->16935926K(18606592K)] 
18889778K->16935926K(40804352K), [Metaspace: 90226K->90226K(100352K)], 
15.0685505 secs] [Times: user=86.67 sys=7.55, real=15.06 secs] 
2016-01-25T16:13:35.766+0100: 40636.064: [GC (Allocation Failure) [PSYoungGen: 
19210752K->419289K(23559168K)] 36146678K->17355224K(42165760K), 0.5276242 secs] 
[Times: user=2.41 sys=0.00, real=0.53 secs] 
2016-01-25T16:22:46.735+0100: 41187.033: [GC (Allocation Failure) [PSYoungGen: 
19353561K->740457K(23835648K)] 36289496K->17676391K(42442240K), 0.6634034 secs] 
[Times: user=2.52 sys=0.00, real=0.66 secs] 
2016-01-25T16:30:31.503+0100: 41651.801: [GC (Allocation Failure) [PSYoungGen: 
19674729K->1023572K(24313856K)] 36610663K->17959507K(42920448K), 0.8099960 
secs] [Times: user=2.72 sys=0.08, real=0.81 secs] 
2016-01-25T16:37:51.095+0100: 42091.393: [GC (Allocation Failure) [PSYoungGen: 
20311636K->1263580K(20551680K)] 37247571K->18199515K(39158272K), 0.7992434 
secs] [Times: user=3.32 sys=0.01, real=0.80 secs] 
2016-01-25T16:49:34.671+0100: 42794.969: [GC (Allocation Failure) [PSYoungGen: 
20551644K->1565656K(24408576K)] 37487579K->18501598K(43015168K), 0.8773248 
secs] [Times: user=3.74 sys=0.00, real=0.87 secs] 
2016-01-25T16:56:34.541+0100: 43214.839: [GC (Allocation Failure) [PSYoungGen: 
21011416K->1791681K(21237760K)] 37947358K->18727623K(39844352K), 0.7816826 
secs] [Times: user=4.03 sys=0.00, real=0.79 secs] 
2016-01-25T17:01:31.494+0100: 43511.792: [GC (Allocation Failure) [PSYoungGen: 
21237441K->1997601K(24417280K)] 38173383K->18933551K(43023872K), 0.8769351 
secs] [Times: user=4.45 sys=0.09, real=0.88 secs] 
2016-01-25T17:07:13.856+0100: 43854.154: [GC (Allocation Failure) [PSYoungGen: 
21797153K->2080272K(24207360K)] 38733103K->19016222K(42813952K), 0.8543820 
secs] [Times: user=4.50 sys=0.00, real=0.86 secs] 
2016-01-25T17:12:37.866+0100: 44178.164: [GC (Allocation Failure) [PSYoungGen: 
21879824K->2170000K(24282112K)] 38815774K->19105950K(42888704K), 0.8236605 
secs] [Times: user=4.92 sys=0.00, real=0.82 secs] 
2016-01-25T17:18:05.977+0100: 44506.275: [GC (Allocation Failure) [PSYoungGen: 
22576272K->2299192K(24404992K)] 39512222K->19235143K(43011584K), 0.7759199 
secs] [Times: user=4.77 sys=0.00, real=0.78 secs] 
2016-01-25T17:28:11.288+0100: 45111.586: [GC (Allocation Failure) [PSYoungGen: 
22705464K->2550874K(24838656K)] 39641415K->19486825K(43445248K), 1.3544459 
secs] [Times: user=5.52 sys=1.93, real=1.36 secs] 
2016-01-25T17:35:39.350+0100: 45559.648: [GC (Allocation Failure) [PSYoungGen: 
23578202K->2665756K(24871424K)] 40514153K->19601715K(43478016K), 1.0373681 
secs] [Times: user=5.67 sys=0.04, real=1.04 secs] 
2016-01-25T17:43:31.775+0100: 46032.073: [GC (Allocation Failure) [PSYoungGen: 
23693084K->2891569K(25351168K)] 40629043K->19827535K(43957760K), 2.8873970 
secs] [Times: user=5.28 sys=11.09, real=2.89 secs] 
2016-01-25T17:51:22.218+0100: 46502.516: [GC (Allocation Failure) [PSYoungGen: 
24390449K->3020322K(24519680K)] 41326415K->19956296K(43126272K), 1.0685985 
secs] [Times: user=6.17 sys=0.00, real=1.07 secs] 
2016-01-25T18:00:19.997+0100: 47040.295: [GC (Allocation Failure) [PSYoungGen: 
24519202K->3177498K(25069568K)] 41455176K->20113480K(43676160K), 1.3560052 
secs] [Times: user=6.39 sys=0.34, real=1.36 secs] 
2016-01-25T18:09:14.251+0100: 47574.548: [GC (Allocation Failure) [PSYoungGen: 
24215578K->3059878K(24098304K)] 41151560K->20219258K(42704896K), 1.5082240 
secs] [Times: user=6.62 sys=2.86, real=1.50 secs] 
2016-01-25T18:15:01.783+0100: 47922.080: [GC (Allocation Failure) [PSYoungGen: 
24097958K->2809652K(25209856K)] 41257338K->20232628K(43816448K), 1.8871482 
secs] [Times: user=6.38 sys=6.59, real=1.88 secs] 
2016-01-25T18:24:10.926+0100: 48471.223: [GC (Allocation Failure) [PSYoungGen: 
23806772K->2757610K(23754752K)] 41229748K->20472641K(42361344K), 1.8807114 
secs] [Times: user=6.50 sys=6.05, real=1.88 secs] 
2016-01-25T18:32:11.039+0100: 48951.336: [GC (Allocation Failure) [PSYoungGen: 
23754730K->2664985K(25179136K)] 41469761K->20610485K(43785728K), 1.6722986 
secs] [Times: user=5.99 sys=4.89, real=1.68 secs] 
2016-01-25T18:32:33.693+0100: 48973.991: [GC (System.gc()) [PSYoungGen: 
5342511K->2261788K(25064448K)] 23288012K->20489350K(43671040K), 1.4310250 secs] 
[Times: user=5.08 sys=5.69, real=1.43 secs] 
2016-01-25T18:32:35.124+0100: 48975.422: [Full GC (System.gc()) [PSYoungGen: 
2261788K->373549K(25064448K)] [ParOldGen: 18227562K->18606102K(18606592K)] 
20489350K->18979651K(43671040K), [Metaspace: 94654K->94654K(104448K)], 
17.7869839 secs] [Times: user=109.81 sys=4.10, real=17.79 secs] 
2016-01-25T18:38:01.155+0100: 49301.452: [Full GC (Ergonomics) [PSYoungGen: 
21485357K->0K(25064448K)] [ParOldGen: 18606102K->17803897K(22484992K)] 
40091459K->17803897K(47549440K), [Metaspace: 94728K->94728K(104448K)], 
11.2804123 secs] [Times: user=71.57 sys=1.84, real=11.28 secs] 
2016-01-25T18:49:54.421+0100: 50014.719: [GC (Allocation Failure) [PSYoungGen: 
21111808K->310417K(24994816K)] 38915705K->18114322K(47479808K), 0.4822526 secs] 
[Times: user=1.49 sys=0.00, real=0.49 secs] 
2016-01-25T19:02:11.033+0100: 50751.331: [GC (Allocation Failure) [PSYoungGen: 
21435537K->650681K(21775872K)] 39239442K->18454586K(44260864K), 0.4177362 secs] 
[Times: user=1.93 sys=0.01, real=0.42 secs] 
2016-01-25T19:09:54.731+0100: 51215.028: [GC (Allocation Failure) [PSYoungGen: 
21775801K->1405472K(25216000K)] 39579706K->19209377K(47700992K), 0.6537367 
secs] [Times: user=3.92 sys=0.00, real=0.65 secs] 


As always, I'm thankful for any more hints. Please reply publicly so everyone 
else can benefit from the answers in the future.


Thanks
Maik


[1] 
http://stackoverflow.com/questions/214362/java-very-large-heap-sizes#6340745 
<http://stackoverflow.com/questions/214362/java-very-large-heap-sizes#6340745>
[2] https://www.azul.com/products/zing/ <https://www.azul.com/products/zing/>
[3] http://www.infoq.com/articles/azul_gc_in_detail 
<http://www.infoq.com/articles/azul_gc_in_detail>


> Am 24.01.2016 um 21:01 schrieb Bogdan Zlatanov <bogdan.zlata...@gmail.com>:
> 
> Hi again,
> 
> Try redirecting all the GC output in it's own file with the -Xloggc: 
> /path/to/gc/log option.
> 
> Cheers,
> Bogdan
> 
> On 24/01/2016 20:19, Maik Musall wrote:
>> Hi,
>> 
>> yes, I tried, but the documented log output would never appear anywhere in 
>> my logs. These are the options I'm using:
>> 
>> -verbose:gc
>> -XX:+PrintGCDetails
>> -XX:+PrintGCDateStamps
>> 
>> Despite these settings, I can't see ANYTHING in the logs about GC. I don't 
>> know what I'm doing wrong there?
>> 
>> Maik
>> 
>>> Am 24.01.2016 um 13:27 schrieb Bogdan Zlatanov <bogdan.zlata...@gmail.com> 
>>> <mailto:bogdan.zlata...@gmail.com>:
>>> 
>>> Hi,
>>> 
>>> Have you tried tracing the garbage collection -> 
>>> https://docs.oracle.com/cd/E19159-01/819-3681/abeig/index.html 
>>> <https://docs.oracle.com/cd/E19159-01/819-3681/abeig/index.html>?
>>> 
>>> Cheers,
>>> Bogdan
>>> 
>>>> On 24/01/2016 13:01, Musall Maik wrote:
>>>> Hi,
>>>> 
>>>> I have an application that often becomes laggy and shows slow responses 
>>>> almost every day around 5 pm. The instance has daily scheduled restarts at 
>>>> 5 am, so it occurs after 12 hours of use. When I restart it preemptively 
>>>> at 3 pm, the problem does not occur. When I don't, users suffer bad 
>>>> response times on many requests, but not all.
>>>> 
>>>> Users are computing heavy statistics in this application, instance size 
>>>> (mostly snapshot cache) at that time is usually about 50 GByte, while Xmx 
>>>> is 84 GByte, and XX:MaxPermSize is 1024m. The biggest instance size I've 
>>>> seen so far in production is 61 GByte, so 50 seems not like a limit where 
>>>> something weird happens.
>>>> 
>>>> CPU, memory, filesystem, database and network all show nothing special 
>>>> before and after 5 pm. I monitored the instance with jvisualvm to no 
>>>> avail. I've been searching for the reason for quite some time now, so I'd 
>>>> appreciate any suggestion you might have what else to check.
>>>> 
>>>> Thanks
>>>> Maik
>>>> 
>>>> 
>>>> _______________________________________________
>>>> Do not post admin requests to the list. They will be ignored.
>>>> Webobjects-dev mailing list      (Webobjects-dev@lists.apple.com 
>>>> <mailto:Webobjects-dev@lists.apple.com>)
>>>> Help/Unsubscribe/Update your Subscription:
>>>> https://lists.apple.com/mailman/options/webobjects-dev/bogdan.zlatanov%40gmail.com
>>>>  
>>>> <https://lists.apple.com/mailman/options/webobjects-dev/bogdan.zlatanov%40gmail.com>
>>>> 
>>>> This email sent to bogdan.zlata...@gmail.com 
>>>> <mailto:bogdan.zlata...@gmail.com>
>>> _______________________________________________
>>> Do not post admin requests to the list. They will be ignored.
>>> Webobjects-dev mailing list      (Webobjects-dev@lists.apple.com 
>>> <mailto:Webobjects-dev@lists.apple.com>)
>>> Help/Unsubscribe/Update your Subscription:
>>> https://lists.apple.com/mailman/options/webobjects-dev/maik%40selbstdenker.ag
>>>  
>>> <https://lists.apple.com/mailman/options/webobjects-dev/maik%40selbstdenker.ag>
>>> 
>>> This email sent to m...@selbstdenker.ag <mailto:m...@selbstdenker.ag>
>>  _______________________________________________
>> Do not post admin requests to the list. They will be ignored.
>> Webobjects-dev mailing list      (Webobjects-dev@lists.apple.com 
>> <mailto:Webobjects-dev@lists.apple.com>)
>> Help/Unsubscribe/Update your Subscription:
>> https://lists.apple.com/mailman/options/webobjects-dev/bogdan.zlatanov%40gmail.com
>>  
>> <https://lists.apple.com/mailman/options/webobjects-dev/bogdan.zlatanov%40gmail.com>
>> 
>> This email sent to bogdan.zlata...@gmail.com 
>> <mailto:bogdan.zlata...@gmail.com>
> _______________________________________________
> Do not post admin requests to the list. They will be ignored.
> Webobjects-dev mailing list      (Webobjects-dev@lists.apple.com)
> Help/Unsubscribe/Update your Subscription:
> https://lists.apple.com/mailman/options/webobjects-dev/maik%40selbstdenker.ag
> 
> This email sent to m...@selbstdenker.ag

 _______________________________________________
Do not post admin requests to the list. They will be ignored.
Webobjects-dev mailing list      (Webobjects-dev@lists.apple.com)
Help/Unsubscribe/Update your Subscription:
https://lists.apple.com/mailman/options/webobjects-dev/archive%40mail-archive.com

This email sent to arch...@mail-archive.com

Reply via email to