[ https://issues.apache.org/jira/browse/SOLR-9828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Forest Soup updated SOLR-9828: ------------------------------ Description: We are using oracle jdk8u92 64bit. The jvm memory related options: -Xms32768m -Xmx32768m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/mnt/solrdata1/log -XX:+UseG1GC -XX:+PerfDisableSharedMem -XX:+ParallelRefProcEnabled -XX:G1HeapRegionSize=8m -XX:MaxGCPauseMillis=100 -XX:InitiatingHeapOccupancyPercent=35 -XX:+AggressiveOpts -XX:+AlwaysPreTouch -XX:ConcGCThreads=16 -XX:ParallelGCThreads=18 -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/mnt/solrdata1/log -verbose:gc -XX:+PrintHeapAtGC -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime -Xloggc:/mnt/solrdata1/log/solr_gc.log It usually works fine. But recently we met very long stop the world young generation GC pause. Some snippets of the gc log are as below: 2016-11-22T20:43:16.436+0000: 2942054.483: Total time for which application threads were stopped: 0.0005510 seconds, Stopping threads took: 0.0000894 seconds 2016-11-22T20:43:16.463+0000: 2942054.509: Total time for which application threads were stopped: 0.0029195 seconds, Stopping threads took: 0.0000804 seconds {Heap before GC invocations=2246 (full 0): garbage-first heap total 26673152K, used 4683965K [0x00007f0c10000000, 0x00007f0c108065c0, 0x00007f1410000000) region size 8192K, 162 young (1327104K), 17 survivors (139264K) Metaspace used 56487K, capacity 57092K, committed 58368K, reserved 59392K 2016-11-22T20:43:16.555+0000: 2942054.602: [GC pause (G1 Evacuation Pause) (young) Desired survivor size 88080384 bytes, new threshold 15 (max 15) - age 1: 28176280 bytes, 28176280 total - age 2: 5632480 bytes, 33808760 total - age 3: 9719072 bytes, 43527832 total - age 4: 6219408 bytes, 49747240 total - age 5: 4465544 bytes, 54212784 total - age 6: 3417168 bytes, 57629952 total - age 7: 5343072 bytes, 62973024 total - age 8: 2784808 bytes, 65757832 total - age 9: 6538056 bytes, 72295888 total - age 10: 6368016 bytes, 78663904 total - age 11: 695216 bytes, 79359120 total , 97.2044320 secs] [Parallel Time: 19.8 ms, GC Workers: 18] [GC Worker Start (ms): Min: 2942054602.1, Avg: 2942054604.6, Max: 2942054612.7, Diff: 10.6] [Ext Root Scanning (ms): Min: 0.0, Avg: 2.4, Max: 6.7, Diff: 6.7, Sum: 43.5] [Update RS (ms): Min: 0.0, Avg: 3.0, Max: 15.9, Diff: 15.9, Sum: 54.0] [Processed Buffers: Min: 0, Avg: 10.7, Max: 39, Diff: 39, Sum: 192] [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.6] [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0] [Object Copy (ms): Min: 0.1, Avg: 9.2, Max: 13.4, Diff: 13.3, Sum: 165.9] [Termination (ms): Min: 0.0, Avg: 2.5, Max: 2.7, Diff: 2.7, Sum: 44.1] [Termination Attempts: Min: 1, Avg: 1.5, Max: 3, Diff: 2, Sum: 27] [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.0, Sum: 0.6] [GC Worker Total (ms): Min: 9.0, Avg: 17.1, Max: 19.7, Diff: 10.6, Sum: 308.7] [GC Worker End (ms): Min: 2942054621.8, Avg: 2942054621.8, Max: 2942054621.8, Diff: 0.0] [Code Root Fixup: 0.1 ms] [Code Root Purge: 0.0 ms] [Clear CT: 0.2 ms] [Other: 97184.3 ms] [Choose CSet: 0.0 ms] [Ref Proc: 8.5 ms] [Ref Enq: 0.2 ms] [Redirty Cards: 0.2 ms] [Humongous Register: 0.1 ms] [Humongous Reclaim: 0.1 ms] [Free CSet: 0.4 ms] [Eden: 1160.0M(1160.0M)->0.0B(1200.0M) Survivors: 136.0M->168.0M Heap: 4574.2M(25.4G)->3450.8M(26.8G)] Heap after GC invocations=2247 (full 0): garbage-first heap total 28049408K, used 3533601K [0x00007f0c10000000, 0x00007f0c10806b00, 0x00007f1410000000) region size 8192K, 21 young (172032K), 21 survivors (172032K) Metaspace used 56487K, capacity 57092K, committed 58368K, reserved 59392K } [Times: user=0.00 sys=94.28, real=97.19 secs] 2016-11-22T20:44:53.760+0000: 2942151.806: Total time for which application threads were stopped: 97.2053747 seconds, Stopping threads took: 0.0001373 seconds 2016-11-22T20:44:53.762+0000: 2942151.809: Total time for which application threads were stopped: 0.0008138 seconds, Stopping threads took: 0.0001258 seconds And CPU reached near 100% during the GC. The load is normal at that time according to the stats of solr update/select/delete handler and jetty request log. was: We are using oracle jdk8u92 64bit. The jvm memory related options: -Xms32768m -Xmx32768m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/mnt/solrdata1/log -XX:+UseG1GC -XX:+PerfDisableSharedMem -XX:+ParallelRefProcEnabled -XX:G1HeapRegionSize=8m -XX:MaxGCPauseMillis=100 -XX:InitiatingHeapOccupancyPercent=35 -XX:+AggressiveOpts -XX:+AlwaysPreTouch -XX:ConcGCThreads=16 -XX:ParallelGCThreads=18 -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/mnt/solrdata1/log -verbose:gc -XX:+PrintHeapAtGC -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime -Xloggc:/mnt/solrdata1/log/solr_gc.log It usually works fine. But recently we met very long stop the world young generation GC pause. Some snippets of the gc log are as below: 2016-11-22T20:43:16.436+0000: 2942054.483: Total time for which application threads were stopped: 0.0005510 seconds, Stopping threads took: 0.0000894 seconds 2016-11-22T20:43:16.463+0000: 2942054.509: Total time for which application threads were stopped: 0.0029195 seconds, Stopping threads took: 0.0000804 seconds {Heap before GC invocations=2246 (full 0): garbage-first heap total 26673152K, used 4683965K [0x00007f0c10000000, 0x00007f0c108065c0, 0x00007f1410000000) region size 8192K, 162 young (1327104K), 17 survivors (139264K) Metaspace used 56487K, capacity 57092K, committed 58368K, reserved 59392K 2016-11-22T20:43:16.555+0000: 2942054.602: [GC pause (G1 Evacuation Pause) (young) Desired survivor size 88080384 bytes, new threshold 15 (max 15) - age 1: 28176280 bytes, 28176280 total - age 2: 5632480 bytes, 33808760 total - age 3: 9719072 bytes, 43527832 total - age 4: 6219408 bytes, 49747240 total - age 5: 4465544 bytes, 54212784 total - age 6: 3417168 bytes, 57629952 total - age 7: 5343072 bytes, 62973024 total - age 8: 2784808 bytes, 65757832 total - age 9: 6538056 bytes, 72295888 total - age 10: 6368016 bytes, 78663904 total - age 11: 695216 bytes, 79359120 total , 97.2044320 secs] [Parallel Time: 19.8 ms, GC Workers: 18] [GC Worker Start (ms): Min: 2942054602.1, Avg: 2942054604.6, Max: 2942054612.7, Diff: 10.6] [Ext Root Scanning (ms): Min: 0.0, Avg: 2.4, Max: 6.7, Diff: 6.7, Sum: 43.5] [Update RS (ms): Min: 0.0, Avg: 3.0, Max: 15.9, Diff: 15.9, Sum: 54.0] [Processed Buffers: Min: 0, Avg: 10.7, Max: 39, Diff: 39, Sum: 192] [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.6] [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0] [Object Copy (ms): Min: 0.1, Avg: 9.2, Max: 13.4, Diff: 13.3, Sum: 165.9] [Termination (ms): Min: 0.0, Avg: 2.5, Max: 2.7, Diff: 2.7, Sum: 44.1] [Termination Attempts: Min: 1, Avg: 1.5, Max: 3, Diff: 2, Sum: 27] [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.0, Sum: 0.6] [GC Worker Total (ms): Min: 9.0, Avg: 17.1, Max: 19.7, Diff: 10.6, Sum: 308.7] [GC Worker End (ms): Min: 2942054621.8, Avg: 2942054621.8, Max: 2942054621.8, Diff: 0.0] [Code Root Fixup: 0.1 ms] [Code Root Purge: 0.0 ms] [Clear CT: 0.2 ms] [Other: 97184.3 ms] [Choose CSet: 0.0 ms] [Ref Proc: 8.5 ms] [Ref Enq: 0.2 ms] [Redirty Cards: 0.2 ms] [Humongous Register: 0.1 ms] [Humongous Reclaim: 0.1 ms] [Free CSet: 0.4 ms] [Eden: 1160.0M(1160.0M)->0.0B(1200.0M) Survivors: 136.0M->168.0M Heap: 4574.2M(25.4G)->3450.8M(26.8G)] Heap after GC invocations=2247 (full 0): garbage-first heap total 28049408K, used 3533601K [0x00007f0c10000000, 0x00007f0c10806b00, 0x00007f1410000000) region size 8192K, 21 young (172032K), 21 survivors (172032K) Metaspace used 56487K, capacity 57092K, committed 58368K, reserved 59392K } [Times: user=0.00 sys=94.28, real=97.19 secs] 2016-11-22T20:44:53.760+0000: 2942151.806: Total time for which application threads were stopped: 97.2053747 seconds, Stopping threads took: 0.0001373 seconds 2016-11-22T20:44:53.762+0000: 2942151.809: Total time for which application threads were stopped: 0.0008138 seconds, Stopping threads took: 0.0001258 seconds And CPU reached near 100% during the GC. The load is not visibly high at that time. > Very long young generation stop the world GC pause > --------------------------------------------------- > > Key: SOLR-9828 > URL: https://issues.apache.org/jira/browse/SOLR-9828 > Project: Solr > Issue Type: Bug > Security Level: Public(Default Security Level. Issues are Public) > Affects Versions: 5.3.2 > Environment: Linux Redhat 64bit > Reporter: Forest Soup > > We are using oracle jdk8u92 64bit. > The jvm memory related options: > -Xms32768m > -Xmx32768m > -XX:+HeapDumpOnOutOfMemoryError > -XX:HeapDumpPath=/mnt/solrdata1/log > -XX:+UseG1GC > -XX:+PerfDisableSharedMem > -XX:+ParallelRefProcEnabled > -XX:G1HeapRegionSize=8m > -XX:MaxGCPauseMillis=100 > -XX:InitiatingHeapOccupancyPercent=35 > -XX:+AggressiveOpts > -XX:+AlwaysPreTouch > -XX:ConcGCThreads=16 > -XX:ParallelGCThreads=18 > -XX:+HeapDumpOnOutOfMemoryError > -XX:HeapDumpPath=/mnt/solrdata1/log > -verbose:gc > -XX:+PrintHeapAtGC > -XX:+PrintGCDetails > -XX:+PrintGCDateStamps > -XX:+PrintGCTimeStamps > -XX:+PrintTenuringDistribution > -XX:+PrintGCApplicationStoppedTime > -Xloggc:/mnt/solrdata1/log/solr_gc.log > It usually works fine. But recently we met very long stop the world young > generation GC pause. Some snippets of the gc log are as below: > 2016-11-22T20:43:16.436+0000: 2942054.483: Total time for which application > threads were stopped: 0.0005510 seconds, Stopping threads took: 0.0000894 > seconds > 2016-11-22T20:43:16.463+0000: 2942054.509: Total time for which application > threads were stopped: 0.0029195 seconds, Stopping threads took: 0.0000804 > seconds > {Heap before GC invocations=2246 (full 0): > garbage-first heap total 26673152K, used 4683965K [0x00007f0c10000000, > 0x00007f0c108065c0, 0x00007f1410000000) > region size 8192K, 162 young (1327104K), 17 survivors (139264K) > Metaspace used 56487K, capacity 57092K, committed 58368K, reserved > 59392K > 2016-11-22T20:43:16.555+0000: 2942054.602: [GC pause (G1 Evacuation Pause) > (young) > Desired survivor size 88080384 bytes, new threshold 15 (max 15) > - age 1: 28176280 bytes, 28176280 total > - age 2: 5632480 bytes, 33808760 total > - age 3: 9719072 bytes, 43527832 total > - age 4: 6219408 bytes, 49747240 total > - age 5: 4465544 bytes, 54212784 total > - age 6: 3417168 bytes, 57629952 total > - age 7: 5343072 bytes, 62973024 total > - age 8: 2784808 bytes, 65757832 total > - age 9: 6538056 bytes, 72295888 total > - age 10: 6368016 bytes, 78663904 total > - age 11: 695216 bytes, 79359120 total > , 97.2044320 secs] > [Parallel Time: 19.8 ms, GC Workers: 18] > [GC Worker Start (ms): Min: 2942054602.1, Avg: 2942054604.6, Max: > 2942054612.7, Diff: 10.6] > [Ext Root Scanning (ms): Min: 0.0, Avg: 2.4, Max: 6.7, Diff: 6.7, Sum: > 43.5] > [Update RS (ms): Min: 0.0, Avg: 3.0, Max: 15.9, Diff: 15.9, Sum: 54.0] > [Processed Buffers: Min: 0, Avg: 10.7, Max: 39, Diff: 39, Sum: 192] > [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.6] > [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: > 0.0] > [Object Copy (ms): Min: 0.1, Avg: 9.2, Max: 13.4, Diff: 13.3, Sum: > 165.9] > [Termination (ms): Min: 0.0, Avg: 2.5, Max: 2.7, Diff: 2.7, Sum: 44.1] > [Termination Attempts: Min: 1, Avg: 1.5, Max: 3, Diff: 2, Sum: 27] > [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.0, Sum: > 0.6] > [GC Worker Total (ms): Min: 9.0, Avg: 17.1, Max: 19.7, Diff: 10.6, Sum: > 308.7] > [GC Worker End (ms): Min: 2942054621.8, Avg: 2942054621.8, Max: > 2942054621.8, Diff: 0.0] > [Code Root Fixup: 0.1 ms] > [Code Root Purge: 0.0 ms] > [Clear CT: 0.2 ms] > [Other: 97184.3 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 8.5 ms] > [Ref Enq: 0.2 ms] > [Redirty Cards: 0.2 ms] > [Humongous Register: 0.1 ms] > [Humongous Reclaim: 0.1 ms] > [Free CSet: 0.4 ms] > [Eden: 1160.0M(1160.0M)->0.0B(1200.0M) Survivors: 136.0M->168.0M Heap: > 4574.2M(25.4G)->3450.8M(26.8G)] > Heap after GC invocations=2247 (full 0): > garbage-first heap total 28049408K, used 3533601K [0x00007f0c10000000, > 0x00007f0c10806b00, 0x00007f1410000000) > region size 8192K, 21 young (172032K), 21 survivors (172032K) > Metaspace used 56487K, capacity 57092K, committed 58368K, reserved > 59392K > } > [Times: user=0.00 sys=94.28, real=97.19 secs] > 2016-11-22T20:44:53.760+0000: 2942151.806: Total time for which application > threads were stopped: 97.2053747 seconds, Stopping threads took: 0.0001373 > seconds > 2016-11-22T20:44:53.762+0000: 2942151.809: Total time for which application > threads were stopped: 0.0008138 seconds, Stopping threads took: 0.0001258 > seconds > And CPU reached near 100% during the GC. > The load is normal at that time according to the stats of solr > update/select/delete handler and jetty request log. -- This message was sent by Atlassian JIRA (v6.3.4#6332) --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org