[ 
https://issues.apache.org/jira/browse/FLINK-36394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

JieTan updated FLINK-36394:
---------------------------
    Description: 
We should set the initial size of the JVM Metaspace to a sane default, such as 
{{-XX:MetaspaceSize=128m}}.

When a Flink job starts, it loads a large amount of class metadata, including 
class structure, method, and field information. Because the initial Metaspace 
threshold is small by default, the JVM runs a Full GC each time the loaded 
metadata crosses the current threshold, so Full GCs are frequent during 
startup. Sometimes the Full GCs triggered by the Metadata GC Threshold take 
3.7s before the Akka system is started.

 
{code:java}
2024-07-23T20:08:40.554+0800: 1.801: [Full GC (Metadata GC Threshold) 2024-07-23T20:08:40.554+0800: 1.801: [Tenured: 0K->14029K(966656K), 0.0336205 secs] 108273K->14029K(1401664K), [Metaspace: 20594K->20594K(1067008K)], 0.0337494 secs] [Times: user=0.02 sys=0.00, real=0.04 secs]
2024-07-23T20:08:42.052+0800: 3.300: [Full GC (Metadata GC Threshold) 2024-07-23T20:08:42.052+0800: 3.300: [Tenured: 14029K->28979K(966656K), 0.0503497 secs] 184494K->28979K(1401664K), [Metaspace: 34384K->34384K(1079296K)], 0.0505108 secs] [Times: user=0.04 sys=0.00, real=0.05 secs]
2024-07-23T20:08:45.064+0800: 6.312: [Full GC (Metadata GC Threshold) 2024-07-23T20:08:45.064+0800: 6.312: [Tenured: 28979K->54550K(966656K), 0.0876744 secs] 248437K->54550K(1401664K), [Metaspace: 57015K->57015K(1099776K)], 0.0879053 secs] [Times: user=0.07 sys=0.01, real=0.09 secs]
{code}
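
To quantify how often these collections happen in a longer log, the entries above can be scanned programmatically. The following is only a sketch: the class {{MetadataGcLogScan}} and its methods are hypothetical (not part of Flink), and the regular expression assumes the JDK 8 {{-XX:+PrintGCDetails}} layout shown in this log, so other collectors or JDK versions would likely need a different pattern.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of Flink: finds Metaspace-triggered Full GCs in a JDK 8 GC log. */
final class MetadataGcLogScan {

    /** One "Full GC (Metadata GC Threshold)" entry. */
    static final class Event {
        final double uptimeSecs;     // JVM uptime when the collection started
        final long metaspaceUsedKb;  // Metaspace usage before the collection
        final double pauseSecs;      // total duration reported for the Full GC

        Event(double uptimeSecs, long metaspaceUsedKb, double pauseSecs) {
            this.uptimeSecs = uptimeSecs;
            this.metaspaceUsedKb = metaspaceUsedKb;
            this.pauseSecs = pauseSecs;
        }
    }

    // Assumption: entries follow the layout of the log quoted in this issue.
    private static final Pattern FULL_GC = Pattern.compile(
            "(\\d+\\.\\d+):\\s+\\[Full GC \\(Metadata GC Threshold\\)"
                    + ".*?\\[Metaspace:\\s+(\\d+)K->\\d+K\\(\\d+K\\)\\],\\s+(\\d+\\.\\d+)\\s+secs\\]",
            Pattern.DOTALL);

    // The three entries from the log in this issue, one per line.
    static final String SAMPLE =
            "2024-07-23T20:08:40.554+0800: 1.801: [Full GC (Metadata GC Threshold) "
                    + "2024-07-23T20:08:40.554+0800: 1.801: [Tenured: 0K->14029K(966656K), 0.0336205 secs] "
                    + "108273K->14029K(1401664K), [Metaspace: 20594K->20594K(1067008K)], 0.0337494 secs] "
                    + "[Times: user=0.02 sys=0.00, real=0.04 secs]\n"
                    + "2024-07-23T20:08:42.052+0800: 3.300: [Full GC (Metadata GC Threshold) "
                    + "2024-07-23T20:08:42.052+0800: 3.300: [Tenured: 14029K->28979K(966656K), 0.0503497 secs] "
                    + "184494K->28979K(1401664K), [Metaspace: 34384K->34384K(1079296K)], 0.0505108 secs] "
                    + "[Times: user=0.04 sys=0.00, real=0.05 secs]\n"
                    + "2024-07-23T20:08:45.064+0800: 6.312: [Full GC (Metadata GC Threshold) "
                    + "2024-07-23T20:08:45.064+0800: 6.312: [Tenured: 28979K->54550K(966656K), 0.0876744 secs] "
                    + "248437K->54550K(1401664K), [Metaspace: 57015K->57015K(1099776K)], 0.0879053 secs] "
                    + "[Times: user=0.07 sys=0.01, real=0.09 secs]\n";

    /** Returns every Metadata GC Threshold Full GC found in the given log text, in order. */
    static List<Event> scan(String log) {
        List<Event> events = new ArrayList<>();
        Matcher m = FULL_GC.matcher(log);
        while (m.find()) {
            events.add(new Event(
                    Double.parseDouble(m.group(1)),
                    Long.parseLong(m.group(2)),
                    Double.parseDouble(m.group(3))));
        }
        return events;
    }

    /** Sums the reported Full GC durations. */
    static double totalPauseSecs(List<Event> events) {
        double total = 0.0;
        for (Event e : events) {
            total += e.pauseSecs;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Event> events = scan(SAMPLE);
        for (Event e : events) {
            System.out.printf("uptime %.3fs: Metaspace %dK, Full GC took %.4fs%n",
                    e.uptimeSecs, e.metaspaceUsedKb, e.pauseSecs);
        }
        System.out.printf("%d Full GCs, %.4fs in total%n", events.size(), totalPauseSecs(events));
    }
}
```

Each listed event shows the Metaspace usage at which the threshold was crossed, which makes it easy to see whether a larger initial size would have avoided the collection.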
 

The Metaspace eventually grows to more than 100MB. 
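
One way to check this figure on a running JVM is the standard {{java.lang.management}} API. A minimal sketch, assuming a HotSpot JVM that reports a memory pool named "Metaspace"; the class name {{MetaspaceProbe}} is hypothetical and other JVMs may expose no such pool.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

/** Hypothetical helper, not part of Flink: reads the current Metaspace usage of this JVM. */
final class MetaspaceProbe {

    /** Returns the bytes used by the "Metaspace" pool, or -1 if the JVM reports no such pool. */
    static long metaspaceUsedBytes() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            // Assumption: HotSpot names the pool exactly "Metaspace".
            if ("Metaspace".equals(pool.getName())) {
                MemoryUsage usage = pool.getUsage();
                return usage == null ? -1L : usage.getUsed();
            }
        }
        return -1L;
    }

    public static void main(String[] args) {
        long used = metaspaceUsedBytes();
        if (used < 0) {
            System.out.println("No Metaspace pool reported by this JVM");
        } else {
            System.out.printf("Metaspace used: %.1f MiB%n", used / (1024.0 * 1024.0));
        }
    }
}
```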
h2. *Solution:* 

If we configure the initial Metaspace size by default, the number of Full GCs 
triggered by Metaspace growth during Flink job startup is reduced.
 * Add *jobmanager.memory.jvm-init-metaspace.size* to JobManagerOptions.
 * Add *taskmanager.memory.jvm-init-metaspace.size* to TaskManagerOptions.
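
As a rough illustration of how the two proposed options could be translated into the JVM flag, here is a self-contained sketch. It is not Flink's implementation: the class {{InitMetaspaceFlag}}, the method {{jvmFlag}}, and the use of a plain {{Map}} as configuration are assumptions made for the example; only the two key names come from the proposal above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch, not Flink code: maps the proposed options to -XX:MetaspaceSize. */
final class InitMetaspaceFlag {

    // Key names as proposed in this issue.
    static final String JM_KEY = "jobmanager.memory.jvm-init-metaspace.size";
    static final String TM_KEY = "taskmanager.memory.jvm-init-metaspace.size";

    /**
     * Returns the JVM flag for the given key, or empty if the key is unset,
     * in which case the JVM default stays in effect.
     */
    static Optional<String> jvmFlag(Map<String, String> conf, String key) {
        String value = conf.get(key);
        if (value == null || value.trim().isEmpty()) {
            return Optional.empty();
        }
        String size = value.trim();
        // Accept a plain byte count or a k/m/g suffix, as the JVM does for size flags.
        if (!size.matches("\\d+[kKmMgG]?")) {
            throw new IllegalArgumentException("Invalid size for " + key + ": " + size);
        }
        return Optional.of("-XX:MetaspaceSize=" + size);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(JM_KEY, "128m");
        // Only the JobManager key is set, so only the JobManager gets a flag.
        System.out.println("JobManager:  " + jvmFlag(conf, JM_KEY).orElse("(JVM default)"));
        System.out.println("TaskManager: " + jvmFlag(conf, TM_KEY).orElse("(JVM default)"));
    }
}
```

Leaving the flag out when the option is unset keeps the current behavior for users who do not opt in; whether the option should instead carry a non-empty default is the question this issue raises.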

 
{code:java}
OpenJDK 64-Bit Server VM (25.362-b09) for linux-amd64 JRE (1.8.0_362-ByteOpenJDK-b09), built on Feb 20 2023 09:42:31 by "root" with gcc 8.3.0
Memory: 4k page, physical 4018480k(137636k free), swap 0k(0k free)
CommandLine flags: -XX:CompressedClassSpaceSize=260046848 -XX:GCLogFileSize=104857600 -XX:InitialHeapSize=3368026112 -XX:MaxHeapSize=3368026112 -XX:MaxMetaspaceSize=268435456 -XX:MetaspaceSize=134217728 -XX:NumberOfGCLogFiles=5 -XX:+PrintGC -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseGCLogFileRotation
{code}
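
The JVM prints these sizes in bytes: 134217728 bytes is 128 MiB ({{MetaspaceSize}}) and 268435456 bytes is 256 MiB ({{MaxMetaspaceSize}}). The conversion can be sketched as follows; the class {{JvmFlagSizes}} and its methods are hypothetical names used only for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of Flink: reads byte-valued -XX flags from a JVM flags line. */
final class JvmFlagSizes {

    // Matches -XX:Name=<digits> only; boolean flags (-XX:+Foo) and suffixed values (128m) are skipped.
    private static final Pattern NUMERIC_FLAG = Pattern.compile("-XX:(\\w+)=(\\d+)\\b");

    /** Returns flag name to numeric value, in the order the flags appear. */
    static Map<String, Long> numericFlags(String commandLine) {
        Map<String, Long> flags = new LinkedHashMap<>();
        Matcher m = NUMERIC_FLAG.matcher(commandLine);
        while (m.find()) {
            flags.put(m.group(1), Long.parseLong(m.group(2)));
        }
        return flags;
    }

    /** Converts bytes to whole mebibytes (integer division). */
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // A subset of the flags line from this issue.
        String flags = "-XX:CompressedClassSpaceSize=260046848 -XX:MaxMetaspaceSize=268435456 "
                + "-XX:MetaspaceSize=134217728 -XX:+PrintGC";
        Map<String, Long> parsed = numericFlags(flags);
        System.out.println("MetaspaceSize    = " + toMiB(parsed.get("MetaspaceSize")) + " MiB");
        System.out.println("MaxMetaspaceSize = " + toMiB(parsed.get("MaxMetaspaceSize")) + " MiB");
    }
}
```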
 



> Configure Init Metaspace size by default
> ----------------------------------------
>
>                 Key: FLINK-36394
>                 URL: https://issues.apache.org/jira/browse/FLINK-36394
>             Project: Flink
>          Issue Type: Bug
>          Components: API / Core
>    Affects Versions: 1.16.0, 1.17.2
>            Reporter: JieTan
>            Priority: Major
>         Attachments: JVMMetaspace.png, JobManagerJVMOptions.png, 
> TaskManagerJVMOptions.png
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
