[ 
https://issues.apache.org/jira/browse/FLINK-23357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17379155#comment-17379155
 ] 

Borland Won edited comment on FLINK-23357 at 7/12/21, 11:59 AM:
----------------------------------------------------------------

And these are the objects that hold a reference to the classloader (50 of 1724 entries):

class com.fasterxml.jackson.annotation.JsonIgnore @ 0x63c0f6528
class com.fasterxml.jackson.core.io.NumberOutput @ 0x63c0f6668
class com.fasterxml.jackson.databind.JsonNode @ 0x63c0f77c8
class com.fasterxml.jackson.databind.deser.impl.MethodProperty @ 0x63c0f7968
class com.alibaba.druid.mock.MockDriverMBean @ 0x63c123668
org.apache.flink.util.ChildFirstClassLoader @ 0x63c123748
class org.apache.ibatis.javassist.CtArray @ 0x63c129040
class org.apache.ibatis.javassist.CtClass @ 0x63c1290b8
class org.apache.ibatis.javassist.NotFoundException @ 0x63c1291a8
class org.apache.ibatis.javassist.ClassPool @ 0x63c129228
class org.apache.ibatis.ognl.OgnlRuntime$ClassPropertyMethodCache @ 0x63c1292b0
java.security.ProtectionDomain @ 0x63c133ea0
class org.apache.ibatis.ognl.ObjectArrayPool @ 0x63c1b4770
class org.apache.ibatis.ognl.EvaluationPool @ 0x63c1b47d8
class org.apache.ibatis.ognl.internal.Entry @ 0x63c1b4840
class org.apache.ibatis.ognl.internal.ClassCacheImpl @ 0x63c1b48b8
class org.apache.ibatis.ognl.AccessibleObjectHandlerPreJDK9 @ 0x63c1b4928
class org.apache.ibatis.ognl.AccessibleObjectHandler @ 0x63c1b4990
class org.apache.ibatis.ognl.ClassResolver @ 0x63c1b4a08
class org.apache.ibatis.ognl.NoSuchPropertyException @ 0x63c1b4a80
class org.apache.ibatis.ognl.OgnlInvokePermission @ 0x63c1b4af8
class org.apache.ibatis.ognl.MethodAccessor @ 0x63c1b4b70
class org.apache.ibatis.ognl.NullHandler @ 0x63c1b4be8
class org.apache.ibatis.ognl.ElementsAccessor @ 0x63c1b4c60
class org.apache.ibatis.ognl.enhance.OgnlExpressionCompiler @ 0x63c1b4cd8
class org.apache.ibatis.ognl.internal.ClassCache @ 0x63c1b4da0
class org.apache.ibatis.ognl.MethodFailedException @ 0x63c1b4e18
class org.apache.ibatis.ognl.OgnlException @ 0x63c1b4e90
class org.apache.ibatis.ognl.OgnlRuntime @ 0x63c1b4f10
class org.apache.ibatis.scripting.xmltags.DynamicContext$ContextAccessor @ 0x63c1bc3d0
class org.apache.ibatis.scripting.xmltags.DynamicContext$ContextMap @ 0x63c1bc438
class org.apache.ibatis.ognl.PropertyAccessor @ 0x63c1bc4b8
class org.apache.ibatis.scripting.xmltags.DynamicContext @ 0x63c1bc530
class org.apache.ibatis.scripting.defaults.RawSqlSource @ 0x63c2045f0
class org.apache.ibatis.scripting.xmltags.MixedSqlNode @ 0x63c204658
class org.apache.ibatis.scripting.xmltags.StaticTextSqlNode @ 0x63c2046c0
class org.apache.ibatis.scripting.xmltags.TextSqlNode$DynamicCheckerTokenParser @ 0x63c204728
class org.apache.ibatis.scripting.xmltags.TextSqlNode @ 0x63c204790
class org.apache.ibatis.scripting.xmltags.XMLScriptBuilder$BindHandler @ 0x63c2047f8
class org.apache.ibatis.scripting.xmltags.XMLScriptBuilder$OtherwiseHandler @ 0x63c204860
class org.apache.ibatis.scripting.xmltags.XMLScriptBuilder$ChooseHandler @ 0x63c2048c8
class org.apache.ibatis.scripting.xmltags.XMLScriptBuilder$IfHandler @ 0x63c204930
class org.apache.ibatis.scripting.xmltags.XMLScriptBuilder$ForEachHandler @ 0x63c204998
class org.apache.ibatis.scripting.xmltags.XMLScriptBuilder$SetHandler @ 0x63c204a00
class org.apache.ibatis.scripting.xmltags.XMLScriptBuilder$WhereHandler @ 0x63c204a68
class org.apache.ibatis.scripting.xmltags.XMLScriptBuilder$TrimHandler @ 0x63c204ad0
class org.apache.ibatis.scripting.xmltags.XMLScriptBuilder$NodeHandler @ 0x63c204b38
class org.apache.ibatis.scripting.xmltags.XMLScriptBuilder @ 0x63c204bb0
class org.apache.ibatis.executor.keygen.NoKeyGenerator @ 0x63c204c18
class org.apache.ibatis.builder.xml.XMLIncludeTransformer @ 0x63c204c98
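
With 1724 entries in total, it can help to group the holders by package before reading them one by one. The following is a minimal sketch, assuming the listing has been saved as plain text with one `class <name> @ <address>` entry per line (wrapped lines joined first). The class name `HolderSummary` and the `depth` parameter are made up for illustration and are not part of Flink or of any heap-analysis tool.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class HolderSummary {

    /** Returns the first `depth` package segments of a fully qualified class name. */
    static String packagePrefix(String className, int depth) {
        String[] parts = className.split("\\.");
        // Never include the last segment, which is the simple class name.
        int n = Math.min(depth, parts.length - 1);
        return String.join(".", Arrays.copyOfRange(parts, 0, n));
    }

    /** Counts entries per package prefix; lines that do not start with "class " are skipped. */
    static Map<String, Integer> countByPackage(List<String> lines, int depth) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (!trimmed.startsWith("class ")) {
                continue; // e.g. the ChildFirstClassLoader or ProtectionDomain instances
            }
            // "class a.b.C @ 0x..." -> "a.b.C"
            String name = trimmed.substring("class ".length()).split("\\s+")[0];
            counts.merge(packagePrefix(name, depth), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // A few entries copied from the listing above.
        List<String> sample = Arrays.asList(
            "class com.fasterxml.jackson.databind.JsonNode @ 0x63c0f77c8",
            "class com.alibaba.druid.mock.MockDriverMBean @ 0x63c123668",
            "org.apache.flink.util.ChildFirstClassLoader @ 0x63c123748",
            "class org.apache.ibatis.ognl.OgnlRuntime @ 0x63c1b4f10",
            "class org.apache.ibatis.javassist.CtClass @ 0x63c1290b8");
        countByPackage(sample, 3).forEach((pkg, n) -> System.out.println(n + "  " + pkg));
    }
}
```

Note that every class defined by a classloader refers back to it, so a long list of classes is expected. The summary only shows which libraries live in the user-code classloader; the path to the GC roots is still needed to find what keeps the loader itself alive.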


was (Author: borland):
and this is who holds the reference of the classloader (50 of 1724 entries)

!image-2021-07-12-19-56-38-528.png!

> jobmanager metaspace oom
> ------------------------
>
>                 Key: FLINK-23357
>                 URL: https://issues.apache.org/jira/browse/FLINK-23357
>             Project: Flink
>          Issue Type: Bug
>    Affects Versions: 1.12.2
>            Reporter: Borland Won
>            Priority: Major
>         Attachments: image-2021-07-12-16-57-55-256.png, 
> image-2021-07-12-17-06-17-218.png, image-2021-07-12-19-20-13-742.png, 
> image-2021-07-12-19-30-38-245.png, path to gc roots.png
>
>
> *Flink Version: 1.12.2*
> Hi. I created a standalone HA cluster (with 3 TaskManagers and 2 JobManagers)
> and repeatedly submitted new jobs to the cluster and cancelled old jobs via the
> REST API. The metaspace usage of the leading JobManager then kept increasing.
>   !image-2021-07-12-16-57-55-256.png!
> Soon it runs out of memory and throws the exception below:
> 2021-06-21 15:44:06,637 ERROR org.apache.flink.runtime.webmonitor.handlers.JarRunHandler   [] - Unhandled exception.
> java.util.concurrent.CompletionException: org.apache.flink.client.program.ProgramInvocationException: The program's entry point class 'xxx.xxx.xxx.XXXBootstrap' caused an exception during initialization: Metaspace. The metaspace out-of-memory error has occurred. This can mean two things: either Flink Master requires a larger size of JVM metaspace to load classes or there is a class loading leak. In the first case 'jobmanager.memory.jvm-metaspace.size' configuration option should be increased. If the error persists (usually in cluster after several job (re-)submissions) then there is probably a class loading leak in user code or some of its dependencies which has to be investigated and fixed. The Flink Master has to be shutdown...
>     at org.apache.flink.runtime.webmonitor.handlers.utils.JarHandlerUtils$JarHandlerContext.toPackagedProgram(JarHandlerUtils.java:184) ~[flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.runtime.webmonitor.handlers.utils.JarHandlerUtils$JarHandlerContext.applyToConfiguration(JarHandlerUtils.java:141) ~[flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.runtime.webmonitor.handlers.JarRunHandler.handleRequest(JarRunHandler.java:95) ~[flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.runtime.webmonitor.handlers.JarRunHandler.handleRequest(JarRunHandler.java:53) ~[flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.runtime.rest.handler.AbstractRestHandler.respondToRequest(AbstractRestHandler.java:83) ~[flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.runtime.rest.handler.AbstractHandler.respondAsLeader(AbstractHandler.java:195) ~[flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.runtime.rest.handler.LeaderRetrievalHandler.lambda$channelRead0$0(LeaderRetrievalHandler.java:83) ~[flink-dist_2.11-1.12.2.jar:1.12.2]
>     at java.util.Optional.ifPresent(Optional.java:159) [?:1.8.0_292]
>     at org.apache.flink.util.OptionalConsumer.ifPresent(OptionalConsumer.java:45) [flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.runtime.rest.handler.LeaderRetrievalHandler.channelRead0(LeaderRetrievalHandler.java:80) [flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.runtime.rest.handler.LeaderRetrievalHandler.channelRead0(LeaderRetrievalHandler.java:49) [flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.shaded.netty4.io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:99) [flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.runtime.rest.handler.router.RouterHandler.routed(RouterHandler.java:115) [flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.runtime.rest.handler.router.RouterHandler.channelRead0(RouterHandler.java:94) [flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.runtime.rest.handler.router.RouterHandler.channelRead0(RouterHandler.java:55) [flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.shaded.netty4.io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:99) [flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.shaded.netty4.io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103) [flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.runtime.rest.FileUploadHandler.channelRead0(FileUploadHandler.java:208) [flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.runtime.rest.FileUploadHandler.channelRead0(FileUploadHandler.java:69) [flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.shaded.netty4.io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:99) [flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.shaded.netty4.io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelRead(CombinedChannelDuplexHandler.java:436) [flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.shaded.netty4.io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:324) [flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.shaded.netty4.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:296) [flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.shaded.netty4.io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:251) [flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.shaded.netty4.io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) [flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.shaded.netty4.io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) [flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163) [flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:714) [flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650) [flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576) [flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493) [flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) [flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.shaded.netty4.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [flink-dist_2.11-1.12.2.jar:1.12.2]
>     at java.lang.Thread.run(Thread.java:748) [?:1.8.0_292]
> Caused by: org.apache.flink.client.program.ProgramInvocationException: The program's entry point class 'xxx.xxx.xxx.XXXBootstrap' caused an exception during initialization: Metaspace. The metaspace out-of-memory error has occurred. This can mean two things: either Flink Master requires a larger size of JVM metaspace to load classes or there is a class loading leak. In the first case 'jobmanager.memory.jvm-metaspace.size' configuration option should be increased. If the error persists (usually in cluster after several job (re-)submissions) then there is probably a class loading leak in user code or some of its dependencies which has to be investigated and fixed. The Flink Master has to be shutdown...
>     at org.apache.flink.client.program.PackagedProgram.loadMainClass(PackagedProgram.java:497) ~[flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.client.program.PackagedProgram.<init>(PackagedProgram.java:152) ~[flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.client.program.PackagedProgram.<init>(PackagedProgram.java:64) ~[flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.client.program.PackagedProgram$Builder.build(PackagedProgram.java:685) ~[flink-dist_2.11-1.12.2.jar:1.12.2]
>     at org.apache.flink.runtime.webmonitor.handlers.utils.JarHandlerUtils$JarHandlerContext.toPackagedProgram(JarHandlerUtils.java:182) ~[flink-dist_2.11-1.12.2.jar:1.12.2]
>     ... 50 more
> Caused by: java.lang.OutOfMemoryError: Metaspace. The metaspace out-of-memory error has occurred. This can mean two things: either Flink Master requires a larger size of JVM metaspace to load classes or there is a class loading leak. In the first case 'jobmanager.memory.jvm-metaspace.size' configuration option should be increased. If the error persists (usually in cluster after several job (re-)submissions) then there is probably a class loading leak in user code or some of its dependencies which has to be investigated and fixed. The Flink Master has to be shutdown...
>  
>  
>  
> I know that "there is probably a class loading leak in user code or some of its
> dependencies which has to be investigated and fixed", but I don't know how to
> identify it.
> I ran jmap against the jobmanager process that was approaching OOM. Bootstrap is
> my entry class with the main method. Why were it and the other function classes
> loaded 103 times?
> !image-2021-07-12-17-06-17-218.png!
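
Each job submission loads the user jar through a fresh classloader, so the same class name appears once per loader that is still reachable. One commonly reported cause of such loaders staying reachable is a JDBC driver that user code registered in `java.sql.DriverManager`, which lives outside the user-code classloader. Whether that applies to this dump is an assumption, not something the listing confirms. The sketch below uses only JDK calls; the class name `DriverCleanup` and the method `deregisterDriversLoadedBy` are hypothetical helpers, and user code would have to call them itself when the job shuts down.

```java
import java.sql.Driver;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Collections;

public class DriverCleanup {

    /**
     * Deregisters every JDBC driver whose class was defined by the given loader
     * and returns how many were removed.
     */
    static int deregisterDriversLoadedBy(ClassLoader loader) throws SQLException {
        int removed = 0;
        // Collections.list copies the enumeration, so deregistering inside the loop is safe.
        for (Driver driver : Collections.list(DriverManager.getDrivers())) {
            if (driver.getClass().getClassLoader() == loader) {
                DriverManager.deregisterDriver(driver);
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) throws SQLException {
        // Assumption: the context classloader is the one that loaded the user code.
        ClassLoader userLoader = Thread.currentThread().getContextClassLoader();
        System.out.println("deregistered " + deregisterDriversLoadedBy(userLoader) + " driver(s)");
    }
}
```

`DriverManager.getDrivers()` only returns drivers that the caller's classloader can see, so the helper has to run from code loaded by the user-code classloader. Lingering threads and static caches in libraries can pin a classloader in the same way and are not covered by this sketch.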



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
