Also, that unhandled exception is printed in the job manager logs.

On Tue., Jan. 18, 2022, 10:11 a.m. John Smith, <java.dev....@gmail.com> wrote:
> I actually mean the job manager. I run a total of three job managers for
> HA.
>
> For example, if I click "Running Jobs", it displays light grey boxes for a
> while and then the top right corner shows an internal server error. But if
> I refresh afterwards, it's OK and I see the list. It seems to happen on the
> non-leader job manager UIs. Looking at the UI metrics for the job manager
> doesn't seem to indicate anything out of the ordinary.
>
> Is there a Java 8 command-line tool that can hook into the live PID and
> display GC stats?
>
> On Mon., Jan. 17, 2022, 12:06 a.m. Caizhi Weng, <tsreape...@gmail.com> wrote:
>
>> Hi!
>>
>> "When I use the UI and just navigate from one jobmanager ui to the
>> other", I guess you mean task managers instead of job managers (there is
>> only one job manager per web UI).
>>
>> This long pause usually indicates that the component you're jumping to
>> (whether a task manager or the job manager) is so busy that it can't
>> handle web requests. This is not a healthy state and you should look into
>> it. The most probable cause is heavy GC. To determine this, you can print
>> GC details to the log and see if there are long GC pauses.
>>
>> John Smith <java.dev....@gmail.com> wrote on Sat, Jan 15, 2022 at 09:53:
>>
>>> Hi, using 1.14.2, running 3 job nodes.
>>>
>>> Everything seems to work OK.
>>>
>>> When I use the UI and just navigate from one jobmanager UI to the other,
>>> it sometimes seems to take long or time out, and I see an "Internal Server
>>> Error" popup in the top right messages.
>>>
>>> Looking at the logs I see this, but I'm not sure it's related...
>>>
>>> 2022-01-15 00:04:44,925 ERROR org.apache.flink.runtime.rest.handler.job.JobExceptionsHandler [] - Unhandled exception.
>>> java.util.concurrent.CancellationException: null
>>>     at java.util.concurrent.CompletableFuture.cancel(CompletableFuture.java:2276) ~[?:1.8.0_312]
>>>     at org.apache.flink.runtime.rest.handler.legacy.DefaultExecutionGraphCache.getExecutionGraphInternal(DefaultExecutionGraphCache.java:98) ~[flink-dist_2.12-1.14.2.jar:1.14.2]
>>>     at org.apache.flink.runtime.rest.handler.legacy.DefaultExecutionGraphCache.getExecutionGraphInfo(DefaultExecutionGraphCache.java:67) ~[flink-dist_2.12-1.14.2.jar:1.14.2]
>>>     at org.apache.flink.runtime.rest.handler.job.AbstractExecutionGraphHandler.handleRequest(AbstractExecutionGraphHandler.java:81) ~[flink-dist_2.12-1.14.2.jar:1.14.2]
>>>     at org.apache.flink.runtime.rest.handler.AbstractRestHandler.respondToRequest(AbstractRestHandler.java:83) ~[flink-dist_2.12-1.14.2.jar:1.14.2]
>>>     at org.apache.flink.runtime.rest.handler.AbstractHandler.respondAsLeader(AbstractHandler.java:195) ~[flink-dist_2.12-1.14.2.jar:1.14.2]
>>>     at org.apache.flink.runtime.rest.handler.LeaderRetrievalHandler.lambda$channelRead0$0(LeaderRetrievalHandler.java:83) ~[flink-dist_2.12-1.14.2.jar:1.14.2]
>>>     at java.util.Optional.ifPresent(Optional.java:159) [?:1.8.0_312]
>>>     at org.apache.flink.util.OptionalConsumer.ifPresent(OptionalConsumer.java:45) [flink-dist_2.12-1.14.2.jar:1.14.2]
>>>     at org.apache.flink.runtime.rest.handler.LeaderRetrievalHandler.channelRead0(LeaderRetrievalHandler.java:80) [flink-dist_2.12-1.14.2.jar:1.14.2]
>>>     at org.apache.flink.runtime.rest.handler.LeaderRetrievalHandler.channelRead0(LeaderRetrievalHandler.java:49) [flink-dist_2.12-1.14.2.jar:1.14.2]
>>>     at org.apache.flink.shaded.netty4.io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:99) [flink-dist_2.12-1.14.2.jar:1.14.2]
>>>     at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [flink-dist_2.12-1.14.2.jar:1.14.2]
>>>     at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [flink-dist_2.12-1.14.2.jar:1.14.2]
>>>     at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [flink-dist_2.12-1.14.2.jar:1.14.2]
>>>     at org.apache.flink.runtime.rest.handler.router.RouterHandler.routed(RouterHandler.java:115) [flink-dist_2.12-1.14.2.jar:1.14.2]
>>>     at org.apache.flink.runtime.rest.handler.router.RouterHandler.channelRead0(RouterHandler.java:94) [flink-dist_2.12-1.14.2.jar:1.14.2]
>>>     at org.apache.flink.runtime.rest.handler.router.RouterHandler.channelRead0(RouterHandler.java:55) [flink-dist_2.12-1.14.2.jar:1.14.2]
>>>     at org.apache.flink.shaded.netty4.io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:99) [flink-dist_2.12-1.14.2.jar:1.14.2]
>>>     at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [flink-dist_2.12-1.14.2.jar:1.14.2]
>>>     at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [flink-dist_2.12-1.14.2.jar:1.14.2]
>>>     at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [flink-dist_2.12-1.14.2.jar:1.14.2]
>>>     at org.apache.flink.shaded.netty4.io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103) [flink-dist_2.12-1.14.2.jar:1.14.2]
>>>     at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [flink-dist_2.12-1.14.2.jar:1.14.2]
>>>     at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [flink-dist_2.12-1.14.2.jar:1.14.2]
>>>     at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [flink-dist_2.12-1.14.2.jar:1.14.2]
>>>     at org.apache.flink.runtime.rest.FileUploadHandler.channelRead0(FileUploadHandler.java:238) [flink-dist_2.12-1.14.2.jar:1.14.2]
>>>     at org.apache.flink.runtime.rest.FileUploadHandler.channelRead0(FileUploadHandler.java:71) [flink-dist_2.12-1.14.2.jar:1.14.2]
>>>     at org.apache.flink.shaded.netty4.io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:99) [flink-dist_2.12-1.14.2.jar:1.14.2]
>>>     at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [flink-dist_2.12-1.14.2.jar:1.14.2]
>>>     at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [flink-dist_2.12-1.14.2.jar:1.14.2]
>>>     at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [flink-dist_2.12-1.14.2.jar:1.14.2]
>>>     at org.apache.flink.shaded.netty4.io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelRead(CombinedChannelDuplexHandler.java:436) [flink-dist_2.12-1.14.2.jar:1.14.2]
>>>     at org.apache.flink.shaded.netty4.io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:324) [flink-dist_2.12-1.14.2.jar:1.14.2]
>>>     at org.apache.flink.shaded.netty4.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:296) [flink-dist_2.12-1.14.2.jar:1.14.2]
>>>     at org.apache.flink.shaded.netty4.io.netty.channel.CombinedChannelDuplexHandler.channelRead(CombinedChannelDuplexHandler.java:251) [flink-dist_2.12-1.14.2.jar:1.14.2]
>>>     at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [flink-dist_2.12-1.14.2.jar:1.14.2]
>>>     at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [flink-dist_2.12-1.14.2.jar:1.14.2]
>>>     at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) [flink-dist_2.12-1.14.2.jar:1.14.2]
>>>     at org.apache.flink.shaded.netty4.io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) [flink-dist_2.12-1.14.2.jar:1.14.2]
>>>     at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) [flink-dist_2.12-1.14.2.jar:1.14.2]
>>>     at org.apache.flink.shaded.netty4.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) [flink-dist_2.12-1.14.2.jar:1.14.2]
>>>     at org.apache.flink.shaded.netty4.io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) [flink-dist_2.12-1.14.2.jar:1.14.2]
>>>     at org.apache.flink.shaded.netty4.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166) [flink-dist_2.12-1.14.2.jar:1.14.2]
>>>     at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:719) [flink-dist_2.12-1.14.2.jar:1.14.2]
>>>     at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:655) [flink-dist_2.12-1.14.2.jar:1.14.2]
>>>     at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:581) [flink-dist_2.12-1.14.2.jar:1.14.2]
>>>     at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493) [flink-dist_2.12-1.14.2.jar:1.14.2]
>>>     at org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) [flink-dist_2.12-1.14.2.jar:1.14.2]
>>>     at org.apache.flink.shaded.netty4.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [flink-dist_2.12-1.14.2.jar:1.14.2]
>>>     at java.lang.Thread.run(Thread.java:748) [?:1.8.0_312]
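
On the question in this thread about a Java 8 tool that can attach to a live PID: the JDK ships `jstat`, and `jstat -gcutil <pid> 1000` prints heap occupancy plus cumulative GC counts and times once per second. For the GC-log route suggested in the reply, the Java 8 flags `-XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:<file>` can be passed to the job manager JVM (for example through `env.java.opts.jobmanager` in flink-conf.yaml). Below is a minimal sketch of one way to scan such a log for long pauses. It assumes each collection line carries the Java 8 `[Times: user=... sys=..., real=... secs]` suffix. The class name `GcPauseScan`, the 1-second default threshold, and the command-line arguments are hypothetical and not part of Flink or the JDK.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: lists GC log lines whose wall-clock pause exceeds a threshold. */
final class GcPauseScan {

    // Captures the wall-clock time from a suffix such as
    // "[Times: user=0.03 sys=0.01, real=1.52 secs]" (assumed Java 8 -XX:+PrintGCDetails format).
    private static final Pattern REAL_TIME =
            Pattern.compile("real=(\\d+(?:\\.\\d+)?) secs");

    /** Returns the lines whose "real=" time is at least thresholdSeconds. */
    static List<String> longPauses(List<String> lines, double thresholdSeconds) {
        List<String> hits = new ArrayList<>();
        for (String line : lines) {
            Matcher m = REAL_TIME.matcher(line);
            // Lines without a "real=" field (headers, partial lines) are skipped.
            if (m.find() && Double.parseDouble(m.group(1)) >= thresholdSeconds) {
                hits.add(line);
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: java GcPauseScan <gc-log-file> [threshold-seconds]");
            return;
        }
        // Default threshold of 1 second is an arbitrary choice for illustration.
        double threshold = args.length > 1 ? Double.parseDouble(args[1]) : 1.0;
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        for (String hit : longPauses(lines, threshold)) {
            System.out.println(hit);
        }
    }
}
```

After compiling with `javac GcPauseScan.java`, running `java GcPauseScan <gc-log-file> 0.5` would print every logged collection that paused for half a second or more. If nothing shows up while the UI is timing out, GC is probably not the cause and the non-leader job managers deserve a closer look.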