I'd suggest checking the GC behavior of the JM.
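
As a rough sketch of what "checking GC" could look like: the snippet below reads the JVM's standard `GarbageCollectorMXBean`s and estimates what share of a sampling window went to garbage collection. The class name `GcSnapshot`, the helper names, and the 1-second window are illustrative assumptions, not anything from Flink. It measures the JVM it runs in, so to say anything about the JM it would have to run inside the JM process or read the same beans over JMX. Accumulated collection time is also only a proxy for stop-the-world pauses.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcSnapshot {

    /** Sums accumulated GC time (ms) over all collectors of the current JVM. */
    public static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** Fraction of a wall-clock window spent in GC, given two totalGcMillis() samples. */
    public static double gcFraction(long gcMsStart, long gcMsEnd, long windowMs) {
        if (windowMs <= 0) {
            throw new IllegalArgumentException("windowMs must be positive");
        }
        return (double) (gcMsEnd - gcMsStart) / windowMs;
    }

    public static void main(String[] args) throws InterruptedException {
        long windowMs = 1000; // arbitrary sampling window, for illustration only
        long before = totalGcMillis();
        Thread.sleep(windowMs);
        long after = totalGcMillis();
        System.out.printf("GC time in window: %d ms (%.1f%% of wall clock)%n",
                after - before, 100.0 * gcFraction(before, after, windowMs));
    }
}
```

If the GC share is high, or the JM's GC log shows individual pauses close to the HA session or heartbeat timeout, that would be consistent with the JM losing leadership, but it does not prove it.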

On Mon, Jul 1, 2024 at 17:18, wjw_bigd...@163.com <wjw_bigd...@163.com> wrote:

> Unsubscribe
>
>
>
> ---- Original message ----
> | From | wjw_bigd...@163.com |
> | Date | 2024-07-01 17:13 |
> | To | user-zh<user-zh@flink.apache.org> |
> | Cc | |
> | Subject | Re: This definitely counts as a bug |
> Unsubscribe
>
>
>
> ---- Original message ----
> | From | 星海<2278179...@qq.com.INVALID> |
> | Date | 2024-06-29 21:31 |
> | To | user-zh<user-zh@flink.apache.org> |
> | Cc | |
> | Subject | Re: This definitely counts as a bug |
> Unsubscribe
>
>
> ------------------ Original message ------------------
> From: "user-zh" <cfso3...@126.com>;
> Sent: Saturday, June 29, 2024, 8:24 PM
> To: "user-zh" <user-zh@flink.apache.org>;
>
> Subject: Re: This definitely counts as a bug
>
>
>
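> The connection is fine. The main thing is that the TM is constantly processing the write stream. I also checked the load and it is not actually high, but the TM simply does not respond, which leads to a timeout, and then the error from my original message!
> Sent from my iPhone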
>
> > On Jun 29, 2024, at 16:49, Zhanghao Chen <zhanghao.c...@outlook.com> wrote:
> >
> > Hi, judging from the error, the JM lost leadership, which caused all tasks on the TM to be shut down. Could you check whether there is a problem with the HA connection on the JM side?
> >
> > Best,
> > Zhanghao Chen
> > ________________________________
> > From: Cuixb <cfso3...@126.com>
> > Sent: Saturday, June 29, 2024 10:31
> > To: user-zh@flink.apache.org <user-zh@flink.apache.org>
> > Subject: This definitely counts as a bug
> >
> > Production environment, Flink 1.16.2
> >
> > 2024-06-29 09:17:23
> > java.lang.Exception: Job leader for job id 8ccdd299194a686e3ecda602c3c75bf3 lost leadership.
> >     at org.apache.flink.runtime.taskexecutor.TaskExecutor$JobLeaderListenerImpl.lambda$null$2(TaskExecutor.java:2310)
> >     at java.util.Optional.ifPresent(Optional.java:159)
> >     at org.apache.flink.runtime.taskexecutor.TaskExecutor$JobLeaderListenerImpl.lambda$jobManagerLostLeadership$3(TaskExecutor.java:2308)
> >     at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.lambda$handleRunAsync$4(AkkaRpcActor.java:453)
> >     at org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:68)
> >     at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:453)
> >     at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:218)
> >     at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:168)
> >     at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:24)
> >     at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:20)
> >     at scala.PartialFunction.applyOrElse(PartialFunction.scala:123)
> >     at scala.PartialFunction.applyOrElse$(PartialFunction.scala:122)
> >     at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:20)
> >     at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
> >     at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)
> >     at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)
> >     at akka.actor.Actor.aroundReceive(Actor.scala:537)
> >     at akka.actor.Actor.aroundReceive$(Actor.scala:535)
> >     at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:220)
> >     at akka.actor.ActorCell.receiveMessage(ActorCell.scala:580)
> >     at akka.actor.ActorCell.invoke(ActorCell.scala:548)
> >     at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:270)
> >     at akka.dispatch.Mailbox.run(Mailbox.scala:231)
> >     at akka.dispatch.Mailbox.exec(Mailbox.scala:243)
> >     at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
> >     at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(For
> >
> > Sent from my iPhone
