Re: FLINK-14316 happens on version 1.13.2

2021-09-14 Thread Xiangyu Su
13.2]
>> at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:212) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>> at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:305) ~[flink-dist_2.11-1.13.2

Re: FLINK-14316 happens on version 1.13.2

2021-09-07 Thread Matthias Pohl
ecutionState(JobMaster.java:441)
> ~[flink-dist_2.11-1.13.2.jar:1.13.2]
>
> Thanks for your support.
> Best Regards,
>
> On Thu, 2 Sept 2021 at 16:43, Yun Gao wrote:
>
>> Hi Xiangyu,
>>
>> There might be different reasons for the "Job Leader... lost leaders

Re: FLINK-14316 happens on version 1.13.2

2021-09-03 Thread Xiangyu Su
-Original Mail ------
> *Sender:* Xiangyu Su
> *Send Date:* Wed Sep 1 15:31:03 2021
> *Recipients:* user
> *Subject:* FLINK-14316 happens on version 1.13.2
>
>> Hello Everyone,
>> We upgrade flink to 1.13.2, and we were facing randomly the "Job leader
>>

Re: FLINK-14316 happens on version 1.13.2

2021-09-02 Thread Yun Gao
Hi Xiangyu, There might be different reasons for the "Job Leader... lost leadership" problem. Do you see the errors in the TM log? If so, the root cause might be that the connection between the TM and ZK was lost or timed out. Have you checked the GC status on the TM side? If the GC is ok, could

FLINK-14316 happens on version 1.13.2

2021-09-01 Thread Xiangyu Su
Hello Everyone, We upgraded Flink to 1.13.2 and have been randomly hitting the "Job leader ... lost leadership" error; the job keeps restarting and failing... It behaves like this ticket: https://issues.apache.org/jira/browse/FLINK-14316 Has anybody had the same issue, or does anyone have any suggestions? Best Regards,