Forwarding the discussion back to the user mailing list.

On Thu, Sep 2, 2021 at 12:25 PM Till Rohrmann <trohrm...@apache.org> wrote:
> The stack trace looks ok. This happens whenever the leader loses
> leadership, and this can have different reasons. What's more interesting is
> what happens before and after, and what's happening on the system you use
> for HA (probably ZooKeeper). Maybe the connection to ZooKeeper is unstable
> or there is some other problem.
>
> Cheers,
> Till
>
> On Thu, Sep 2, 2021 at 12:20 PM Xiangyu Su <xian...@smaato.com> wrote:
>
>> Hi Till,
>> thank you very much for this fast reply!
>> This issue happens very randomly. I tried several times to reproduce it,
>> but it is not easy.
>> Here is the exception stack trace from the JM logs and TM logs:
>>
>> java.lang.Exception: Job leader for job id 6fd38dedbca7bf65bfa57cb306930fa9 lost leadership.
>>     at org.apache.flink.runtime.taskexecutor.TaskExecutor$JobLeaderListenerImpl.lambda$null$2(TaskExecutor.java:2189)
>>     at java.util.Optional.ifPresent(Optional.java:159)
>>     at org.apache.flink.runtime.taskexecutor.TaskExecutor$JobLeaderListenerImpl.lambda$jobManagerLostLeadership$3(TaskExecutor.java:2187)
>>     at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:440)
>>     at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:208)
>>     at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:158)
>>     at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
>>     at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
>>     at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
>>     at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
>>     at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
>>     at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
>>     at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
>>     at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
>>     at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
>>     at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
>>     at akka.actor.ActorCell.invoke(ActorCell.scala:561)
>>     at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
>>     at akka.dispatch.Mailbox.run(Mailbox.scala:225)
>>     at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
>>     at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>>     at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>>     at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>>     at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>>
>> On Thu, 2 Sept 2021 at 12:14, Till Rohrmann <trohrm...@apache.org> wrote:
>>
>>> Hi Xiangyu,
>>>
>>> Do you have the logs of the problematic test run available? Ideally, we
>>> can enable the DEBUG log level to get some more information. I think this
>>> information would be needed to figure out the problem.
>>>
>>> Cheers,
>>> Till
>>>
>>> On Thu, Sep 2, 2021 at 11:47 AM Xiangyu Su <xian...@smaato.com> wrote:
>>>
>>>> Hello Everyone,
>>>> Hello Till,
>>>> We upgraded Flink to 1.13.2 and are randomly facing the "Job leader
>>>> ... lost leadership" error; the job keeps restarting and failing.
>>>> It behaves like this ticket:
>>>> https://issues.apache.org/jira/browse/FLINK-14316
>>>>
>>>> Has anybody had the same issue, or does anyone have suggestions?
>>>>
>>>> Best Regards,
>>>>
>>>> --
>>>> Xiangyu Su
>>>> Java Developer
>>>> xian...@smaato.com
>>>>
>>>> Smaato Inc.
>>>> San Francisco - New York - Hamburg - Singapore
>>>> www.smaato.com
>>>>
>>>> Germany:
>>>> Barcastraße 5
>>>> 22087 Hamburg
>>>> Germany
>>>> M 0049(176)43330282
>>>>
>>>> The information contained in this communication may be CONFIDENTIAL and
>>>> is intended only for the use of the recipient(s) named above. If you are
>>>> not the intended recipient, you are hereby notified that any dissemination,
>>>> distribution, or copying of this communication, or any of its contents, is
>>>> strictly prohibited. If you have received this communication in error,
>>>> please notify the sender and delete/destroy the original message and any
>>>> copy of it from your computer or paper files.
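
The suggestion in the thread to look at what happens before and after the leadership loss can be approximated with a small log filter. The following is only a minimal Python sketch and not part of Flink: the function name `context_around`, the window sizes, and the log file name in the usage note are hypothetical; the only string taken from the thread is the `lost leadership` text of the exception message.

```python
from typing import Iterable, List

# Marker text copied from the exception message quoted in this thread.
MARKER = "lost leadership"


def context_around(lines: Iterable[str], marker: str = MARKER,
                   before: int = 3, after: int = 3) -> List[List[str]]:
    """Return one window of surrounding lines per line that contains `marker`.

    Each window holds up to `before` lines preceding the match, the matching
    line itself, and up to `after` lines following it.
    """
    all_lines = list(lines)
    windows = []
    for index, line in enumerate(all_lines):
        if marker in line:
            start = max(0, index - before)
            windows.append(all_lines[start:index + after + 1])
    return windows
```

For example, with a hypothetical log file name, `context_around(open("jobmanager.log"), before=20, after=20)` would return the 20 lines on either side of each leadership loss, which is where ZooKeeper connection messages would be expected to show up if the connection is the cause.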