Forwarding the discussion back to the user mailing list.

On Thu, Sep 2, 2021 at 12:25 PM Till Rohrmann <trohrm...@apache.org> wrote:

> The stack trace looks ok. This happens whenever the leader loses
> leadership and this can have different reasons. What's more interesting is
> what happens before and after and what's happening on the system you use
> for HA (probably ZooKeeper). Maybe the connection to ZooKeeper is unstable
> or there is some other problem.
>
> Cheers,
> Till
>
> On Thu, Sep 2, 2021 at 12:20 PM Xiangyu Su <xian...@smaato.com> wrote:
>
>> Hi Till,
>> Thank you very much for the fast reply!
>> This issue happens randomly. I tried several times to reproduce it, but
>> it is not easy.
>> Here is the exception stack trace from the JM and TM logs:
>>
>> java.lang.Exception: Job leader for job id 6fd38dedbca7bf65bfa57cb306930fa9 lost leadership.
>> at org.apache.flink.runtime.taskexecutor.TaskExecutor$JobLeaderListenerImpl.lambda$null$2(TaskExecutor.java:2189)
>> at java.util.Optional.ifPresent(Optional.java:159)
>> at org.apache.flink.runtime.taskexecutor.TaskExecutor$JobLeaderListenerImpl.lambda$jobManagerLostLeadership$3(TaskExecutor.java:2187)
>> at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:440)
>> at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:208)
>> at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:158)
>> at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
>> at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
>> at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
>> at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
>> at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
>> at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
>> at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
>> at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
>> at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
>> at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
>> at akka.actor.ActorCell.invoke(ActorCell.scala:561)
>> at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
>> at akka.dispatch.Mailbox.run(Mailbox.scala:225)
>> at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
>> at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>> at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>> at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>> at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>>
>> On Thu, 2 Sept 2021 at 12:14, Till Rohrmann <trohrm...@apache.org> wrote:
>>
>>> Hi Xiangyu,
>>>
>>> Do you have the logs of the problematic test run available? Ideally, we
>>> can enable the DEBUG log level to get some more information. I think this
>>> information would be needed to figure out the problem.
>>>
>>> Cheers,
>>> Till
>>>
>>> On Thu, Sep 2, 2021 at 11:47 AM Xiangyu Su <xian...@smaato.com> wrote:
>>>
>>>> Hello Everyone,
>>>> Hello Till,
>>>> We upgraded Flink to 1.13.2 and have been randomly hitting the "Job leader
>>>> ... lost leadership" error; the job keeps restarting and failing.
>>>> It behaves like this ticket:
>>>> https://issues.apache.org/jira/browse/FLINK-14316
>>>>
>>>> Has anybody had the same issue, or does anyone have suggestions?
>>>>
>>>> Best Regards,
>>>>
>>>> --
>>>> Xiangyu Su
>>>> Java Developer
>>>> xian...@smaato.com
>>>>
>>>> Smaato Inc.
>>>> San Francisco - New York - Hamburg - Singapore
>>>> www.smaato.com
>>>>
>>>> Germany:
>>>>
>>>> Barcastraße 5
>>>>
>>>> 22087 Hamburg
>>>>
>>>> Germany
>>>> M 0049(176)43330282
>>>>
>>>> The information contained in this communication may be CONFIDENTIAL and
>>>> is intended only for the use of the recipient(s) named above. If you are
>>>> not the intended recipient, you are hereby notified that any dissemination,
>>>> distribution, or copying of this communication, or any of its contents, is
>>>> strictly prohibited. If you have received this communication in error,
>>>> please notify the sender and delete/destroy the original message and any
>>>> copy of it from your computer or paper files.
>>>>
>>>
>>
>
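
Regarding the point above that what happens before and after the leadership loss is the interesting part: a minimal sketch for pulling the surrounding lines out of a JM or TM log could look like the following. This is not part of Flink. The class name LeadershipLogContext, the method contextAround, and the window size are hypothetical, and it assumes a plain-text log in which the event appears on a line containing "lost leadership", as in the stack trace above.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not a Flink API: extracts log lines around a marker.
public class LeadershipLogContext {

    /**
     * Returns every line that lies within `window` lines before or after a
     * line containing `marker`. Overlapping windows are emitted only once.
     */
    static List<String> contextAround(List<String> lines, String marker, int window) {
        List<String> out = new ArrayList<>();
        int lastEmitted = -1; // index of the last line already copied to `out`
        for (int i = 0; i < lines.size(); i++) {
            if (!lines.get(i).contains(marker)) {
                continue;
            }
            int from = Math.max(Math.max(0, i - window), lastEmitted + 1);
            int to = Math.min(lines.size() - 1, i + window);
            for (int j = from; j <= to; j++) {
                out.add(lines.get(j));
            }
            lastEmitted = Math.max(lastEmitted, to);
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.err.println("usage: LeadershipLogContext <log-file>");
            return;
        }
        // The log path is supplied by the caller; 50 lines is an arbitrary window.
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        for (String line : contextAround(lines, "lost leadership", 50)) {
            System.out.println(line);
        }
    }
}
```

Running it against both the JM and the TM log for the same time range should make it easier to see whether messages about the HA service (for example ZooKeeper connection problems) precede the leadership loss.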
