Does this happen regularly? As in, the cluster initially runs fine and around the same time-frame runs into problems?

Can you provide the full logs for the taskmanager and jobmanager?

On 29/11/2019 08:42, Eray Arslan wrote:
Hi Chesnay,
Thank you for your reply.
I worked around the issue by adding a livenessProbe to the Task Manager deployment, but I think it is still only a workaround.

I am using Flink 1.9.1 (currently the latest version).
I am getting a "connection unexpectedly closed by remote task manager" error on the Task Manager. Because of that, the cluster loses the Task Manager, and the job cannot restart because there are not enough Task Managers in the cluster.

Thanks

On Thu, 28 Nov 2019 at 18:55, Chesnay Schepler <ches...@apache.org> wrote:

    The akka.watch configuration options haven't been used for a while
    irrespective of FLINK-13883 (but I can't quite tell atm since when).

    Let's start with what version of Flink you are using, and what the
    taskmanager/jobmanager logs say.

    On 25/11/2019 12:05, Eray Arslan wrote:
    > Hi,
    >
    > I am having some trouble with my HA K8s cluster.
    > Currently my Flink application processes an infinite stream (with a
    > parallelism of 12).
    > After a few days I lose my task managers, and they never reconnect
    > to the job manager.
    > Because of this, the application cannot be restored by the restart
    > policy.
    >
    > I did some searching and found the “akka.watch” configuration options,
    > but they didn’t work.
    > I think this issue will solve the problem. Am I right?
    > (https://issues.apache.org/jira/browse/FLINK-13883). Is there any
    > workaround I can apply to solve this problem?
    >
    > Thanks
    >
    > Eray
    >
    >



--

*Eray Arslan*
Software Specialist
eray.ars...@hepsiburada.com

_+90 537 738 14 34_
Trump Towers Mecidiyeköy Yolu No: 12 Kule 2, Mecidiyeköy - Şişli / İstanbul - Türkiye

