Re: Jobmanager trying to be registered for Zombie Job

2022-04-26 Thread Matthias Pohl
Hi Peter, based on our analysis the issue already existed before 1.15, yes. We couldn't come up with any other reasoning. It was just never reported... or we missed an older ticket. Matthias On Mon, Apr 25, 2022 at 6:21 PM Peter Schrott wrote: > Hi Matthias, > > You are welcome & thanks a lot

Re: Jobmanager trying to be registered for Zombie Job

2022-04-25 Thread Peter Schrott
Hi Matthias, You are welcome & thanks a lot for your help too! One thing is not quite clear to me: has the bug been there since 1.13.6 but just not been reported yet (FLINK-27354 is a new ticket)? Best, Peter On Mon, Apr 25, 2022 at 5:48 PM Matthias Pohl wrote: > Thanks again, Peter, for sharing your logs.

Re: Jobmanager trying to be registered for Zombie Job

2022-04-25 Thread Matthias Pohl
Thanks again, Peter, for sharing your logs. I looked into the issue with the help of Chesnay. Essentially, it's FLINK-27354 [1] that is causing this issue. We couldn't come up with a reason why it should have popped up just now with 1.15. The bug itself is already present in 1.14. You can find more

Re: Jobmanager trying to be registered for Zombie Job

2022-04-25 Thread Matthias Pohl
Thanks Peter, we're looking into it... On Mon, Apr 25, 2022 at 11:54 AM Peter Schrott wrote: > Hi, > > sorry for the late reply. It took me quite some time to get the logs out > of the system. I have attached them now. > > They are logs of 2 jobmanagers and 2 taskmanagers. It can be seen on jm 1

Re: Jobmanager trying to be registered for Zombie Job

2022-04-22 Thread Matthias Pohl
FYI: I created FLINK-27354 [1] to cover the issue of retrying to connect to the RM while shutting down the JobMaster. This doesn't explain your issue though, Peter. It's still unclear why the JobMaster is still around as stated in my previous email. Matthias [1]

Re: Jobmanager trying to be registered for Zombie Job

2022-04-22 Thread Matthias Pohl
Just by looking through the code, it appears that these logs could be produced while stopping the job. At the end, the ResourceManager sends a confirmation back to the JobMaster that it has been disconnected. If the JobMaster is still around to process the request, it would try to reconnect

Re: Jobmanager trying to be registered for Zombie Job

2022-04-22 Thread Matthias Pohl
...if possible it would be good to get debug rather than only info logs. Did you encounter anything odd in the TaskManager logs as well? Sharing those might be of value, too. On Fri, Apr 22, 2022 at 8:57 AM Matthias Pohl wrote: > Hi Peter, > thanks for sharing. That doesn't sound right. May

Re: Jobmanager trying to be registered for Zombie Job

2022-04-22 Thread Matthias Pohl
Hi Peter, thanks for sharing. That doesn't sound right. May you provide the entire jobmanager logs? Best, Matthias On Thu, Apr 21, 2022 at 6:08 PM Peter Schrott wrote: > Hi Flink-Users, > > I am not sure if this does something to my cluster or not. But since > updating to Flink 1.15 (atm rc4)

Jobmanager trying to be registered for Zombie Job

2022-04-21 Thread Peter Schrott
Hi Flink-Users, I am not sure if this does something to my cluster or not. But since updating to Flink 1.15 (atm rc4) I find the following logs: INFO: Registering job manager ab7db9ff0ebd26b3b89c3e2e56684...@akka.tcp:// fl...@flink-jobmanager-xxx.com:40015/user/rpc/jobmanager_2 for job