Re: Temporary failure in name resolution on JobManager

2019-12-02 Thread David Maddison
Thanks Yang. We did try both those properties and it didn't fix it. However, we did EVENTUALLY (after some late nights!) track the issue down, not to DNS resolution but rather to an obscure bug in our connector code :-( Thanks for your response, /David/ On Mon, Dec 2, 2019 at 3:16 AM Yang Wang w

Re: Temporary failure in name resolution on JobManager

2019-12-01 Thread Yang Wang
Hi David, Do you mean that when the JobManager starts, DNS has a problem and the service cannot be resolved, and that after DNS returns to normal the JobManager JVM still cannot resolve the name? I think it may be because of the JVM DNS cache. You could set the TTL and give it a try. sun.net.inetaddr.ttl sun.n
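
For readers trying the TTL suggestion above, here is a minimal sketch of setting the JVM DNS cache lifetimes from code. It is an illustration only, not something posted in the thread: the second property name in the message is cut off in this archive, so the sketch uses the standard security properties `networkaddress.cache.ttl` and `networkaddress.cache.negative.ttl` rather than guessing at it. The class name `DnsCacheTtl`, the method `configure`, and the values 30 and 0 are assumptions chosen for the example.

```java
import java.security.Security;

public class DnsCacheTtl {

    /**
     * Sets how long the JVM caches successful and failed name lookups, in seconds.
     * Hypothetical helper for illustration. The JVM reads these values when its
     * address cache policy is first initialized, so call this before the first
     * name resolution in the process.
     */
    static void configure(int positiveTtlSeconds, int negativeTtlSeconds) {
        // Lifetime of successful lookups; a negative value would mean "cache forever".
        Security.setProperty("networkaddress.cache.ttl",
                Integer.toString(positiveTtlSeconds));
        // Lifetime of failed lookups; 0 means failures are not cached.
        Security.setProperty("networkaddress.cache.negative.ttl",
                Integer.toString(negativeTtlSeconds));
    }

    public static void main(String[] args) {
        // Example values only: cache successes for 30 s, do not cache failures.
        configure(30, 0);
        System.out.println("ttl=" + Security.getProperty("networkaddress.cache.ttl"));
        System.out.println("negative.ttl="
                + Security.getProperty("networkaddress.cache.negative.ttl"));
    }
}
```

The system property named in the message, `sun.net.inetaddr.ttl`, can instead be passed on the JVM command line as `-Dsun.net.inetaddr.ttl=30`. Note that the follow-up in this thread reports that changing these settings did not resolve the original problem.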

Temporary failure in name resolution on JobManager

2019-11-29 Thread David Maddison
I have a Flink 1.7 cluster using the "flink:1.7.2" (OpenJDK build 1.8.0_222-b10) image on Kubernetes. As part of a MasterRestoreHook (for checkpointing) the JobManager needs to communicate with an external security service. This all works well until there's a DNS lookup failure (due to network is