Shannon Carey created FLINK-4418:
------------------------------------

             Summary: ClusterClient/ConnectionUtils#findConnectingAddress fails immediately if InetAddress.getLocalHost throws exception
                 Key: FLINK-4418
                 URL: https://issues.apache.org/jira/browse/FLINK-4418
             Project: Flink
          Issue Type: Bug
          Components: Client
    Affects Versions: 1.1.0
            Reporter: Shannon Carey

When connecting to a cluster with a ClusterClient, the connection attempt fails with an exception if the machine's hostname cannot be resolved to an IP address. This happens, for example, when the hostname is not mapped to a local IP in /etc/hosts. The exception is below.

I suggest that findAddressUsingStrategy() catch the java.net.UnknownHostException thrown by InetAddress.getLocalHost() and return null, so that findConnectingAddress() can go on to try its alternative strategies. I will open a PR to this effect. Ideally the fix could be included in both 1.2 and 1.1.2.

{code}
21:11:35 org.apache.flink.client.program.ProgramInvocationException: Failed to retrieve the JobManager gateway.
21:11:35     at org.apache.flink.client.program.ClusterClient.runDetached(ClusterClient.java:430)
21:11:35     at org.apache.flink.client.program.StandaloneClusterClient.submitJob(StandaloneClusterClient.java:90)
21:11:35     at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:389)
21:11:35     at org.apache.flink.client.program.DetachedEnvironment.finalizeExecute(DetachedEnvironment.java:75)
21:11:35     at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:334)
21:11:35     at com.expedia.www.flink.job.scheduler.FlinkJobSubmitter.get(FlinkJobSubmitter.java:81)
21:11:35     at com.expedia.www.flink.job.scheduler.streaming.StreamingJobManager.run(StreamingJobManager.java:105)
21:11:35     at com.expedia.www.flink.job.scheduler.JobScheduler.runStreamingApp(JobScheduler.java:69)
21:11:35     at com.expedia.www.flink.job.scheduler.JobScheduler.main(JobScheduler.java:34)
21:11:35 Caused by: java.lang.RuntimeException: Failed to resolve JobManager address at /10.2.89.80:43126
21:11:35     at org.apache.flink.client.program.ClusterClient$LazyActorSystemLoader.get(ClusterClient.java:189)
21:11:35     at org.apache.flink.client.program.ClusterClient.getJobManagerGateway(ClusterClient.java:649)
21:11:35     at org.apache.flink.client.program.ClusterClient.runDetached(ClusterClient.java:428)
21:11:35     ... 8 more
21:11:35 Caused by: java.net.UnknownHostException: ip-10-2-64-47: ip-10-2-64-47: unknown error
21:11:35     at java.net.InetAddress.getLocalHost(InetAddress.java:1505)
21:11:35     at org.apache.flink.runtime.net.ConnectionUtils.findAddressUsingStrategy(ConnectionUtils.java:232)
21:11:35     at org.apache.flink.runtime.net.ConnectionUtils.findConnectingAddress(ConnectionUtils.java:123)
21:11:35     at org.apache.flink.client.program.ClusterClient$LazyActorSystemLoader.get(ClusterClient.java:187)
21:11:35     ... 10 more
21:11:35 Caused by: java.net.UnknownHostException: ip-10-2-64-47: unknown error
21:11:35     at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
21:11:35     at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928)
21:11:35     at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323)
21:11:35     at java.net.InetAddress.getLocalHost(InetAddress.java:1500)
21:11:35     ... 13 more
{code}
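
The suggestion in the description (catch UnknownHostException, return null, let the caller move on to the next strategy) could look roughly like the sketch below. This is not the Flink code and not the eventual patch: the class LocalHostFallbackSketch, the AddressStrategy interface, and the methods tryStrategy and findFirst are hypothetical names used only to illustrate the control flow. The only real API calls are in java.net.InetAddress.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Flink.
class LocalHostFallbackSketch {

    /** Stand-in for one address detection strategy that may hit an unresolvable hostname. */
    interface AddressStrategy {
        InetAddress find() throws UnknownHostException;
    }

    /** Runs one strategy; an unresolvable hostname yields null instead of an exception. */
    static InetAddress tryStrategy(AddressStrategy strategy) {
        try {
            return strategy.find();
        } catch (UnknownHostException e) {
            // Proposed behavior: report "no address found" so the caller can continue.
            return null;
        }
    }

    /** Returns the first non-null result, or null if every strategy fails. */
    static InetAddress findFirst(List<AddressStrategy> strategies) {
        for (AddressStrategy strategy : strategies) {
            InetAddress address = tryStrategy(strategy);
            if (address != null) {
                return address;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Simulates a machine whose hostname is missing from /etc/hosts.
        AddressStrategy unresolvable = () -> {
            throw new UnknownHostException("ip-10-2-64-47");
        };
        // A fallback that does not depend on hostname resolution.
        AddressStrategy loopback = InetAddress::getLoopbackAddress;

        System.out.println(findFirst(Arrays.asList(unresolvable, loopback)));
    }
}
```

With this shape, a failure in the hostname-based strategy no longer aborts the whole lookup; it only removes one candidate, which is the behavior the description asks for.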