Elek, Marton created HDDS-999:
---------------------------------

             Summary: Make the DNS resolution in OzoneManager more resilient
                 Key: HDDS-999
                 URL: https://issues.apache.org/jira/browse/HDDS-999
             Project: Hadoop Distributed Data Store
          Issue Type: Bug
          Components: Ozone Manager
            Reporter: Elek, Marton

If the OzoneManager (OM) is started before SCM, the SCM DNS name may not be resolvable yet. In this case the OM should retry and re-resolve the address, but as of now it throws an exception:

{code:java}
2019-01-23 17:14:25 ERROR OzoneManager:593 - Failed to start the OzoneManager.
java.net.SocketException: Call From om-0.om to null:0 failed on socket exception: java.net.SocketException: Unresolved address; For more details see: http://wiki.apache.org/hadoop/SocketException
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:831)
	at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:798)
	at org.apache.hadoop.ipc.Server.bind(Server.java:566)
	at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:1042)
	at org.apache.hadoop.ipc.Server.<init>(Server.java:2815)
	at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:994)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server.<init>(ProtobufRpcEngine.java:421)
	at org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:342)
	at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:804)
	at org.apache.hadoop.ozone.om.OzoneManager.startRpcServer(OzoneManager.java:563)
	at org.apache.hadoop.ozone.om.OzoneManager.getRpcServer(OzoneManager.java:927)
	at org.apache.hadoop.ozone.om.OzoneManager.<init>(OzoneManager.java:265)
	at org.apache.hadoop.ozone.om.OzoneManager.createOm(OzoneManager.java:674)
	at org.apache.hadoop.ozone.om.OzoneManager.main(OzoneManager.java:587)
Caused by: java.net.SocketException: Unresolved address
	at sun.nio.ch.Net.translateToSocketException(Net.java:131)
	at sun.nio.ch.Net.translateException(Net.java:157)
	at sun.nio.ch.Net.translateException(Net.java:163)
	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:76)
	at org.apache.hadoop.ipc.Server.bind(Server.java:549)
	... 11 more
Caused by: java.nio.channels.UnresolvedAddressException
	at sun.nio.ch.Net.checkAddress(Net.java:101)
	at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:218)
	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
	... 12 more
{code}

This should be fixed. (See also HDDS-421, which fixed the same problem on the datanode side, and HDDS-907, which is the workaround until this issue is resolved.)
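
For illustration only, the retry-and-re-resolve behaviour requested in this issue could look roughly like the sketch below. It is not the HDDS-999 patch and does not use any Ozone or Hadoop API. The class name `DnsRetrySketch`, the method `resolveWithRetry`, and the port, attempt count and sleep interval in `main` are all hypothetical. It relies only on the standard behaviour that `new InetSocketAddress(host, port)` attempts the lookup at construction time and reports failure through `isUnresolved()`, so a fresh instance has to be created for every retry.

```java
import java.net.InetSocketAddress;
import java.util.function.Supplier;

// Hypothetical sketch, not taken from the Ozone code base.
final class DnsRetrySketch {

  /**
   * Calls the lookup until it returns a resolved address or the attempts
   * run out. Returns the last address obtained, which may still be
   * unresolved; the caller decides whether that is fatal.
   */
  static InetSocketAddress resolveWithRetry(
      Supplier<InetSocketAddress> lookup, int maxAttempts, long sleepMillis) {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be >= 1");
    }
    InetSocketAddress addr = lookup.get();
    for (int attempt = 1; addr.isUnresolved() && attempt < maxAttempts; attempt++) {
      try {
        Thread.sleep(sleepMillis);
      } catch (InterruptedException e) {
        // Preserve the interrupt flag and stop retrying.
        Thread.currentThread().interrupt();
        break;
      }
      // A new InetSocketAddress is needed each time: an existing
      // unresolved instance never resolves itself later.
      addr = lookup.get();
    }
    return addr;
  }

  public static void main(String[] args) {
    // Host and port are placeholders chosen for this example.
    String host = args.length > 0 ? args[0] : "localhost";
    int port = 9862;
    InetSocketAddress addr =
        resolveWithRetry(() -> new InetSocketAddress(host, port), 5, 1000L);
    System.out.println(addr + " unresolved=" + addr.isUnresolved());
  }
}
```

One caveat when choosing the sleep interval: the JVM caches failed lookups for a short time (the `networkaddress.cache.negative.ttl` security property, 10 seconds by default), so retries issued faster than that may simply return the cached failure.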