Daniel Darabos created SPARK-13620:
--------------------------------------

Summary: Avoid reverse DNS lookup for 0.0.0.0 on startup
Key: SPARK-13620
URL: https://issues.apache.org/jira/browse/SPARK-13620
Project: Spark
Issue Type: Improvement
Components: Web UI
Affects Versions: 1.6.0
Reporter: Daniel Darabos
Priority: Minor

I noticed we spend 5+ seconds during application startup with the following stack trace:

{code}
at java.net.Inet6AddressImpl.getHostByAddr(Native Method)
at java.net.InetAddress$1.getHostByAddr(InetAddress.java:926)
at java.net.InetAddress.getHostFromNameService(InetAddress.java:611)
at java.net.InetAddress.getHostName(InetAddress.java:553)
at java.net.InetAddress.getHostName(InetAddress.java:525)
at java.net.InetSocketAddress$InetSocketAddressHolder.getHostName(InetSocketAddress.java:82)
at java.net.InetSocketAddress$InetSocketAddressHolder.access$600(InetSocketAddress.java:56)
at java.net.InetSocketAddress.getHostName(InetSocketAddress.java:345)
at org.spark-project.jetty.server.Server.<init>(Server.java:115)
at org.apache.spark.ui.JettyUtils$.org$apache$spark$ui$JettyUtils$$connect$1(JettyUtils.scala:243)
at org.apache.spark.ui.JettyUtils$$anonfun$5.apply(JettyUtils.scala:262)
at org.apache.spark.ui.JettyUtils$$anonfun$5.apply(JettyUtils.scala:262)
at org.apache.spark.util.Utils$$anonfun$startServiceOnPort$1.apply$mcVI$sp(Utils.scala:1964)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
at org.apache.spark.util.Utils$.startServiceOnPort(Utils.scala:1955)
at org.apache.spark.ui.JettyUtils$.startJettyServer(JettyUtils.scala:262)
at org.apache.spark.ui.WebUI.bind(WebUI.scala:136)
at org.apache.spark.SparkContext$$anonfun$13.apply(SparkContext.scala:481)
at org.apache.spark.SparkContext$$anonfun$13.apply(SparkContext.scala:481)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:481)
{code}

Spark wants to start a server that listens on all local interfaces. So it [creates an {{InetSocketAddress}}|https://github.com/apache/spark/blob/v1.6.0/core/src/main/scala/org/apache/spark/ui/JettyUtils.scala#L243] [with hostname {{"0.0.0.0"}}|https://github.com/apache/spark/blob/v1.6.0/core/src/main/scala/org/apache/spark/ui/WebUI.scala#L136].

Spark passes in a hostname string, but Java [recognizes that it's actually an address|https://github.com/openjdk-mirror/jdk/blob/adea42765ae4e7117c3f0e2d618d5e6aed44ced2/src/share/classes/java/net/InetSocketAddress.java#L220] and so sets the hostname to {{null}}. So when Jetty [calls {{getHostName}}|https://github.com/eclipse/jetty.project/blob/jetty-8.1.14.v20131031/jetty-server/src/main/java/org/eclipse/jetty/server/Server.java#L115], Java has to do a reverse DNS lookup for {{0.0.0.0}}. That takes 5+ seconds on my machine. Maybe it's just me? It's a very vanilla Ubuntu 14.04.

There is a simple fix. Instead of passing in {{"0.0.0.0"}}, we should not set a hostname at all. In that case [{{InetAddress.anyLocalAddress()}}|https://github.com/openjdk-mirror/jdk/blob/adea42765ae4e7117c3f0e2d618d5e6aed44ced2/src/share/classes/java/net/InetSocketAddress.java#L166] is used, which is the same wildcard address but does not need resolving.

{code}
scala> { val t0 = System.currentTimeMillis; new java.net.InetSocketAddress("0.0.0.0", 8000).getHostName; System.currentTimeMillis - t0 }
res0: Long = 5432

scala> { val t0 = System.currentTimeMillis; new java.net.InetSocketAddress(8000).getHostName; System.currentTimeMillis - t0 }
res1: Long = 0
{code}

I'll send a pull request for this.
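
For illustration only, the proposed change could be sketched roughly as follows. The object {{BindAddressSketch}} and the helper {{bindAddress}} are hypothetical names made up for this example; they are not taken from Spark or from the eventual pull request. The sketch only assumes the two {{java.net.InetSocketAddress}} constructors already used in the snippets above.

```scala
import java.net.InetSocketAddress

object BindAddressSketch {
  // Hypothetical helper (not Spark code): choose the constructor from the host string.
  // For the wildcard "0.0.0.0" use the port-only constructor, which goes through
  // InetAddress.anyLocalAddress() and, per the issue, does not need resolving.
  def bindAddress(host: String, port: Int): InetSocketAddress =
    if (host == "0.0.0.0") new InetSocketAddress(port)
    else new InetSocketAddress(host, port)

  def main(args: Array[String]): Unit = {
    val addr = bindAddress("0.0.0.0", 8000)
    // Still the wildcard address, so the server binds to the same interfaces as before.
    println(addr.getAddress.isAnyLocalAddress)

    // Time getHostName the same way as the REPL measurement above.
    val t0 = System.currentTimeMillis
    addr.getHostName
    println(s"getHostName took ${System.currentTimeMillis - t0} ms")
  }
}
```

Any other host string is passed through unchanged, so only the wildcard case takes the new path. The timing printed by {{main}} depends on the machine and resolver configuration.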