Looks like a hostname conflict to me.
15/06/22 17:04:45 WARN Utils: Your hostname, datasci01.dev.abc.com resolves to a loopback address: 127.0.0.1; using 10.0.3.197 instead (on interface eth0)
15/06/22 17:04:45 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address

Can you paste your /etc/hosts here?

Thanks
Best Regards

On Tue, Jun 23, 2015 at 2:40 AM, roy <rp...@njit.edu> wrote:

> Hi,
>
> Our spark job on yarn suddenly started failing silently without showing any error following is the trace.
>
> Using properties file: /usr/lib/spark/conf/spark-defaults.conf
> Adding default property: spark.serializer=org.apache.spark.serializer.KryoSerializer
> Adding default property: spark.executor.extraJavaOptions=-Dlog4j.configuration=file:///etc/spark/log4j.properties
> Adding default property: spark.eventLog.enabled=true
> Adding default property: spark.shuffle.service.enabled=true
> Adding default property: spark.driver.extraLibraryPath=/usr/lib/hadoop/lib/native
> Adding default property: spark.yarn.historyServer.address=http://ds-hnn002.dev.abc.com:18088
> Adding default property: spark.yarn.am.extraLibraryPath=/usr/lib/hadoop/lib/native
> Adding default property: spark.ui.showConsoleProgress=true
> Adding default property: spark.shuffle.service.port=7337
> Adding default property: spark.master=yarn-client
> Adding default property: spark.executor.extraLibraryPath=/usr/lib/hadoop/lib/native
> Adding default property: spark.eventLog.dir=hdfs://magnetic-hadoop-dev/user/spark/applicationHistory
> Adding default property: spark.yarn.jar=local:/usr/lib/spark/assembly/lib/spark-assembly-1.3.0-cdh5.4.0-hadoop2.6.0-cdh5.4.0.jar
> Parsed arguments:
>   master                  yarn
>   deployMode              null
>   executorMemory          3G
>   executorCores           null
>   totalExecutorCores      null
>   propertiesFile          /usr/lib/spark/conf/spark-defaults.conf
>   driverMemory            4G
>   driverCores             null
>   driverExtraClassPath    null
>   driverExtraLibraryPath  /usr/lib/hadoop/lib/native
>   driverExtraJavaOptions  null
>   supervise               false
>   queue                   null
>   numExecutors            30
>   files                   null
>   pyFiles                 null
>   archives                null
>   mainClass               null
>   primaryResource         file:/home/jonathanarfa/code/updb/spark/updb2vw_testing.py
>   name                    updb2vw_testing.py
>   childArgs               [--date 2015-05-20]
>   jars                    null
>   packages                null
>   repositories            null
>   verbose                 true
>
> Spark properties used, including those specified through --conf and those from the properties file /usr/lib/spark/conf/spark-defaults.conf:
>   spark.executor.extraLibraryPath -> /usr/lib/hadoop/lib/native
>   spark.yarn.jar -> local:/usr/lib/spark/assembly/lib/spark-assembly-1.3.0-cdh5.4.0-hadoop2.6.0-cdh5.4.0.jar
>   spark.driver.extraLibraryPath -> /usr/lib/hadoop/lib/native
>   spark.yarn.historyServer.address -> http://ds-hnn002.dev.abc.com:18088
>   spark.yarn.am.extraLibraryPath -> /usr/lib/hadoop/lib/native
>   spark.eventLog.enabled -> true
>   spark.ui.showConsoleProgress -> true
>   spark.serializer -> org.apache.spark.serializer.KryoSerializer
>   spark.executor.extraJavaOptions -> -Dlog4j.configuration=file:///etc/spark/log4j.properties
>   spark.shuffle.service.enabled -> true
>   spark.shuffle.service.port -> 7337
>   spark.eventLog.dir -> hdfs://magnetic-hadoop-dev/user/spark/applicationHistory
>   spark.master -> yarn-client
>
>
> Main class:
> org.apache.spark.deploy.PythonRunner
> Arguments:
> file:/home/jonathanarfa/code/updb/spark/updb2vw_testing.py
> null
> --date
> 2015-05-20
> System properties:
> spark.executor.extraLibraryPath -> /usr/lib/hadoop/lib/native
> spark.driver.memory -> 4G
> spark.executor.memory -> 3G
> spark.yarn.jar -> local:/usr/lib/spark/assembly/lib/spark-assembly-1.3.0-cdh5.4.0-hadoop2.6.0-cdh5.4.0.jar
> spark.driver.extraLibraryPath -> /usr/lib/hadoop/lib/native
> spark.executor.instances -> 30
> spark.yarn.historyServer.address -> http://ds-hnn002.dev.abc.com:18088
> spark.yarn.am.extraLibraryPath -> /usr/lib/hadoop/lib/native
> spark.ui.showConsoleProgress -> true
> spark.eventLog.enabled -> true
> spark.yarn.dist.files -> file:/home/jonathanarfa/code/updb/spark/updb2vw_testing.py
> SPARK_SUBMIT -> true
> spark.serializer -> org.apache.spark.serializer.KryoSerializer
> spark.executor.extraJavaOptions -> -Dlog4j.configuration=file:///etc/spark/log4j.properties
> spark.shuffle.service.enabled -> true
> spark.app.name -> updb2vw_testing.py
> spark.shuffle.service.port -> 7337
> spark.eventLog.dir -> hdfs://magnetic-hadoop-dev/user/spark/applicationHistory
> spark.master -> yarn-client
> Classpath elements:
>
>
>
> spark.akka.frameSize=60
> spark.app.name=updb2vw_2015-05-20
> spark.driver.extraLibraryPath=/usr/lib/hadoop/lib/native
> spark.driver.maxResultSize=2G
> spark.driver.memory=4G
> spark.eventLog.dir=hdfs://magnetic-hadoop-dev/user/spark/applicationHistory
> spark.eventLog.enabled=true
> spark.executor.extraJavaOptions=-Dlog4j.configuration=file:///etc/spark/log4j.properties
> spark.executor.extraLibraryPath=/usr/lib/hadoop/lib/native
> spark.executor.instances=30
> spark.executor.memory=3G
> spark.master=yarn-client
> spark.serializer=org.apache.spark.serializer.KryoSerializer
> spark.shuffle.manager=hash
> spark.shuffle.service.enabled=true
> spark.shuffle.service.port=7337
> spark.task.maxFailures=6
> spark.ui.showConsoleProgress=true
> spark.yarn.am.extraLibraryPath=/usr/lib/hadoop/lib/native
> spark.yarn.dist.files=file:/home/jonathanarfa/code/updb/spark/updb2vw_testing.py
> spark.yarn.executor.memoryOverhead=2000
> spark.yarn.historyServer.address=http://ds-hnn002.dev.abc.com:18088
> spark.yarn.jar=local:/usr/lib/spark/assembly/lib/spark-assembly-1.3.0-cdh5.4.0-hadoop2.6.0-cdh5.4.0.jar
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in [jar:file:/usr/lib/zookeeper/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/usr/lib/flume-ng/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/usr/lib/parquet/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 15/06/22 17:04:45 WARN Utils: Your hostname, datasci01.dev.abc.com resolves to a loopback address: 127.0.0.1; using 10.0.3.197 instead (on interface eth0)
> 15/06/22 17:04:45 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
> Traceback (most recent call last):
>   File "/home/jonathanarfa/code/updb/spark/updb2vw_testing.py", line 125, in <module>
>     spark_context = pyspark.SparkContext(conf=conf)
>   File "/usr/lib/spark/python/pyspark/context.py", line 111, in __init__
>     conf, jsc, profiler_cls)
>   File "/usr/lib/spark/python/pyspark/context.py", line 159, in _do_init
>     self._jsc = jsc or self._initialize_context(self._conf._jconf)
>   File "/usr/lib/spark/python/pyspark/context.py", line 212, in _initialize_context
>     return self._jvm.JavaSparkContext(jconf)
>   File "/usr/lib/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 701, in __call__
>   File "/usr/lib/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py", line 300, in get_return_value
> py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
> : org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
>         at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:113)
>         at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:59)
>         at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:141)
>         at org.apache.spark.SparkContext.<init>(SparkContext.scala:379)
>         at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:61)
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>         at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
>         at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:234)
>         at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
>         at py4j.Gateway.invoke(Gateway.java:214)
>         at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:79)
>         at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68)
>         at py4j.GatewayConnection.run(GatewayConnection.java:207)
>         at java.lang.Thread.run(Thread.java:745)
>
> 15/06/22 17:08:27 ERROR Utils: Uncaught exception in thread delete Spark local dirs
> java.lang.NullPointerException
>         at org.apache.spark.storage.DiskBlockManager.org$apache$spark$storage$DiskBlockManager$$doStop(DiskBlockManager.scala:161)
>         at org.apache.spark.storage.DiskBlockManager$$anon$1$$anonfun$run$1.apply$mcV$sp(DiskBlockManager.scala:141)
>         at org.apache.spark.storage.DiskBlockManager$$anon$1$$anonfun$run$1.apply(DiskBlockManager.scala:139)
>         at org.apache.spark.storage.DiskBlockManager$$anon$1$$anonfun$run$1.apply(DiskBlockManager.scala:139)
>         at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1617)
>         at org.apache.spark.storage.DiskBlockManager$$anon$1.run(DiskBlockManager.scala:139)
> Exception in thread "delete Spark local dirs" java.lang.NullPointerException
>         at org.apache.spark.storage.DiskBlockManager.org$apache$spark$storage$DiskBlockManager$$doStop(DiskBlockManager.scala:161)
>         at org.apache.spark.storage.DiskBlockManager$$anon$1$$anonfun$run$1.apply$mcV$sp(DiskBlockManager.scala:141)
>         at org.apache.spark.storage.DiskBlockManager$$anon$1$$anonfun$run$1.apply(DiskBlockManager.scala:139)
>         at org.apache.spark.storage.DiskBlockManager$$anon$1$$anonfun$run$1.apply(DiskBlockManager.scala:139)
>         at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1617)
>         at org.apache.spark.storage.DiskBlockManager$$anon$1.run(DiskBlockManager.scala:139)
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in [jar:file:/usr/lib/zookeeper/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/usr/lib/flume-ng/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/usr/lib/parquet/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
>
> thanks
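
P.S. To confirm what the WARN Utils line reports, a small check along these lines might help. It is only a sketch: the function names are made up for illustration and are not part of Spark, and it assumes Python 3 (for the ipaddress module) run on the driver host. It resolves the machine's hostname and lists any loopback addresses it maps to.

```python
import ipaddress
import socket


def is_loopback_address(address):
    """Return True if the textual IP address is a loopback address (127.0.0.0/8 or ::1)."""
    # Drop any IPv6 zone suffix such as "%eth0" before parsing.
    return ipaddress.ip_address(address.split("%")[0]).is_loopback


def loopback_addresses(hostname):
    """Return the sorted loopback addresses that the hostname resolves to."""
    infos = socket.getaddrinfo(hostname, None)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    addresses = {info[4][0] for info in infos}
    return sorted(a for a in addresses if is_loopback_address(a))


def main():
    hostname = socket.gethostname()
    try:
        found = loopback_addresses(hostname)
    except socket.gaierror as exc:
        print("could not resolve %s: %s" % (hostname, exc))
        return
    if found:
        print("%s resolves to loopback address(es): %s" % (hostname, ", ".join(found)))
    else:
        print("%s does not resolve to a loopback address" % hostname)


if __name__ == "__main__":
    main()
```

If it reports a loopback address, a line in /etc/hosts that maps the hostname to 127.0.0.1 is a common cause, which is why seeing that file would help.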