It looks like the ApplicationMaster was killed by SIGTERM:

16/10/20 18:12:04 ERROR yarn.ApplicationMaster: RECEIVED SIGNAL TERM
16/10/20 18:12:04 INFO yarn.ApplicationMaster: Final app status:

This container may have been killed by the YARN NodeManager or by some other process; check the YARN logs to dig out more details.

Thanks
Saisai

On Thu, Oct 20, 2016 at 6:51 PM, Li Li <fancye...@gmail.com> wrote:

> I am setting up a small yarn/spark cluster. hadoop/yarn version is 2.7.3 and I can run wordcount map-reduce correctly in yarn.
> And I am using spark-2.0.1-bin-hadoop2.7 using command:
> ~/spark-2.0.1-bin-hadoop2.7$ ./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-client examples/jars/spark-examples_2.11-2.0.1.jar 10000
> it fails and the first error is:
> 16/10/20 18:12:03 INFO storage.BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 10.161.219.189, 39161)
> 16/10/20 18:12:03 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@76ad6715{/metrics/json,null,AVAILABLE}
> 16/10/20 18:12:12 INFO cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as NettyRpcEndpointRef(null)
> 16/10/20 18:12:12 INFO cluster.YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> ai-hz1-spark1, PROXY_URI_BASES -> http://ai-hz1-spark1:8088/proxy/application_1476957324184_0002), /proxy/application_1476957324184_0002
> 16/10/20 18:12:12 INFO ui.JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
> 16/10/20 18:12:12 INFO cluster.YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after waiting maxRegisteredResourcesWaitingTime: 30000(ms)
> 16/10/20 18:12:12 WARN spark.SparkContext: Use an existing SparkContext, some configuration may not take effect.
> 16/10/20 18:12:12 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@489091bd{/SQL,null,AVAILABLE}
> 16/10/20 18:12:12 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1de9b505{/SQL/json,null,AVAILABLE}
> 16/10/20 18:12:12 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@378f002a{/SQL/execution,null,AVAILABLE}
> 16/10/20 18:12:12 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@2cc75074{/SQL/execution/json,null,AVAILABLE}
> 16/10/20 18:12:12 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@2d64160c{/static/sql,null,AVAILABLE}
> 16/10/20 18:12:12 INFO internal.SharedState: Warehouse path is '/home/hadoop/spark-2.0.1-bin-hadoop2.7/spark-warehouse'.
> 16/10/20 18:12:13 INFO spark.SparkContext: Starting job: reduce at SparkPi.scala:38
> 16/10/20 18:12:13 INFO scheduler.DAGScheduler: Got job 0 (reduce at SparkPi.scala:38) with 10000 output partitions
> 16/10/20 18:12:13 INFO scheduler.DAGScheduler: Final stage: ResultStage 0 (reduce at SparkPi.scala:38)
> 16/10/20 18:12:13 INFO scheduler.DAGScheduler: Parents of final stage: List()
> 16/10/20 18:12:13 INFO scheduler.DAGScheduler: Missing parents: List()
> 16/10/20 18:12:13 INFO scheduler.DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:34), which has no missing parents
> 16/10/20 18:12:13 INFO memory.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 1832.0 B, free 366.3 MB)
> 16/10/20 18:12:13 INFO memory.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 1169.0 B, free 366.3 MB)
> 16/10/20 18:12:13 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on 10.161.219.189:39161 (size: 1169.0 B, free: 366.3 MB)
> 16/10/20 18:12:13 INFO spark.SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1012
> 16/10/20 18:12:13 INFO scheduler.DAGScheduler: Submitting 10000 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:34)
> 16/10/20 18:12:13 INFO cluster.YarnScheduler: Adding task set 0.0 with 10000 tasks
> 16/10/20 18:12:14 ERROR cluster.YarnClientSchedulerBackend: Yarn application has already exited with state FINISHED!
> 16/10/20 18:12:14 INFO server.ServerConnector: Stopped ServerConnector@389adf1d{HTTP/1.1}{0.0.0.0:4040}
> 16/10/20 18:12:14 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@841e575{/stages/stage/kill,null,UNAVAILABLE}
> 16/10/20 18:12:14 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@66629f63{/api,null,UNAVAILABLE}
> 16/10/20 18:12:14 INFO handler.ContextHandler: Stopped o.s.j.s.ServletContextHandler@2b62442c{/,null,UNAVAILABLE}
>
>
> I also use yarn log to get logs from yarn (total log is very lengthy in attachment):
> 16/10/20 18:12:03 INFO yarn.ExecutorRunnable:
> ===============================================================================
> YARN executor launch context:
> env:
> CLASSPATH -> {{PWD}}<CPS>{{PWD}}/__spark_conf__<CPS>{{PWD}}/__spark_libs__/*<CPS>$HADOOP_CONF_DIR<CPS>$HADOOP_COMMON_HOME/share/hadoop/common/*<CPS>$HADOOP_COMMON_HOME/share/hadoop/common/lib/*<CPS>$HADOOP_HDFS_HOME/share/hadoop/hdfs/*<CPS>$HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*<CPS>$HADOOP_YARN_HOME/share/hadoop/yarn/*<CPS>$HADOOP_YARN_HOME/share/hadoop/yarn/lib/*<CPS>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*<CPS>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*
> SPARK_LOG_URL_STDERR -> http://ai-hz1-spark3:8042/node/containerlogs/container_1476957324184_0002_01_000003/hadoop/stderr?start=-4096
> SPARK_YARN_STAGING_DIR -> hdfs://ai-hz1-spark1/user/hadoop/.sparkStaging/application_1476957324184_0002
> SPARK_USER -> hadoop
> SPARK_YARN_MODE -> true
> SPARK_LOG_URL_STDOUT -> http://ai-hz1-spark3:8042/node/containerlogs/container_1476957324184_0002_01_000003/hadoop/stdout?start=-4096
>
> command:
> {{JAVA_HOME}}/bin/java -server -Xmx1024m -Djava.io.tmpdir={{PWD}}/tmp '-Dspark.driver.port=60657' -Dspark.yarn.app.container.log.dir=<LOG_DIR> -XX:OnOutOfMemoryError='kill %p' org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url spark://CoarseGrainedScheduler@10.161.219.189:60657 --executor-id 2 --hostname ai-hz1-spark3 --cores 1 --app-id application_1476957324184_0002 --user-class-path file:$PWD/__app__.jar 1> <LOG_DIR>/stdout 2> <LOG_DIR>/stderr
> ===============================================================================
>
> 16/10/20 18:12:03 INFO impl.ContainerManagementProtocolProxy: Opening proxy : ai-hz1-spark5:55857
> 16/10/20 18:12:03 INFO impl.ContainerManagementProtocolProxy: Opening proxy : ai-hz1-spark3:51061
> 16/10/20 18:12:04 ERROR yarn.ApplicationMaster: RECEIVED SIGNAL TERM
> 16/10/20 18:12:04 INFO yarn.ApplicationMaster: Final app status: UNDEFINED, exitCode: 16, (reason: Shutdown hook called before final status was reported.)
> 16/10/20 18:12:04 INFO util.ShutdownHookManager: Shutdown hook called
>
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
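
As a follow-up to the suggestion to check the YARN logs, here is a minimal Python sketch. It assumes the aggregated application log (for example the output of `yarn logs -applicationId application_1476957324184_0002`) or a NodeManager log has been saved as text, and it only flags lines that hint at a kill. The function name `find_kill_hints` and the pattern list are hypothetical. Only the `RECEIVED SIGNAL TERM` line actually appears in the log quoted in this thread; the two memory-limit patterns are assumptions about what a NodeManager might report, not a confirmed cause.

```python
import re

# Patterns worth searching for. Only the first one appears in the log quoted
# in this thread; the other two are ASSUMPTIONS about messages a NodeManager
# may write when it kills a container, not something the thread confirms.
SUSPECT_PATTERNS = [
    r"RECEIVED SIGNAL \w+",
    r"Killing container",
    r"running beyond (?:physical|virtual) memory limits",
]


def find_kill_hints(log_text):
    """Return (line_number, line) pairs that match any suspect pattern."""
    combined = re.compile("|".join(SUSPECT_PATTERNS))
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if combined.search(line):
            hits.append((number, line.strip()))
    return hits


# Sample input taken from the ApplicationMaster log quoted in this thread.
SAMPLE = """\
16/10/20 18:12:03 INFO impl.ContainerManagementProtocolProxy: Opening proxy : ai-hz1-spark3:51061
16/10/20 18:12:04 ERROR yarn.ApplicationMaster: RECEIVED SIGNAL TERM
16/10/20 18:12:04 INFO util.ShutdownHookManager: Shutdown hook called
"""

if __name__ == "__main__":
    for number, line in find_kill_hints(SAMPLE):
        print(f"{number}: {line}")
```

Run against the full log, a hit on one of the memory-limit patterns would point towards the NodeManager. If nothing matches beyond the signal line itself, it may be worth looking for another process on the node that sent the signal.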