Try setting the memory size limits explicitly. For example:

./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode cluster --driver-memory 4g --executor-memory 2g --executor-cores 1 ./examples/jars/spark-examples_2.11-2.0.0.2.5.2.0-47.jar
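
The same limits can also be set persistently in conf/spark-defaults.conf; a
minimal sketch using the standard property names behind those flags:

# conf/spark-defaults.conf -- equivalents of the command-line flags above
spark.driver.memory    4g
spark.executor.memory  2g
spark.executor.cores   1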

By default, YARN kills containers that exceed not only their physical memory limit but also their virtual memory limit.

You could also try setting

yarn.nodemanager.vmem-check-enabled

to false in yarn-site.xml, which disables the virtual memory check entirely.
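
A minimal yarn-site.xml snippet, assuming the stock configuration layout
(restart the NodeManagers for the change to take effect):

<property>
  <!-- disable the NodeManager's virtual memory check so containers are no
       longer killed for exceeding the vmem limit -->
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
</property>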

Regards
Marton


On 10/20/16 4:02 PM, Saisai Shao wrote:
It looks like the ApplicationMaster was killed by SIGTERM.

16/10/20 18:12:04 ERROR yarn.ApplicationMaster: RECEIVED SIGNAL TERM
16/10/20 18:12:04 INFO yarn.ApplicationMaster: Final app status:

This container may have been killed by the YARN NodeManager or by another
process; you'd better check the YARN logs to dig out more details.
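
For example, assuming log aggregation is enabled on the cluster, the full
container logs for this run can be fetched with:

yarn logs -applicationId application_1476957324184_0002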

Thanks
Saisai

On Thu, Oct 20, 2016 at 6:51 PM, Li Li <fancye...@gmail.com> wrote:

    I am setting up a small YARN/Spark cluster. The Hadoop/YARN version is
    2.7.3, and I can run the wordcount map-reduce example correctly on YARN.
    I am using spark-2.0.1-bin-hadoop2.7 with the command:
    ~/spark-2.0.1-bin-hadoop2.7$ ./bin/spark-submit --class
    org.apache.spark.examples.SparkPi --master yarn-client
    examples/jars/spark-examples_2.11-2.0.1.jar 10000
    It fails, and the first error is:
    16/10/20 18:12:03 INFO storage.BlockManagerMaster: Registered
    BlockManager BlockManagerId(driver, 10.161.219.189, 39161)
    16/10/20 18:12:03 INFO handler.ContextHandler: Started
    o.s.j.s.ServletContextHandler@76ad6715{/metrics/json,null,AVAILABLE}
    16/10/20 18:12:12 INFO
    cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster
    registered as NettyRpcEndpointRef(null)
    16/10/20 18:12:12 INFO cluster.YarnClientSchedulerBackend: Add WebUI
    Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter,
    Map(PROXY_HOSTS -> ai-hz1-spark1, PROXY_URI_BASES ->
    http://ai-hz1-spark1:8088/proxy/application_1476957324184_0002),
    /proxy/application_1476957324184_0002
    16/10/20 18:12:12 INFO ui.JettyUtils: Adding filter:
    org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
    16/10/20 18:12:12 INFO cluster.YarnClientSchedulerBackend:
    SchedulerBackend is ready for scheduling beginning after waiting
    maxRegisteredResourcesWaitingTime: 30000(ms)
    16/10/20 18:12:12 WARN spark.SparkContext: Use an existing
    SparkContext, some configuration may not take effect.
    16/10/20 18:12:12 INFO handler.ContextHandler: Started
    o.s.j.s.ServletContextHandler@489091bd{/SQL,null,AVAILABLE}
    16/10/20 18:12:12 INFO handler.ContextHandler: Started
    o.s.j.s.ServletContextHandler@1de9b505{/SQL/json,null,AVAILABLE}
    16/10/20 18:12:12 INFO handler.ContextHandler: Started
    o.s.j.s.ServletContextHandler@378f002a{/SQL/execution,null,AVAILABLE}
    16/10/20 18:12:12 INFO handler.ContextHandler: Started
    o.s.j.s.ServletContextHandler@2cc75074{/SQL/execution/json,null,AVAILABLE}
    16/10/20 18:12:12 INFO handler.ContextHandler: Started
    o.s.j.s.ServletContextHandler@2d64160c{/static/sql,null,AVAILABLE}
    16/10/20 18:12:12 INFO internal.SharedState: Warehouse path is
    '/home/hadoop/spark-2.0.1-bin-hadoop2.7/spark-warehouse'.
    16/10/20 18:12:13 INFO spark.SparkContext: Starting job: reduce at
    SparkPi.scala:38
    16/10/20 18:12:13 INFO scheduler.DAGScheduler: Got job 0 (reduce at
    SparkPi.scala:38) with 10000 output partitions
    16/10/20 18:12:13 INFO scheduler.DAGScheduler: Final stage:
    ResultStage 0 (reduce at SparkPi.scala:38)
    16/10/20 18:12:13 INFO scheduler.DAGScheduler: Parents of final
    stage: List()
    16/10/20 18:12:13 INFO scheduler.DAGScheduler: Missing parents: List()
    16/10/20 18:12:13 INFO scheduler.DAGScheduler: Submitting ResultStage
    0 (MapPartitionsRDD[1] at map at SparkPi.scala:34), which has no
    missing parents
    16/10/20 18:12:13 INFO memory.MemoryStore: Block broadcast_0 stored as
    values in memory (estimated size 1832.0 B, free 366.3 MB)
    16/10/20 18:12:13 INFO memory.MemoryStore: Block broadcast_0_piece0
    stored as bytes in memory (estimated size 1169.0 B, free 366.3 MB)
    16/10/20 18:12:13 INFO storage.BlockManagerInfo: Added
    broadcast_0_piece0 in memory on 10.161.219.189:39161 (size: 1169.0 B,
    free: 366.3 MB)
    16/10/20 18:12:13 INFO spark.SparkContext: Created broadcast 0 from
    broadcast at DAGScheduler.scala:1012
    16/10/20 18:12:13 INFO scheduler.DAGScheduler: Submitting 10000
    missing tasks from ResultStage 0 (MapPartitionsRDD[1] at map at
    SparkPi.scala:34)
    16/10/20 18:12:13 INFO cluster.YarnScheduler: Adding task set 0.0 with
    10000 tasks
    16/10/20 18:12:14 ERROR cluster.YarnClientSchedulerBackend: Yarn
    application has already exited with state FINISHED!
    16/10/20 18:12:14 INFO server.ServerConnector: Stopped
    ServerConnector@389adf1d{HTTP/1.1}{0.0.0.0:4040}
    16/10/20 18:12:14 INFO handler.ContextHandler: Stopped
    o.s.j.s.ServletContextHandler@841e575{/stages/stage/kill,null,UNAVAILABLE}
    16/10/20 18:12:14 INFO handler.ContextHandler: Stopped
    o.s.j.s.ServletContextHandler@66629f63{/api,null,UNAVAILABLE}
    16/10/20 18:12:14 INFO handler.ContextHandler: Stopped
    o.s.j.s.ServletContextHandler@2b62442c{/,null,UNAVAILABLE}


    I also used "yarn logs" to get the logs from YARN (the full log is very
    lengthy, see the attachment):
    16/10/20 18:12:03 INFO yarn.ExecutorRunnable:
    
===============================================================================
    YARN executor launch context:
      env:
        CLASSPATH ->
    
{{PWD}}<CPS>{{PWD}}/__spark_conf__<CPS>{{PWD}}/__spark_libs__/*<CPS>$HADOOP_CONF_DIR<CPS>$HADOOP_COMMON_HOME/share/hadoop/common/*<CPS>$HADOOP_COMMON_HOME/share/hadoop/common/lib/*<CPS>$HADOOP_HDFS_HOME/share/hadoop/hdfs/*<CPS>$HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*<CPS>$HADOOP_YARN_HOME/share/hadoop/yarn/*<CPS>$HADOOP_YARN_HOME/share/hadoop/yarn/lib/*<CPS>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*<CPS>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*
        SPARK_LOG_URL_STDERR ->
http://ai-hz1-spark3:8042/node/containerlogs/container_1476957324184_0002_01_000003/hadoop/stderr?start=-4096
        SPARK_YARN_STAGING_DIR ->
    
hdfs://ai-hz1-spark1/user/hadoop/.sparkStaging/application_1476957324184_0002
        SPARK_USER -> hadoop
        SPARK_YARN_MODE -> true
        SPARK_LOG_URL_STDOUT ->
http://ai-hz1-spark3:8042/node/containerlogs/container_1476957324184_0002_01_000003/hadoop/stdout?start=-4096

      command:
        {{JAVA_HOME}}/bin/java -server -Xmx1024m
    -Djava.io.tmpdir={{PWD}}/tmp '-Dspark.driver.port=60657'
    -Dspark.yarn.app.container.log.dir=<LOG_DIR>
    -XX:OnOutOfMemoryError='kill %p'
    org.apache.spark.executor.CoarseGrainedExecutorBackend --driver-url
    spark://CoarseGrainedScheduler@10.161.219.189:60657 --executor-id 2
    --hostname ai-hz1-spark3 --cores 1 --app-id
    application_1476957324184_0002 --user-class-path file:$PWD/__app__.jar
    1> <LOG_DIR>/stdout 2> <LOG_DIR>/stderr
    
===============================================================================

    16/10/20 18:12:03 INFO impl.ContainerManagementProtocolProxy: Opening
    proxy : ai-hz1-spark5:55857
    16/10/20 18:12:03 INFO impl.ContainerManagementProtocolProxy: Opening
    proxy : ai-hz1-spark3:51061
    16/10/20 18:12:04 ERROR yarn.ApplicationMaster: RECEIVED SIGNAL TERM
    16/10/20 18:12:04 INFO yarn.ApplicationMaster: Final app status:
    UNDEFINED, exitCode: 16, (reason: Shutdown hook called before final
    status was reported.)
    16/10/20 18:12:04 INFO util.ShutdownHookManager: Shutdown hook called


    ---------------------------------------------------------------------
    To unsubscribe e-mail: user-unsubscr...@spark.apache.org


