I launch roughly 30-60 of these jobs (defined as in start-job.sh below) in the background from a wrapper script. The wrapper waits about 30 seconds between launches, then monitors YARN to decide when to launch more. The job limit is set at about 60, but even if I lower it to 30, I run out of memory on the host submitting the jobs. Why does my approach to using spark-submit cause the submitting host to run out of memory? It has about 6 GB free, and I don't see why merely submitting jobs should exhaust that.
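For concreteness, the wrapper's launch loop is roughly like the sketch below (names such as MAX_JOBS, count_running_jobs, and launch_loop are illustrative paraphrases, not the real script):

```shell
#!/usr/bin/env bash
# Illustrative sketch of the wrapper described above, not the actual script.
MAX_JOBS=60       # limit on concurrently running jobs
LAUNCH_DELAY=30   # seconds to wait between launches

# Count RUNNING/ACCEPTED applications from `yarn application -list`
# output supplied on stdin.
count_running_jobs() {
  grep -c -E 'RUNNING|ACCEPTED'
}

launch_loop() {
  for args in "$@"; do
    # Block until YARN reports fewer than MAX_JOBS applications in flight.
    while [ "$(yarn application -list 2>/dev/null | count_running_jobs)" -ge "$MAX_JOBS" ]; do
      sleep "$LAUNCH_DELAY"
    done
    # Each backgrounded start-job.sh forks its own spark-submit JVM.
    ./start-job.sh $args &
    sleep "$LAUNCH_DELAY"
  done
}
```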
start-job.sh:

    export HADOOP_CONF_DIR=/etc/hadoop/conf
    spark-submit \
        --class sap.whcounter.WarehouseCounter \
        --master yarn-cluster \
        --num-executors 1 \
        --driver-memory 1024m \
        --executor-memory 1024m \
        --executor-cores 4 \
        --queue hlab \
        --conf spark.yarn.submit.waitAppCompletion=false \
        --conf spark.app.name=wh_reader_sp \
        --conf spark.streaming.receiver.maxRate=1000 \
        --conf spark.streaming.concurrentJobs=2 \
        --conf spark.eventLog.dir="hdfs:///user/spark/applicationHistory" \
        --conf spark.eventLog.enabled=true \
        --conf spark.eventLog.overwrite=true \
        --conf spark.yarn.historyServer.address="http://spark-history.local:18080/" \
        --conf spark.yarn.jar="hdfs:///user/spark/share/lib/spark-assembly.jar" \
        --conf spark.yarn.dist.files="hdfs:///user/colin.williams/warehouse-counter-0.0.1-SNAPSHOT-uber.jar" \
        hdfs:///user/colin.williams/warehouse-counter-0.0.1-SNAPSHOT-uber.jar \
        $1 $2

ps aux | grep java shows each submitter process as:

    /usr/java/latest/bin/java -cp ::/usr/lib/spark/conf:/usr/lib/spark/lib/spark-assembly.jar:/etc/hadoop/conf:/usr/lib/hadoop/client/*:/etc/hadoop/conf:/usr/lib/hadoop/lib/*:/usr/lib/hadoop/.//*:/usr/lib/hadoop/../hadoop-hdfs/./:/usr/lib/hadoop/../hadoop-hdfs/lib/*:/usr/lib/hadoop/../hadoop-hdfs/.//*:/usr/lib/hadoop/../hadoop-yarn/lib/*:/usr/lib/hadoop/../hadoop-yarn/.//*:/usr/lib/hadoop-mapreduce/lib/*:/usr/lib/hadoop-mapreduce/.//*:/usr/lib/spark/lib/scala-library.jar:/usr/lib/spark/lib/scala-compiler.jar:/usr/lib/spark/lib/jline.jar -XX:MaxPermSize=128m -Xms1024m -Xmx1024m org.apache.spark.deploy.SparkSubmit --class sap.whcounter.WarehouseCounter --master yarn-cluster --num-executors 1 --driver-memory 1024m --executor-memory 1024m --executor-cores 4 --queue hlab --conf spark.yarn.submit.waitAppCompletion=false --conf spark.app.name=wh_reader_sp --conf spark.streaming.receiver.maxRate=1000 --conf spark.streaming.concurrentJobs=2 --conf spark.eventLog.dir=hdfs:///user/spark/applicationHistory --conf spark.eventLog.enabled=true --conf spark.eventLog.overwrite=true --conf spark.yarn.historyServer.address=http://spark-history.local:18080/ --conf spark.yarn.jar=hdfs:///user/spark/share/lib/spark-assembly.jar --conf spark.yarn.dist.files=hdfs:///user/colin.williams/warehouse-counter-0.0.1-SNAPSHOT-uber.jar hdfs:///user/colin.williams/warehouse-counter-0.0.1-SNAPSHOT-uber.jar hdfs:///wh/2015/04/19/*

free -m:

                 total       used       free     shared    buffers
    Mem:          7873        992       6881          0         62
    -/+ buffers/cache:        500       7373
    Swap:        14947        574      14373

hs_err_pid7433.log:

    # There is insufficient memory for the Java Runtime Environment to continue.
    # Native memory allocation (malloc) failed to allocate 716177408 bytes for committing reserved memory.
    # Possible reasons:
    #   The system is out of physical RAM or swap space
    #   In 32 bit mode, the process size limit was hit
    # Possible solutions:
    #   Reduce memory load on the system
    #   Increase physical memory or swap space
    #   Check if swap backing store is full
    #   Use 64 bit Java on a 64 bit OS
    #   Decrease Java heap size (-Xmx/-Xms)
    #   Decrease number of Java threads
    #   Decrease Java thread stack sizes (-Xss)
    #   Set larger code cache with -XX:ReservedCodeCacheSize=
    # This output file may be truncated or incomplete.
    #
    #  Out of Memory Error (os_linux.cpp:2747), pid=7357, tid=140414250673920
    #
    # JRE version:  (7.0_60-b19) (build )
    # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.60-b09 mixed mode linux-amd64 compressed oops)
    # Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again

    VM Arguments:
    jvm_args: -XX:MaxPermSize=128m -Xms1024m -Xmx1024m
    java_command: org.apache.spark.deploy.SparkSubmit --class sap.whcounter.WarehouseCounter --master yarn-cluster --num-executors 1 --driver-memory 1024m --executor-memory 1024m --executor-cores 4 --queue hlab --conf spark.yarn.submit.waitAppCompletion=false --conf spark.app.name=wh_reader_sp --conf spark.streaming.receiver.maxRate=1000 --conf spark.streaming.concurrentJobs=2 --conf spark.eventLog.dir=hdfs:///user/spark/applicationHistory --conf spark.eventLog.enabled=true --conf spark.eventLog.overwrite=true --conf spark.yarn.historyServer.address=http://spark-history.local:18080/ --conf spark.yarn.dist.files=hdfs:///user/colin.williams/warehouse-counter-0.0.1-SNAPSHOT-uber.jar hdfs:///user/colin.williams/warehouse-counter-0.0.1-SNAPSHOT-uber.jar hdfs://wh/2015/04/10/* 2015-04-10T00:00:00+00:00
    Launcher Type: SUN_STANDARD

    Environment Variables:
    JAVA_HOME=/usr/java/latest
    CLASSPATH=::/usr/lib/spark/conf:/usr/lib/spark/lib/spark-assembly.jar:/etc/hadoop/conf:/usr/lib/hadoop/client/*:/etc/hadoop/conf:/usr/lib/hadoop/lib/*:/usr/lib/hadoop/.//*:/usr/lib/hadoop/../hadoop-hdfs/./:/usr/lib/hadoop/../hadoop-hdfs/lib/*:/usr/lib/hadoop/../hadoop-hdfs/.//*:/usr/lib/hadoop/../hadoop-yarn/lib/*:/usr/lib/hadoop/../hadoop-yarn/.//*:/usr/lib/hadoop-mapreduce/lib/*:/usr/lib/hadoop-mapreduce/.//*:/usr/lib/spark/lib/scala-library.jar:/usr/lib/spark/lib/scala-compiler.jar:/usr/lib/spark/lib/jline.jar
    PATH=/home/colin.williams/bin:/home/colin.williams/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/usr/local/rvm/bin
    SHELL=/bin/bash
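To put the numbers above side by side: each launcher JVM starts with -Xms1024m -Xmx1024m -XX:MaxPermSize=128m, and the failed 716177408-byte (~683 MB) malloc looks like such a JVM trying to commit part of that heap. A rough, back-of-envelope sketch of the arithmetic (assuming several launcher JVMs are alive at once; the figures are taken from the output above, before native/thread-stack overhead):

```shell
# Per-JVM footprint of one spark-submit launcher, from the VM arguments above.
heap_mb=1024       # -Xms1024m / -Xmx1024m
permgen_mb=128     # -XX:MaxPermSize=128m
per_jvm_mb=$((heap_mb + permgen_mb))

free_mb=6881       # "free" column of free -m above
max_jvms=$((free_mb / per_jvm_mb))

echo "per-JVM footprint: ${per_jvm_mb} MB"
echo "launcher JVMs that fit in ${free_mb} MB free: ~${max_jvms}"
```

By this estimate only a handful of launcher JVMs fit in free memory at once, well below the 30-60 being launched.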