Hi Timothy,

For your first question, you would need to look in the logs and provide
additional information about why your job is failing.  The SparkContext
shutting down could happen for a variety of reasons.

In the situation where you give more memory, but less memory overhead, and
the job completes less quickly, have you checked to see whether YARN is
killing any containers?  It could be that the job completes more slowly
because, without the memory overhead, YARN kills containers while it's
running.  So it needs to run some tasks multiple times.
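
If log aggregation is enabled on your cluster, one way to check is to pull
the application logs with something like
`yarn logs -applicationId <application id>` and look for messages about
containers being killed for running beyond memory limits.  The same kills
should also show up in the NodeManager logs and in the ResourceManager UI.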

-Sandy

On Sat, Aug 29, 2015 at 6:57 PM, timothy22000 <timothy22...@gmail.com>
wrote:

> I am doing some memory tuning on my Spark job on YARN, and I notice that
> different settings give different results and affect the outcome of the
> Spark job run. However, I am confused and do not completely understand why
> this happens, and I would appreciate it if someone could provide me with
> some guidance and explanation.
>
> I will provide some background information, describe the cases I have
> experienced, and then post my questions below.
>
> *My environment settings were as below:*
>
>  - Memory 20G, 20 VCores per node (3 nodes in total)
>  - Hadoop 2.6.0
>  - Spark 1.4.0
>
> My code recursively filters an RDD to make it smaller (removing examples
> as part of an algorithm), then does mapToPair and collect to gather the
> results and save them in a list.
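>
> As a very rough sketch (everything below is hypothetical: the element
> type, the pair function, and the shrinking rule are placeholders that only
> show the shape of the computation, not my actual code):
>
>     import java.util.ArrayList;
>     import java.util.Arrays;
>     import java.util.List;
>     import org.apache.spark.SparkConf;
>     import org.apache.spark.api.java.JavaPairRDD;
>     import org.apache.spark.api.java.JavaRDD;
>     import org.apache.spark.api.java.JavaSparkContext;
>     import scala.Tuple2;
>
>     public final class RecursiveFilterSketch {
>         public static void main(String[] args) {
>             // Hypothetical stand-in for the real input data.
>             JavaSparkContext sc =
>                 new JavaSparkContext(new SparkConf().setAppName("sketch"));
>             JavaRDD<Double> remaining =
>                 sc.parallelize(Arrays.asList(1.0, 2.0, 3.0, 4.0));
>
>             List<Tuple2<Double, Double>> results = new ArrayList<>();
>             double cutoff = 0.0;
>             while (remaining.count() > 0) {
>                 // collect() pulls every pair back into the driver JVM, so
>                 // the results list grows in driver memory each iteration.
>                 JavaPairRDD<Double, Double> pairs =
>                     remaining.mapToPair(x -> new Tuple2<>(x, x * x));
>                 results.addAll(pairs.collect());
>                 final double c = ++cutoff;                // hypothetical shrinking rule
>                 remaining = remaining.filter(x -> x > c); // RDD gets smaller each pass
>             }
>             sc.stop();
>         }
>     }
>
> Because everything that collect() returns lives on the driver, the
> accumulated results are one reason the driver memory setting can matter
> for a job shaped like this.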
>
> First Case
>
> `/bin/spark-submit --class <class name> --master yarn-cluster
> --driver-memory 7g --executor-memory 1g --num-executors 3
> --executor-cores 1 --jars <jar file>`
> If I run my program with any driver memory less than 11g, I get the error
> below, which is the SparkContext being stopped, or a similar error about a
> method being called on a stopped SparkContext. From what I have gathered,
> this is related to there not being enough memory.
>
>
> <http://apache-spark-user-list.1001560.n3.nabble.com/file/n24507/EKxQD.png>
>
> Second Case
>
>
> `/bin/spark-submit --class <class name> --master yarn-cluster
> --driver-memory 7g --executor-memory 3g --num-executors 3
> --executor-cores 1 --jars <jar file>`
>
> If I run the program with the same driver memory but higher executor
> memory, the job runs longer (about 3-4 minutes) than in the first case and
> then encounters a different error from before: a container requesting/using
> more memory than allowed and being killed because of that. I find this
> strange, since the executor memory is increased and yet this error occurs
> instead of the error from the first case.
>
> <http://apache-spark-user-list.1001560.n3.nabble.com/file/n24507/tr24f.png>
>
> Third Case
>
>
> `/bin/spark-submit --class <class name> --master yarn-cluster
> --driver-memory 11g --executor-memory 1g --num-executors 3
> --executor-cores 1 --jars <jar file>`
>
> Any setting with driver memory greater than 10g allows the job to run
> successfully.
>
> Fourth Case
>
>
> `/bin/spark-submit --class <class name> --master yarn-cluster
> --driver-memory 2g --executor-memory 1g --conf
> spark.yarn.executor.memoryOverhead=1024 --conf
> spark.yarn.driver.memoryOverhead=1024 --num-executors 3
> --executor-cores 1 --jars <jar file>`
> The job runs successfully with this setting (driver memory 2g and executor
> memory 1g, but with the driver memory overhead and the executor memory
> overhead each increased to 1g).
>
> Questions
>
>
>  1. Why is a different error thrown, and why does the job run longer, in
> the second case compared to the first, when only the executor memory is
> increased? Are the two errors linked in some way?
>
>  2. Both the third and fourth cases succeed, and I understand that it is
> because I am giving more memory, which solves the memory problems. However,
> in the third case,
>
> spark.driver.memory + spark.yarn.driver.memoryOverhead = the amount of
> memory YARN will use to create a JVM
> = 11g + (driverMemory * 0.07, with minimum of 384m)
> = 11g + 1.154g
> = 12.154g
>
> So, from the formula, I can see that my job requires a MEMORY_TOTAL of
> around 12.154g to run successfully, which explains why I need more than 10g
> for the driver memory setting.
>
> But for the fourth case,
>
> spark.driver.memory + spark.yarn.driver.memoryOverhead = the amount of
> memory YARN will use to create a JVM
> = 2g + (driverMemory * 0.07, with minimum of 384m)
> = 2g + 0.524g
> = 2.524g
>
> It seems that just increasing the memory overhead by a small amount of
> 1024 (1g) leads to a successful run of the job with a driver memory of only
> 2g, and the MEMORY_TOTAL is only 2.524g! Whereas without the overhead
> configuration, any driver memory less than 11g fails, but that doesn't make
> sense from the formula, which is why I am confused.
>
> Why does increasing the memory overhead (for both driver and executor)
> allow my job to complete successfully with a lower MEMORY_TOTAL (12.154g vs
> 2.524g)? Is there something else internal at work here that I am missing?
>
> I would really appreciate any help offered, as it would really help with
> my understanding of Spark. Thanks in advance.
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Effects-of-Driver-Memory-Executor-Memory-Driver-Memory-Overhead-and-Executor-Memory-Overhead-os-tp24507.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
>
>
