Hi Timothy,

I think the driver memory in all your examples is higher than is
usually necessary, and the executor memory is quite low.
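
For example, on the 20G/20-core nodes you describe below, something in
this direction might be a better starting point (the numbers are purely
illustrative, not a tested recommendation):

`/bin/spark-submit --class <class name> --master yarn-cluster
--driver-memory 2g --executor-memory 6g --num-executors 6
--executor-cores 3 --conf spark.yarn.executor.memoryOverhead=1024
--jars <jar file>`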

I found this devops talk [1] from Spark Summit very useful for
understanding some of these configuration details.

[1] https://www.youtube.com/watch?v=l4ZYUfZuRbU

Cheers,
Sangeeth
On Aug 30, 2015 7:28 AM, "timothy22000" <timothy22...@gmail.com> wrote:

> I am doing some memory tuning on my Spark job on YARN, and I notice
> that different settings give different results and affect the outcome
> of the Spark job run. However, I am confused and do not completely
> understand why this happens, and would appreciate some guidance and
> explanation.
>
> I will provide some background information, describe the cases I have
> experienced, and then post my questions below.
>
> *My environment settings were as follows:*
>
>  - Memory 20G, 20 VCores per node (3 nodes in total)
>  - Hadoop 2.6.0
>  - Spark 1.4.0
>
> My code recursively filters an RDD to make it smaller (removing
> examples as part of an algorithm), then does mapToPair and collect to
> gather the results and save them in a list.
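>
> In simplified form, the pattern looks like this (written here as a
> Scala sketch, since my real job uses the Java API's mapToPair; the
> input and the stopping rule below are made up):
>
> `
> import org.apache.spark.{SparkConf, SparkContext}
> import scala.collection.mutable.ListBuffer
>
> object JobSketch {
>   def main(args: Array[String]): Unit = {
>     val sc = new SparkContext(new SparkConf().setAppName("sketch"))
>     // stand-in for the real input data
>     var rdd = sc.parallelize(1 to 1000000)
>     val results = ListBuffer[(Int, Int)]()
>     for (pass <- 1 to 10) {           // made-up stopping rule
>       rdd = rdd.filter(_ % 2 == 0)    // remove examples each pass
>       // map to pairs and pull everything back to the driver
>       results ++= rdd.map(x => (x, x * x)).collect()
>     }
>     sc.stop()
>   }
> }
> `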
>
> First Case
>
> `/bin/spark-submit --class <class name> --master yarn-cluster
> --driver-memory 7g --executor-memory 1g --num-executors 3
> --executor-cores 1 --jars <jar file>`
> If I run my program with any driver memory less than 11g, I get the
> error below, which is the SparkContext being stopped, or a similar
> error, which is a method being called on a stopped SparkContext. From
> what I have gathered, this is because there is not enough memory.
>
>
> <http://apache-spark-user-list.1001560.n3.nabble.com/file/n24507/EKxQD.png>
>
> Second Case
>
>
> `/bin/spark-submit --class <class name> --master yarn-cluster
> --driver-memory 7g --executor-memory 3g --num-executors 3
> --executor-cores 1 --jars <jar file>`
>
> If I run the program with the same driver memory but higher executor
> memory, the job runs longer (about 3-4 minutes) than in the first case,
> and then it encounters a different error: a container requesting/using
> more memory than allowed and being killed for that reason. I find this
> strange, since the executor memory was increased, and yet this error
> occurs instead of the one from the first case.
>
> <http://apache-spark-user-list.1001560.n3.nabble.com/file/n24507/tr24f.png>
>
> Third Case
>
>
> `/bin/spark-submit --class <class name> --master yarn-cluster
> --driver-memory 11g --executor-memory 1g --num-executors 3
> --executor-cores 1 --jars <jar file>`
>
> Any setting with driver memory greater than 10g allows the job to run
> successfully.
>
> Fourth Case
>
>
> `/bin/spark-submit --class <class name> --master yarn-cluster
> --driver-memory 2g --executor-memory 1g --conf
> spark.yarn.executor.memoryOverhead=1024 --conf
> spark.yarn.driver.memoryOverhead=1024 --num-executors 3
> --executor-cores 1 --jars <jar file>`
> The job runs successfully with this setting: driver memory 2g and
> executor memory 1g, but with the driver memory overhead and the
> executor memory overhead both increased to 1g.
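>
> As far as I understand, the same settings could equivalently go into
> conf/spark-defaults.conf instead of the command line:
>
> `
> spark.driver.memory                 2g
> spark.executor.memory               1g
> spark.yarn.driver.memoryOverhead    1024
> spark.yarn.executor.memoryOverhead  1024
> spark.executor.instances            3
> spark.executor.cores                1
> `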
>
> Questions
>
>
>  1. Why does the second case throw a different error and run longer
> than the first, when only the executor memory was increased? Are the
> two errors linked in some way?
>
>  2. Both the third and the fourth case succeed, and I understand that
> this is because I am giving the job more memory, which solves the
> memory problems. However, in the third case,
>
> spark.driver.memory + spark.yarn.driver.memoryOverhead = the memory
> that YARN requests for the driver's container
> = 11g + max(driverMemory * 0.07, 384m)
> = 11g + 0.77g
> = 11.77g
>
> So, from the formula, I can see that my job requires a MEMORY_TOTAL of
> around 11.77g to run successfully, which explains why I need more than
> 10g for the driver memory setting.
>
> But for the fourth case,
>
> spark.driver.memory + spark.yarn.driver.memoryOverhead = the memory
> that YARN requests for the driver's container
> = 2g + 1024m (the overhead is set explicitly here via
> spark.yarn.driver.memoryOverhead, overriding the default formula)
> = 3g
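>
> The same arithmetic in a quick Scala sketch (the 0.07 factor and the
> 384m floor come from the default formula quoted above; I have ignored
> any rounding YARN may additionally do up to its
> yarn.scheduler.minimum-allocation-mb):
>
> `
> // default overhead per the formula above: max(0.07 * heap, 384) MB
> def overheadMb(heapMb: Long): Long =
>   math.max((heapMb * 0.07).toLong, 384)
>
> // total container request = heap + overhead (explicit value if set)
> def totalMb(heapMb: Long, explicitMb: Option[Long] = None): Long =
>   heapMb + explicitMb.getOrElse(overheadMb(heapMb))
>
> println(totalMb(11 * 1024))             // third case: 12052 MB ~ 11.77g
> println(totalMb(2 * 1024, Some(1024)))  // fourth case: 3072 MB = 3g
> `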
>
> So just by setting the memory overhead to 1024m (1g), the job runs
> successfully with a driver memory of only 2g, and MEMORY_TOTAL is only
> 3g! Yet without the overhead configuration, any driver memory below 11g
> fails. That does not follow from the formula, which is why I am
> confused.
>
> Why does increasing the memory overhead (for both the driver and the
> executors) allow my job to complete successfully with a much lower
> MEMORY_TOTAL (11.77g vs 3g)? Is there something else at work internally
> that I am missing?
>
> I would really appreciate any help offered, as it would really help my
> understanding of Spark. Thanks in advance.
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Effects-of-Driver-Memory-Executor-Memory-Driver-Memory-Overhead-and-Executor-Memory-Overhead-os-tp24507.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
>
>
