Hi Krishna,

Thanks for your reply. I will definitely take a look at it to understand
the configuration details.

Best Regards,
Tim

On Tue, Sep 1, 2015 at 6:17 PM, Krishna Sangeeth KS <
kskrishnasange...@gmail.com> wrote:

> Hi Timothy,
>
> I think the driver memory in all your examples is more than is usually
> necessary, and the executor memory is quite low.
>
> I found this DevOps talk [1] from Spark Summit to be super useful for
> understanding a few of these configuration details.
>
> [1] https://www.youtube.com/watch?v=l4ZYUfZuRbU
>
> Cheers,
> Sangeeth
> On Aug 30, 2015 7:28 AM, "timothy22000" <timothy22...@gmail.com> wrote:
>
>> I am doing some memory tuning on my Spark job on YARN, and I notice that
>> different settings give different results and affect the outcome of the
>> Spark job run. However, I am confused and do not completely understand why
>> this happens, and I would appreciate it if someone could provide me with
>> some guidance and an explanation.
>>
>> I will provide some background information, describe the cases I have
>> experienced, and post my questions after them below.
>>
>> *My environment settings were as follows:*
>>
>>  - Memory 20G, 20 VCores per node (3 nodes in total)
>>  - Hadoop 2.6.0
>>  - Spark 1.4.0
>>
>> My code recursively filters an RDD to make it smaller (removing examples as
>> part of an algorithm), then does mapToPair and collect to gather the results
>> and save them within a list.
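>>
>> Roughly, the shape of the code looks like the simplified sketch below (the
>> class name, the stopping threshold, and the filter predicate are just
>> placeholders, not my real logic):
>>
>> import java.util.List;
>> import org.apache.spark.SparkConf;
>> import org.apache.spark.api.java.JavaPairRDD;
>> import org.apache.spark.api.java.JavaRDD;
>> import org.apache.spark.api.java.JavaSparkContext;
>> import scala.Tuple2;
>>
>> public class RecursiveFilterSketch {
>>   public static void main(String[] args) {
>>     JavaSparkContext sc =
>>         new JavaSparkContext(new SparkConf().setAppName("recursive-filter-sketch"));
>>
>>     JavaRDD<String> examples = sc.textFile(args[0]);
>>
>>     // Repeatedly shrink the RDD; the real stopping rule and predicate are
>>     // part of the algorithm and are only placeholders here.
>>     int iterations = 0;
>>     while (examples.count() > 1000 && iterations++ < 10) {
>>       examples = examples.filter(line -> line.hashCode() % 2 == 0);
>>     }
>>
>>     // mapToPair + collect pulls the surviving records back to the driver.
>>     JavaPairRDD<String, Integer> pairs =
>>         examples.mapToPair(line -> new Tuple2<String, Integer>(line, line.length()));
>>     List<Tuple2<String, Integer>> results = pairs.collect();
>>
>>     System.out.println("Collected " + results.size() + " results on the driver");
>>     sc.stop();
>>   }
>> }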
>>
>> First Case
>>
>> `/bin/spark-submit --class <class name> --master yarn-cluster
>> --driver-memory 7g --executor-memory 1g --num-executors 3
>> --executor-cores 1 --jars <jar file>`
>>
>> If I run my program with any driver memory less than 11g, I get the error
>> below, which is the SparkContext being stopped, or a similar error where a
>> method is called on a stopped SparkContext. From what I have gathered, this
>> is related to there not being enough memory.
>>
>>
>> <http://apache-spark-user-list.1001560.n3.nabble.com/file/n24507/EKxQD.png>
>>
>> Second Case
>>
>> `/bin/spark-submit --class <class name> --master yarn-cluster
>> --driver-memory 7g --executor-memory 3g --num-executors 3
>> --executor-cores 1 --jars <jar file>`
>>
>> If I run the program with the same driver memory but higher executor memory,
>> the job runs longer (about 3-4 minutes) than in the first case and then hits
>> a different error from the earlier one: a container requesting/using more
>> memory than allowed and being killed because of that. I find this odd, since
>> the executor memory was increased, yet this error occurs instead of the
>> error from the first case.
>>
>> <http://apache-spark-user-list.1001560.n3.nabble.com/file/n24507/tr24f.png>
>>
>> Third Case
>>
>> `/bin/spark-submit --class <class name> --master yarn-cluster
>> --driver-memory 11g --executor-memory 1g --num-executors 3
>> --executor-cores 1 --jars <jar file>`
>>
>> Any setting with driver memory greater than 10g allows the job to run
>> successfully.
>>
>> Fourth Case
>>
>> `/bin/spark-submit --class <class name> --master yarn-cluster
>> --driver-memory 2g --executor-memory 1g --conf
>> spark.yarn.executor.memoryOverhead=1024 --conf
>> spark.yarn.driver.memoryOverhead=1024 --num-executors 3 --executor-cores 1
>> --jars <jar file>`
>>
>> The job runs successfully with this setting (driver memory 2g and executor
>> memory 1g, but with the driver memory overhead and the executor memory
>> overhead both increased to 1g).
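>>
>> (A few lines like the following hypothetical snippet can confirm which
>> memory-related settings the application actually picked up at runtime; the
>> class name is made up, and unset keys simply mean Spark's own defaults
>> apply.)
>>
>> import org.apache.spark.SparkConf;
>> import org.apache.spark.api.java.JavaSparkContext;
>>
>> public class PrintMemorySettings {
>>   public static void main(String[] args) {
>>     JavaSparkContext sc =
>>         new JavaSparkContext(new SparkConf().setAppName("print-memory-settings"));
>>     SparkConf conf = sc.getConf();
>>     String[] keys = {
>>         "spark.driver.memory",
>>         "spark.executor.memory",
>>         "spark.yarn.driver.memoryOverhead",
>>         "spark.yarn.executor.memoryOverhead"
>>     };
>>     for (String key : keys) {
>>       // Use a default value so unset keys do not throw an exception.
>>       System.out.println(key + " = " + conf.get(key, "<not set>"));
>>     }
>>     sc.stop();
>>   }
>> }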
>>
>> Questions
>>
>>
>>  1. Why is a different error thrown, and why does the job run longer, in the
>> second case compared to the first, when only the executor memory is
>> increased? Are the two errors linked in some way?
>>
>>  2. Both the third and fourth cases succeed, and I understand that this is
>> because I am giving more memory, which solves the memory problems. However,
>> in the third case,
>>
>> spark.driver.memory + spark.yarn.driver.memoryOverhead
>>   = the memory in which YARN will create a JVM
>>   = 11g + (driverMemory * 0.07, with minimum of 384m)
>>   = 11g + 1.154g
>>   = 12.154g
>>
>> So, from the formula, I can see that my job requires a MEMORY_TOTAL of
>> around 12.154g to run successfully, which explains why I need more than 10g
>> for the driver memory setting.
>>
>> But for the fourth case,
>>
>> spark.driver.memory + spark.yarn.driver.memoryOverhead
>>   = the memory in which YARN will create a JVM
>>   = 2g + (driverMemory * 0.07, with minimum of 384m)
>>   = 2g + 0.524g
>>   = 2.524g
>>
>> It seems that just by increasing the memory overhead by a small amount of
>> 1024m (1g), the job runs successfully with a driver memory of only 2g, and
>> the MEMORY_TOTAL is only 2.524g! Whereas without the overhead configuration,
>> any driver memory less than 11g fails, which doesn't make sense from the
>> formula, and that is why I am confused.
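>>
>> To make the arithmetic concrete, here is a tiny standalone sketch of how I
>> am reading the formula, taking "with minimum of 384m" to mean
>> max(driverMemory * 0.07, 384m); the exact factor and rounding may well
>> differ between Spark versions, so the printed totals are only rough.
>>
>> // Rough helper for the MEMORY_TOTAL arithmetic above; not YARN's actual
>> // allocation code. An explicit spark.yarn.*.memoryOverhead setting would
>> // replace the default overhead computed here.
>> public class MemoryTotalSketch {
>>
>>   static long totalWithDefaultOverheadMb(long requestedMb) {
>>     long overheadMb = Math.max((long) (requestedMb * 0.07), 384L);
>>     return requestedMb + overheadMb;
>>   }
>>
>>   public static void main(String[] args) {
>>     // Third case: 11g driver memory with the default overhead.
>>     System.out.println("11g driver -> " + totalWithDefaultOverheadMb(11 * 1024) + " MB total");
>>     // Fourth case: 2g driver memory with the default overhead (1024m was
>>     // set explicitly in my actual run, which replaces the default).
>>     System.out.println("2g driver  -> " + totalWithDefaultOverheadMb(2 * 1024) + " MB total");
>>   }
>> }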
>>
>> Why does increasing the memory overhead (for both the driver and the
>> executor) allow my job to complete successfully with a lower MEMORY_TOTAL
>> (12.154g vs 2.524g)? Is there something else internal at work here that I
>> am missing?
>>
>> I would really appreciate any help offered, as it would greatly improve my
>> understanding of Spark. Thanks in advance.
>>
>>
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Effects-of-Driver-Memory-Executor-Memory-Driver-Memory-Overhead-and-Executor-Memory-Overhead-os-tp24507.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>> For additional commands, e-mail: user-h...@spark.apache.org
>>
>>
