Dear Sandy,

Many thanks for your reply.
I am going to respond to your points in reverse order, if you don't mind, as my second question is the more pressing issue for now.

> In the situation where you give more memory, but less memory overhead, and
> the job completes less quickly, have you checked to see whether YARN is
> killing any containers? It could be that the job completes more slowly
> because, without the memory overhead, YARN kills containers while it's
> running. So it needs to run some tasks multiple times.

I sincerely apologize if the way I structured my post was confusing. My second question was about the MEMORY_TOTAL of the JVM that YARN creates, and why different settings with different MEMORY_TOTAL values (assuming I calculated them correctly) both lead to a successful run of the same job. To answer your reply above: both runs took about the same time, and YARN did kill containers in both cases, but that was not my question. Of my four cases, the first and second are job failures, where the second case ran longer before failing (leading to my first question), and the third and fourth are jobs that completed successfully (leading to my second question).

*Second Question*

My concern was not that the job completes less quickly; rather, I did not understand why setting the memoryOverhead configuration allows a lower MEMORY_TOTAL for a successful run. I apologize beforehand if I misunderstood your message.

*/bin/spark-submit --class <class name> --master yarn-cluster --driver-memory 11g --executor-memory 1g --num-executors 3 --executor-cores 1 --jars <jar file>*

If I leave the memory overhead settings at their defaults, as above, I have to use a driver memory greater than 10g for my job to run successfully.
*spark.driver.memory + spark.yarn.driver.memoryOverhead = the memory that YARN will allocate for the JVM*
*= 11g + (driverMemory * 0.07, with minimum of 384m)*
*= 11g + 1.154g*
*= 12.154g*

From the formula above, it appears that I require a MEMORY_TOTAL of 12.154g for my job. However, I need less MEMORY_TOTAL when I set the memoryOverhead configuration explicitly, as below:

*/bin/spark-submit --class <class name> --master yarn-cluster --driver-memory 2g --executor-memory 1g --conf spark.yarn.executor.memoryOverhead=1024 --conf spark.yarn.driver.memoryOverhead=1024 --num-executors 3 --executor-cores 1 --jars <jar file>*

*spark.driver.memory + spark.yarn.driver.memoryOverhead = the memory that YARN will allocate for the JVM*
*= 2g + 1024m (command line configuration)*
*= 3g*

I updated the second formula in the thread before I emailed you, but I noticed that your reply quoted the older version. In this case the two memory overheads are quite close to each other, yet just by setting *spark.yarn.executor.memoryOverhead=1024 --conf spark.yarn.driver.memoryOverhead=1024*, my job can complete with a MEMORY_TOTAL of only 3g instead of the 12.154g in the case above, which was my source of confusion. Am I making a mistake somewhere? Or is the memoryOverhead configuration setting doing something behind the scenes? I just tested this again, and I always require a higher driver memory whenever I run it as in the first case.

*First Question*

> For your first question, you would need to look in the logs and provide
> additional information about why your job is failing. The SparkContext
> shutting down could happen for a variety of reasons.

My first question was about debugging failed Spark jobs, as I always seem to receive a different error when I run the Spark job with slightly different settings.
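To make sure we are talking about the same arithmetic, here it is as a small Python sketch (illustrative only: the 0.07 factor and 384m floor are taken from the formula above, and I am aware the default factor has differed between Spark releases; note also that strictly 11g * 0.07 is about 0.77g, i.e. roughly 11.77g total, rather than the 12.154g I wrote):

```python
# Sketch of the container-size formula quoted in this thread. The 0.07
# factor and 384 MB floor are the defaults as stated above; check your
# Spark version's docs for the factor it actually uses.

def default_overhead_mb(memory_mb, factor=0.07, floor_mb=384):
    """Overhead YARN adds when spark.yarn.*.memoryOverhead is not set."""
    return max(memory_mb * factor, floor_mb)

def container_size_mb(memory_mb, overhead_mb=None):
    """MEMORY_TOTAL = memory + overhead (an explicit overhead wins)."""
    if overhead_mb is None:
        overhead_mb = default_overhead_mb(memory_mb)
    return memory_mb + overhead_mb

# Third case: driver-memory 11g with the default overhead
print(container_size_mb(11 * 1024))  # about 12052 MB (~11.77g)

# Fourth case: driver-memory 2g with an explicit 1024 MB overhead
print(container_size_mb(2 * 1024, overhead_mb=1024))  # 3072 MB (3g)
```

Either way, the gap between the two totals is what I cannot account for.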
I was able to tune the Spark job so that it runs successfully by increasing memory and memory overhead, but I hope you can enlighten me on the nuances of and differences between the error logs, because they all look like a lack of memory to me. I am sorry that there was no log file in my original message. I have re-run the jobs with the same settings that led to the failed runs, and I will restate my first question with the error logs and their respective diagnostics. I ran each configuration around 10 times, and the error logs below are the most consistent.

If I run my job with */bin/spark-submit --class <class name> --master yarn-cluster --driver-memory 7g --executor-memory 1g --num-executors 3 --executor-cores 1 --jars <jar file>*, it will give either Error Log 1 or Error Log 2.

If I run my job with */bin/spark-submit --class <class name> --master yarn-cluster --driver-memory 7g --executor-memory 3g --num-executors 3 --executor-cores 1 --jars <jar file>*, it will give Error Log 3.

Error Logs 2 and 3 look similar, with the only difference being the diagnostics. Is there a subtle difference between the errors being thrown (I know that increasing memory and memory overhead solves the issue)? Why does increasing the executor memory give a different type of error?

Thanks again for taking the time to answer my questions; I truly appreciate it. I hope you won't mind the long email, as I have tried to provide as much information as I can. I want to understand Spark properly, which is why I am asking these questions, especially the second one above.
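In case it helps frame the question, here is how I have been reading the diagnostics below (a sketch only; the exit-code meanings, the 1024 MB allocation increment, and the 2.1 vmem-pmem ratio are the standard YARN defaults as I understand them, and my cluster's settings may differ):

```python
import math

# Rough interpretation of the container exit codes in the logs below.
# -103/-104 match YARN's ContainerExitStatus values for exceeding the
# virtual/physical memory limit; 143 = 128 + 15, i.e. SIGTERM.
EXIT_CODES = {
    -103: "killed: exceeded virtual memory limit",
    -104: "killed: exceeded physical memory limit",
    143: "killed by SIGTERM, e.g. 'Container killed on request'",
}

def describe_exit(code):
    return EXIT_CODES.get(code, "unknown exit code %d" % code)

def container_allocation_mb(heap_mb, overhead_mb, increment_mb=1024):
    """YARN rounds each request up to a multiple of the scheduler's
    minimum allocation (yarn.scheduler.minimum-allocation-mb, 1024 by
    default in stock Hadoop)."""
    return math.ceil((heap_mb + overhead_mb) / increment_mb) * increment_mb

def vmem_limit_mb(pmem_mb, ratio=2.1):
    """Virtual-memory limit = physical allocation * the NodeManager's
    yarn.nodemanager.vmem-pmem-ratio (2.1 in stock Hadoop; the '8.0 GB of
    8 GB virtual memory' line in Error Log 2 suggests my cluster overrides
    the ratio to about 1.0)."""
    return pmem_mb * ratio

# Error Log 2's AM ran with -Xmx7168m (driver-memory 7g). With a default
# overhead of max(7168 * 0.07, 384) ~= 502 MB, the request rounds up to:
print(container_allocation_mb(7168, 502))  # 8192 -- the "8 GB" in the log

print(describe_exit(-103))  # the AM attempt in Error Log 2 exited with -103
```

That reading is consistent with Error Log 2's container being killed on the virtual-memory check, but it does not explain to me why the errors differ between the two commands above.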
*Error Log 1*

org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskCompletion$14.apply(DAGScheduler.scala:1084)
org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskCompletion$14.apply(DAGScheduler.scala:1083)
scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
org.apache.spark.scheduler.DAGScheduler.handleTaskCompletion(DAGScheduler.scala:1083)
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1447)
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1411)
org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)

org.apache.spark.SparkException: Job aborted due to stage failure: Task serialization failed: java.lang.IllegalStateException: Cannot call methods on a stopped SparkContext
org.apache.spark.SparkContext.org$apache$spark$SparkContext$$assertNotStopped(SparkContext.scala:103)
org.apache.spark.SparkContext.broadcast(SparkContext.scala:1282)
org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitMissingTasks(DAGScheduler.scala:874)
org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskCompletion$14$$anonfun$apply$1.apply$mcVI$sp(DAGScheduler.scala:1088)
org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskCompletion$14$$anonfun$apply$1.apply(DAGScheduler.scala:1084)
org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskCompletion$14$$anonfun$apply$1.apply(DAGScheduler.scala:1084)
scala.Option.foreach(Option.scala:236)
org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskCompletion$14.apply(DAGScheduler.scala:1084)
org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskCompletion$14.apply(DAGScheduler.scala:1083)
scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
org.apache.spark.scheduler.DAGScheduler.handleTaskCompletion(DAGScheduler.scala:1083)
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1447)
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1411)
org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1266)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1257)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1256)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1256)
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitMissingTasks(DAGScheduler.scala:884)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskCompletion$14$$anonfun$apply$1.apply$mcVI$sp(DAGScheduler.scala:1088)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskCompletion$14$$anonfun$apply$1.apply(DAGScheduler.scala:1084)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskCompletion$14$$anonfun$apply$1.apply(DAGScheduler.scala:1084)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskCompletion$14.apply(DAGScheduler.scala:1084)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskCompletion$14.apply(DAGScheduler.scala:1083)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.handleTaskCompletion(DAGScheduler.scala:1083)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1447)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1411)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)

Diagnostics: User class threw exception: org.apache.spark.SparkException: Job aborted due to stage failure: Task serialization failed: java.lang.IllegalStateException: Cannot call methods on a stopped SparkContext
org.apache.spark.SparkContext.org$apache$spark$SparkContext$$assertNotStopped(SparkContext.scala:103)
org.apache.spark.SparkContext.broadcast(SparkContext.scala:1282)
org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitMissingTasks(DAGScheduler.scala:874)
org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskCompletion$14$$anonfun$apply$1.apply$mcVI$sp(DAGScheduler.scala:1088)
org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskCompletion$14$$anonfun$apply$1.apply(DAGScheduler.scala:1084)
org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskCompletion$14$$anonfun$apply$1.apply(DAGScheduler.scala:1084)
scala.Option.foreach(Option.scala:236)
org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskCompletion$14.apply(DAGScheduler.scala:1084)
org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskCompletion$14.apply(DAGScheduler.scala:1083)
scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
org.apache.spark.scheduler.DAGScheduler.handleTaskCompletion(DAGScheduler.scala:1083)
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1447)
org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1411)
org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)

*Error Log 2*

15/09/01 01:39:59 ERROR yarn.ApplicationMaster: RECEIVED SIGNAL 15: SIGTERM
15/09/01 01:40:00 ERROR yarn.ApplicationMaster: User class threw exception: org.apache.spark.SparkException: Job cancelled because SparkContext was shut down
org.apache.spark.SparkException: Job cancelled because SparkContext was shut down
at org.apache.spark.scheduler.DAGScheduler$$anonfun$cleanUpAfterSchedulerStop$1.apply(DAGScheduler.scala:736)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$cleanUpAfterSchedulerStop$1.apply(DAGScheduler.scala:735)
at scala.collection.mutable.HashSet.foreach(HashSet.scala:79)
at org.apache.spark.scheduler.DAGScheduler.cleanUpAfterSchedulerStop(DAGScheduler.scala:735)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onStop(DAGScheduler.scala:1468)
at org.apache.spark.util.EventLoop.stop(EventLoop.scala:84)
at org.apache.spark.scheduler.DAGScheduler.stop(DAGScheduler.scala:1403)
at org.apache.spark.SparkContext.stop(SparkContext.scala:1642)
at org.apache.spark.SparkContext$$anonfun$3.apply$mcV$sp(SparkContext.scala:559)
at org.apache.spark.util.SparkShutdownHook.run(Utils.scala:2292)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(Utils.scala:2262)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(Utils.scala:2262)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(Utils.scala:2262)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1772)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply$mcV$sp(Utils.scala:2262)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(Utils.scala:2262)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(Utils.scala:2262)
at scala.util.Try$.apply(Try.scala:161)
at org.apache.spark.util.SparkShutdownHookManager.runAll(Utils.scala:2262)
at org.apache.spark.util.SparkShutdownHookManager$$anon$6.run(Utils.scala:2244)
at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)

Diagnostics: Application application_1440667888904_0079 failed 2 times due to AM Container for
appattempt_1440667888904_0079_000002 exited with exitCode: -103
For more detailed output, check application tracking page: http://cemas-1:8088/cluster/app/application_1440667888904_0079 Then, click on links to logs of each attempt.
Diagnostics: Container [pid=14317,containerID=container_1440667888904_0079_02_000001] is running beyond virtual memory limits. Current usage: 344.4 MB of 8 GB physical memory used; 8.0 GB of 8 GB virtual memory used. Killing container.
Dump of the process-tree for container_1440667888904_0079_02_000001 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
|- 14322 14317 14317 14317 (java) 1398 65 8607490048 87877 /usr/lib/jvm/java-7-openjdk-amd64/bin/java -server -Xmx7168m -Djava.io.tmpdir=/opt/hadoop/var/nm-local-dir/usercache/ts444/appcache/application_1440667888904_0079/container_1440667888904_0079_02_000001/tmp -Dspark.driver.memory=7g -Dspark.executor.memory=1g -Dspark.master=yarn-cluster -Dspark.app.name=CO880.testing.algorithm_v1.SeCo -Dspark.yarn.app.container.log.dir=/opt/hadoop/var/userlogs/application_1440667888904_0079/container_1440667888904_0079_02_000001 org.apache.spark.deploy.yarn.ApplicationMaster --class CO880.testing.algorithm_v1.SeCo --jar file:/home/cuc/ts444/SeCo1.jar --arg mushroom.arff --arg mushroomtest.arff --executor-memory 1024m --executor-cores 1 --num-executors 10
|- 14317 14315 14317 14317 (bash) 0 0 12750848 302 /bin/bash -c /usr/lib/jvm/java-7-openjdk-amd64/bin/java -server -Xmx7168m -Djava.io.tmpdir=/opt/hadoop/var/nm-local-dir/usercache/ts444/appcache/application_1440667888904_0079/container_1440667888904_0079_02_000001/tmp '-Dspark.driver.memory=7g' '-Dspark.executor.memory=1g' '-Dspark.master=yarn-cluster' '-Dspark.app.name=CO880.testing.algorithm_v1.SeCo' -Dspark.yarn.app.container.log.dir=/opt/hadoop/var/userlogs/application_1440667888904_0079/container_1440667888904_0079_02_000001 org.apache.spark.deploy.yarn.ApplicationMaster --class 'CO880.testing.algorithm_v1.SeCo' --jar file:/home/cuc/ts444/SeCo1.jar --arg 'mushroom.arff' --arg 'mushroomtest.arff' --executor-memory 1024m --executor-cores 1 --num-executors 10 1> /opt/hadoop/var/userlogs/application_1440667888904_0079/container_1440667888904_0079_02_000001/stdout 2> /opt/hadoop/var/userlogs/application_1440667888904_0079/container_1440667888904_0079_02_000001/stderr
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
Failing this attempt. Failing the application.

*Error Log 3*

15/09/01 01:42:19 INFO cluster.YarnClusterSchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8
15/09/01 01:42:19 INFO cluster.YarnClusterScheduler: YarnClusterScheduler.postStartHook done
15/09/01 01:42:25 ERROR yarn.ApplicationMaster: RECEIVED SIGNAL 15: SIGTERM
15/09/01 01:42:25 ERROR yarn.ApplicationMaster: User class threw exception: org.apache.spark.SparkException: Job cancelled because SparkContext was shut down
org.apache.spark.SparkException: Job cancelled because SparkContext was shut down
at org.apache.spark.scheduler.DAGScheduler$$anonfun$cleanUpAfterSchedulerStop$1.apply(DAGScheduler.scala:736)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$cleanUpAfterSchedulerStop$1.apply(DAGScheduler.scala:735)
at scala.collection.mutable.HashSet.foreach(HashSet.scala:79)
at org.apache.spark.scheduler.DAGScheduler.cleanUpAfterSchedulerStop(DAGScheduler.scala:735)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onStop(DAGScheduler.scala:1468)
at org.apache.spark.util.EventLoop.stop(EventLoop.scala:84)
at org.apache.spark.scheduler.DAGScheduler.stop(DAGScheduler.scala:1403)
at org.apache.spark.SparkContext.stop(SparkContext.scala:1642)
at org.apache.spark.SparkContext$$anonfun$3.apply$mcV$sp(SparkContext.scala:559)
at org.apache.spark.util.SparkShutdownHook.run(Utils.scala:2292) at
org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(Utils.scala:2262)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(Utils.scala:2262)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(Utils.scala:2262)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1772)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply$mcV$sp(Utils.scala:2262)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(Utils.scala:2262)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(Utils.scala:2262)
at scala.util.Try$.apply(Try.scala:161)
at org.apache.spark.util.SparkShutdownHookManager.runAll(Utils.scala:2262)
at org.apache.spark.util.SparkShutdownHookManager$$anon$6.run(Utils.scala:2244)
at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)

Diagnostics: User class threw exception: org.apache.spark.SparkException: Job cancelled because SparkContext was shut down

On Mon, Aug 31, 2015 at 10:03 PM, Sandy Ryza <sandy.r...@cloudera.com> wrote:

> Hi Timothy,
>
> For your first question, you would need to look in the logs and provide
> additional information about why your job is failing. The SparkContext
> shutting down could happen for a variety of reasons.
>
> In the situation where you give more memory, but less memory overhead, and
> the job completes less quickly, have you checked to see whether YARN is
> killing any containers? It could be that the job completes more slowly
> because, without the memory overhead, YARN kills containers while it's
> running. So it needs to run some tasks multiple times.
>
> -Sandy
>
> On Sat, Aug 29, 2015 at 6:57 PM, timothy22000 <timothy22...@gmail.com>
> wrote:
>
>> I am doing some memory tuning on my Spark job on YARN and I notice
>> different settings would give different results and affect the outcome of
>> the Spark job run. However, I am confused and do not understand
>> completely why it happens and would appreciate if someone can provide me
>> with some guidance and explanation.
>>
>> I will provide some background information and describe the cases that I
>> have experienced and post my questions after them below.
>>
>> *My environment setting were as below:*
>>
>>  - Memory 20G, 20 VCores per node (3 nodes in total)
>>  - Hadoop 2.6.0
>>  - Spark 1.4.0
>>
>> My code recursively filters an RDD to make it smaller (removing examples
>> as part of an algorithm), then does mapToPair and collect to gather the
>> results and save them within a list.
>>
>> First Case
>>
>> /`/bin/spark-submit --class <class name> --master yarn-cluster
>> --driver-memory 7g --executor-memory 1g --num-executors 3
>> --executor-cores 1 --jars <jar file>`/
>>
>> If I run my program with any driver memory less than 11g, I will get the
>> error below which is the SparkContext being stopped or a similar error
>> which is a method being called on a stopped SparkContext. From what I
>> have gathered, this is related to memory not being enough.
>>
>> <http://apache-spark-user-list.1001560.n3.nabble.com/file/n24507/EKxQD.png>
>>
>> Second Case
>>
>> /`/bin/spark-submit --class <class name> --master yarn-cluster
>> --driver-memory 7g --executor-memory 3g --num-executors 3
>> --executor-cores 1 --jars <jar file>`/
>>
>> If I run the program with the same driver memory but higher executor
>> memory, the job runs longer (about 3-4 minutes) than the first case and
>> then it will encounter a different error from earlier which is a
>> Container requesting/using more memory than allowed and is being killed
>> because of that. Although I find it weird since the executor memory is
>> increased and this error occurs instead of the error in the first case.
>>
>> <http://apache-spark-user-list.1001560.n3.nabble.com/file/n24507/tr24f.png>
>>
>> Third Case
>>
>> /`/bin/spark-submit --class <class name> --master yarn-cluster
>> --driver-memory 11g --executor-memory 1g --num-executors 3
>> --executor-cores 1 --jars <jar file>`/
>>
>> Any setting with driver memory greater than 10g will lead to the job
>> being able to run successfully.
>>
>> Fourth Case
>>
>> /`/bin/spark-submit --class <class name> --master yarn-cluster
>> --driver-memory 2g --executor-memory 1g --conf
>> spark.yarn.executor.memoryOverhead=1024 --conf
>> spark.yarn.driver.memoryOverhead=1024 --num-executors 3 --executor-cores 1
>> --jars <jar file>`/
>>
>> The job will run successfully with this setting (driver memory 2g and
>> executor memory 1g but increasing the driver memory overhead (1g) and the
>> executor memory overhead (1g).
>>
>> Questions
>>
>> 1. Why is a different error thrown and the job runs longer (for the
>> second case) between the first and second case with only the executor
>> memory being increased? Are the two errors linked in some way?
>>
>> 2. Both the third and fourth case succeeds and I understand that it is
>> because I am giving more memory which solves the memory problems.
>> However, in the third case,
>>
>> /spark.driver.memory + spark.yarn.driver.memoryOverhead = the memory that
>> YARN will create a JVM
>> = 11g + (driverMemory * 0.07, with minimum of 384m)
>> = 11g + 1.154g
>> = 12.154g/
>>
>> So, from the formula, I can see that my job requires MEMORY_TOTAL of
>> around 12.154g to run successfully which explains why I need more than
>> 10g for the driver memory setting.
>>
>> But for the fourth case,
>>
>> /spark.driver.memory + spark.yarn.driver.memoryOverhead = the memory that
>> YARN will create a JVM
>> = 2 + (driverMemory * 0.07, with minimum of 384m)
>> = 2g + 0.524g
>> = 2.524g/
>>
>> It seems that just by increasing the memory overhead by a small amount of
>> 1024(1g) it leads to the successful run of the job with driver memory of
>> only 2g and the MEMORY_TOTAL is only 2.524g! Whereas without the overhead
>> configuration, driver memory less than 11g fails but it doesn't make
>> sense from the formula which is why I am confused.
>>
>> Why increasing the memory overhead (for both driver and executor) allows
>> my job to complete successfully with a lower MEMORY_TOTAL (12.154g vs
>> 2.524g)? Is there some other internal things at work here that I am
>> missing?
>>
>> I would really appreciate any helped offered as it would really help with
>> my understanding of Spark. Thanks in advance.
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Effects-of-Driver-Memory-Executor-Memory-Driver-Memory-Overhead-and-Executor-Memory-Overhead-os-tp24507.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>> For additional commands, e-mail: user-h...@spark.apache.org
>>