I have found the documentation rather poor at helping me understand the interplay among the following properties in Spark, and even more so how to set them. So this post is sent in the hope of some discussion and "enlightenment" on the topic.
Let me start by asking whether I have understood the following correctly:

- spark.driver.cores: how many cores the driver program should occupy
- spark.cores.max: how many cores my app will claim for computations
- spark.executor.cores and spark.task.cpus: how the spark.cores.max cores are allocated per JVM (executor) and per task (Java thread?), i.e.:
  + spark.executor.cores: each JVM instance (executor) should use that many cores
  + spark.task.cpus: each task should occupy at most this number of cores

If so far so good, then...

q1: Is spark.cores.max inclusive of spark.driver.cores or not?

q2: How should one decide statically, a priori, how to distribute spark.cores.max among JVMs and tasks?

q3: Since the cores-per-worker setting restricts how many cores can at most be available per executor, and since an executor cannot span workers, what is the rationale behind an application claiming cores (spark.cores.max) as opposed to merely executors? (The latter would make an app never fail to be admitted.)

TIA for any clarifications/intuitions/experiences on the topic

best
Manolis.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
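[Follow-up to illustrate my reading of the properties above — this is just the arithmetic I *assume* the standalone scheduler performs, with made-up example values, not any actual Spark API:]

```scala
// Sketch of how I understand the core-related properties interact.
// All values here are hypothetical; nothing below calls Spark itself.
object CoreArithmetic {
  def main(args: Array[String]): Unit = {
    val coresMax      = 16 // spark.cores.max: total cores the app claims
    val executorCores = 4  // spark.executor.cores: cores per executor JVM
    val taskCpus      = 2  // spark.task.cpus: cores reserved per task

    // If my understanding is right, the scheduler can then launch:
    val numExecutors       = coresMax / executorCores       // 16 / 4 = 4 executors
    val tasksPerExecutor   = executorCores / taskCpus       // 4 / 2  = 2 concurrent tasks each
    val maxConcurrentTasks = numExecutors * tasksPerExecutor // 4 * 2 = 8 tasks in flight

    println(s"$numExecutors executors, $tasksPerExecutor tasks each, " +
      s"$maxConcurrentTasks concurrent tasks total")
  }
}
```

which is exactly why I am puzzled by q3: if each worker has, say, only 3 cores, no executor of 4 cores can ever fit, yet the app still "claims" 16 cores up front.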