Regarding your second question, there is a great article from Cloudera about
this:

http://blog.cloudera.com/blog/2015/03/how-to-tune-your-apache-spark-jobs-part-2.
It focuses on the YARN setup, but the big picture applies everywhere.

In general, I believe you have to know your data in order to configure that
set of params a priori. From my experience, the more CPUs the merrier: I
have noticed that, e.g., if I double the CPUs, the job finishes in half the
time. This scaling (double the CPUs, half the time) no longer holds, though,
after I reach a certain number of CPUs (always depending on the data and the
job's actions). So it takes a lot of trying and observing.
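In practice I just rerun the same job with different core counts and time
it. A minimal sketch in Scala of what I mean (the app name and numbers are
placeholders of my own, assuming standalone mode, where spark.cores.max
applies):

    import org.apache.spark.{SparkConf, SparkContext}

    // Hypothetical settings: rerun the same job with e.g. 8, then 16 total
    // cores and compare wall-clock times to see where the scaling flattens.
    val conf = new SparkConf()
      .setAppName("cpu-scaling-test")    // placeholder app name
      .set("spark.cores.max", "16")      // total cores the app claims
      .set("spark.executor.cores", "4")  // cores per executor JVM
    val sc = new SparkContext(conf)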

In addition, there is a tight connection between CPUs and partitions;
Cloudera's article covers this.
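A common rule of thumb (not a hard rule) is to have a few partitions per
core, so every core stays busy. A rough sketch of that heuristic, with
made-up numbers and a placeholder path:

    // sc is a SparkContext, e.g. the one from the sketch above.
    // Assuming the app was granted 16 cores (spark.cores.max=16):
    val totalCores = 16
    // Heuristic: ~2-3 partitions per core keeps all cores busy and
    // limits the cost of stragglers.
    val numPartitions = totalCores * 3
    val lines = sc.textFile("hdfs:///data/input", numPartitions)
    // For an already-created RDD you can rebalance with:
    // lines.repartition(numPartitions)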

Regards,
Leonidas

On Thu, Dec 3, 2015 at 5:44 PM, Manolis Sifalakis1 <e...@zurich.ibm.com>
wrote:

> I have found the documentation rather poor in helping me understand the
> interplay among the following properties in Spark, and even more so how to
> set them. So this post is sent in the hope of some discussion and
> "enlightenment" on the topic.
>
> Let me start by asking if I have understood well the following:
>
> - spark.driver.cores:   how many cores the driver program should occupy
> - spark.cores.max:   how many cores my app will claim for computations
> - spark.executor.cores and spark.task.cpus:   how spark.cores.max is
> allocated per JVM (executor) and per task (a Java thread?)
>   I.e. + spark.executor.cores:   each JVM instance (executor) should use
> that many cores
>         + spark.task.cpus:   each task should occupy at most this # of cores
>
> If so far good, then...
>
> q1: Is spark.cores.max inclusive of spark.driver.cores or not?
>
> q2: How should one decide statically, a priori, how to distribute
> spark.cores.max across JVMs and tasks?
>
> q3: Since the cores-per-worker setup restricts the maximum cores available
> per executor, and since an executor cannot span workers, what is the
> rationale behind an application claiming cores (spark.cores.max) as
> opposed to merely executors? (The latter would make an app never fail to
> be admitted.)
>
> TIA for any clarifications/intuitions/experiences on the topic
>
> best
>
> Manolis.
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
>
>
