Executor vs Mapper in Hadoop

Shuai Zheng Thu, 15 Jan 2015 12:18:06 -0800

Hi All,


I am trying to clarify how executors behave in Spark. Because I come from a
Hadoop background, I am comparing them to the mapper (or reducer) in
Hadoop.

 

1. Each node can have multiple executors, each running in its own process? This
is the same as a mapper process.

 

2. I thought a Spark executor runs in multi-threaded mode when more than one
core is allocated to it (for example, with executor-cores set to 5). In that
case, how many partitions can it process at once? For example, suppose the
input has 20 partitions (similar to 20 splits as mapper input) and we have 5
executors with 4 cores each. Will all of these partitions be processed at the
same time (so each core processes one partition), or can one executor only run
one partition at a time?
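
The arithmetic behind question 2 can be sketched in a few lines of Python.
This is only an illustration under a stated assumption: each task (one per
partition) occupies one core, which corresponds to Spark's default setting of
spark.task.cpus = 1. The function names are hypothetical helpers, not part of
any Spark API.

```python
import math

def concurrent_task_slots(num_executors, cores_per_executor, cpus_per_task=1):
    """Upper bound on tasks (one per partition) that can run at the same time.

    Assumes every task reserves cpus_per_task cores on a single executor.
    """
    return num_executors * (cores_per_executor // cpus_per_task)

def waves_needed(num_partitions, slots):
    """Number of scheduling rounds needed to process every partition."""
    return math.ceil(num_partitions / slots)

# Numbers from the question: 20 partitions, 5 executors, 4 cores each.
slots = concurrent_task_slots(num_executors=5, cores_per_executor=4)
print(slots)                    # 20
print(waves_needed(20, slots))  # 1
```

Under that assumption, the 20 partitions in the example fit into the 20
available slots in a single round, provided the cluster actually grants all
the requested executors and cores.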

 

I don't know whether my understanding is correct; please advise.

 

BTW: In general practice, should we always try to set executor-cores to a
higher number? That is, should we favor 2 executors with 10 cores each over
10 executors with 2 cores each? Any suggestions?

 

Thanks!

 

Regards,

 

Shuai
