An executor is specific to a Spark application, just as a mapper is specific to a MapReduce job. A machine will therefore usually be running many executors, each of which is its own JVM.
A mapper is single-threaded; an executor can run many tasks (possibly from different jobs within the application) at once. Yes, 5 executors with 4 cores each should be able to process 20 tasks in parallel. In the normal case, you have 1 executor per machine per application. There are cases where you would create more than 1, but these are unusual.

On Thu, Jan 15, 2015 at 8:16 PM, Shuai Zheng <szheng.c...@gmail.com> wrote:
> Hi All,
>
> I am trying to clarify some executor behavior in Spark. Because I am from a
> Hadoop background, I try to compare it to the mapper (or reducer) in Hadoop.
>
> 1. Each node can have multiple executors, each running in its own process?
> This is the same as a mapper process.
>
> 2. I thought the Spark executor will run in multi-threaded mode when there
> is more than 1 core allocated to it (for example: set executor-cores to 5).
> In that case, how many partitions can it process? For example, if the input
> is 20 partitions (similar to 20 splits as mapper input) and we have 5
> executors, each with 4 cores, will all these partitions be processed at the
> same time (so each core processes one partition), or can one executor
> actually only run one partition at a time?
>
> I don't know whether my understanding is correct; please advise.
>
> BTW: In general practice, should we always try to set executor-cores to a
> higher number? That is, should we favor 10 cores * 2 executors over
> 2 cores * 10 executors? Any suggestions here?
>
> Thanks!
>
> Regards,
>
> Shuai
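To make the slot arithmetic above concrete, here is a minimal sketch in plain Python (this is not the Spark API; `task_waves` is a hypothetical helper). It models the scheduler's capacity as executors × cores-per-executor task slots, with one task per partition running in waves:

```python
import math

def task_waves(num_partitions: int, num_executors: int, executor_cores: int) -> int:
    """Number of scheduling waves needed to run one task per partition,
    given that each executor can run executor_cores tasks concurrently."""
    slots = num_executors * executor_cores  # total concurrent task capacity
    return math.ceil(num_partitions / slots)

# 20 partitions on 5 executors x 4 cores = 20 slots: everything runs in one wave.
print(task_waves(20, 5, 4))   # 1
# 2 executors x 10 cores gives the same 20 slots, so the same single wave;
# the trade-off between the two layouts is about JVM overhead, not raw parallelism.
print(task_waves(20, 2, 10))  # 1
```

With fewer slots than partitions, the remainder simply waits for the next wave, e.g. 45 partitions on 20 slots take 3 waves.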