Hi All,
I am trying to clarify some of the behavior of executors in Spark. Since I come from a Hadoop background, I am comparing an executor to a mapper (or reducer) in Hadoop.

1. Can each node have multiple executors, each running in its own process? That would be the same as a mapper process.

2. I thought a Spark executor runs in multi-threaded mode when more than one core is allocated to it (for example, with executor-cores set to 5). In that case, how many partitions can it process at once? For example, if the input has 20 partitions (similar to 20 splits as mapper input) and we have 5 executors with 4 cores each, will all of these partitions be processed at the same time (so each core processes one partition), or can one executor only run one partition at a time?

I don't know whether my understanding is correct, so please advise.

BTW: as a general practice, should we always try to set executor-cores to a higher number? That is, should we favor 10 cores * 2 executors over 2 cores * 10 executors? Any suggestions?

Thanks!

Regards,
Shuai
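
P.S. To make the arithmetic in question 2 concrete, here is a small sketch of how I am counting. It is plain Python, not Spark code, and the function names are my own. It assumes each task (one partition) occupies exactly one core, which I believe matches the default spark.task.cpus = 1, and that an executor runs one task per core at a time. If that assumption is wrong, the numbers below are wrong too.

```python
import math


def concurrent_task_slots(num_executors, cores_per_executor, cpus_per_task=1):
    """Tasks that could run at once, ASSUMING one task per `cpus_per_task` cores."""
    return num_executors * (cores_per_executor // cpus_per_task)


def waves_needed(num_partitions, slots):
    """Rounds of tasks needed to cover every partition with `slots` running at once."""
    return math.ceil(num_partitions / slots)


# The case from question 2: 20 partitions, 5 executors with 4 cores each.
slots = concurrent_task_slots(num_executors=5, cores_per_executor=4)
print("slots:", slots, "waves:", waves_needed(20, slots))

# The two layouts from the BTW question give the same slot count under this model.
print(concurrent_task_slots(2, 10), concurrent_task_slots(10, 2))
```

Under this model both layouts in my last question give the same number of slots, so what I am really asking is whether there is any other reason to prefer fewer, larger executors over more, smaller ones.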