Hi Evo,

The Worker is a JVM, and an executor runs on a JVM. From Spark 1.4 onwards you will be able to run multiple executors on the same worker.
https://issues.apache.org/jira/browse/SPARK-1706

Thanks
Arush

On Tue, May 26, 2015 at 2:54 PM, canan chen <ccn...@gmail.com> wrote:

> I think the concept of a task in Spark should be on the same level as a task
> in MR. Usually in MR, we need to specify the memory for each mapper/reducer
> task. And I believe the executor is not a user-facing concept; it's a Spark
> internal concept. Spark users don't need to know the concept of an
> executor, but they do need to know the concept of a task.
>
> On Tue, May 26, 2015 at 5:09 PM, Evo Eftimov <evo.efti...@isecc.com>
> wrote:
>
>> This is the first time I hear that “one can specify the RAM per task”;
>> the RAM is granted per Executor (JVM). On the other hand, each Task operates
>> on ONE RDD Partition, so you can say that this is “the RAM allocated to
>> the Task to process”, but it is still within the boundaries allocated to
>> the Executor (JVM) within which the Task is running. Also, while running,
>> any Task, like any JVM Thread, can request as much additional RAM (e.g. for
>> new Object instances) as is available in the Executor, aka the JVM Heap.
>>
>> *From:* canan chen [mailto:ccn...@gmail.com]
>> *Sent:* Tuesday, May 26, 2015 9:30 AM
>> *To:* Evo Eftimov
>> *Cc:* user@spark.apache.org
>> *Subject:* Re: How does spark manage the memory of executor with multiple tasks
>>
>> Yes, I know that one task represents a JVM thread. This is what confuses
>> me. Usually users want to specify the memory at the task level, so how
>> can I do that if a task is a thread and multiple tasks run in the same
>> executor? I don't even know how many threads there will be. Besides
>> that, if one task causes an OOM, it would cause the other tasks in the same
>> executor to fail too. There's no isolation between tasks.
>>
>> On Tue, May 26, 2015 at 4:15 PM, Evo Eftimov <evo.efti...@isecc.com>
>> wrote:
>>
>> An Executor is a JVM instance spawned and running on a Cluster Node
>> (Server machine). A Task is essentially a JVM Thread; you can have as many
>> Threads as you want per JVM. You will also hear about “Executor Slots”;
>> these are essentially the CPU Cores available on the machine and granted
>> for use to the Executor.
>>
>> Ps: what creates ongoing confusion here is that the Spark folks have
>> “invented” their own terms to describe the design of what is
>> essentially a Distributed OO Framework facilitating Parallel Programming
>> and Data Management in a Distributed Environment, BUT have not provided
>> a clear dictionary/explanations linking these “inventions” with standard
>> concepts familiar to every Java, Scala etc. developer.
>>
>> *From:* canan chen [mailto:ccn...@gmail.com]
>> *Sent:* Tuesday, May 26, 2015 9:02 AM
>> *To:* user@spark.apache.org
>> *Subject:* How does spark manage the memory of executor with multiple tasks
>>
>> Since Spark can run multiple tasks in one executor, I am curious to
>> know how Spark manages memory across these tasks. Say one executor
>> takes 1GB of memory; if this executor can run 10 tasks simultaneously,
>> then each task can consume 100MB on average. Do I understand it correctly?
>> It doesn't make sense to me that Spark runs multiple tasks in one executor.

--
*Arush Kharbanda* || Technical Teamlead
ar...@sigmoidanalytics.com || www.sigmoidanalytics.com
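
The point made in the thread, that memory is granted to the executor and that tasks are threads sharing it, can be illustrated with a minimal sketch. This is plain Python and does not use Spark: `SharedHeap`, `run_tasks` and `average_memory_per_task` are hypothetical names invented for this illustration, and the 1000 MB and 10-slot figures are simply the round numbers from the original question. It is a toy model of a shared budget, not a description of how Spark or the JVM actually accounts for memory.

```python
from concurrent.futures import ThreadPoolExecutor
import threading


def average_memory_per_task(executor_memory_mb, concurrent_tasks):
    """Average share of the executor heap per concurrent task (not a limit)."""
    return executor_memory_mb / concurrent_tasks


class SharedHeap:
    """Toy stand-in for one executor's JVM heap, shared by all task threads."""

    def __init__(self, capacity_mb):
        self.capacity_mb = capacity_mb
        self.used_mb = 0
        self._lock = threading.Lock()

    def allocate(self, mb):
        # Any task may take whatever is still free; there is no per-task quota.
        with self._lock:
            if self.used_mb + mb > self.capacity_mb:
                raise MemoryError(
                    f"heap exhausted: {self.used_mb} MB used, {mb} MB requested"
                )
            self.used_mb += mb


def run_tasks(heap, task_sizes_mb, slots):
    """Run one task per size on `slots` threads; return 'ok' or 'oom' per task."""

    def task(mb):
        try:
            heap.allocate(mb)
            return "ok"
        except MemoryError:
            return "oom"

    # The thread pool plays the role of the executor's slots (granted cores).
    with ThreadPoolExecutor(max_workers=slots) as pool:
        return list(pool.map(task, task_sizes_mb))


if __name__ == "__main__":
    # Hypothetical figures from the question: 1 GB executor, 10 concurrent tasks.
    print(average_memory_per_task(1000, 10))
    heap = SharedHeap(1000)
    print(run_tasks(heap, [900] + [10] * 9, slots=10), heap.used_mb)
```

In this model one task can take 900 MB while nine others take 10 MB each, because only the executor-wide total is bounded. Conversely, once 950 MB is in use, a 100 MB request fails even though it is within the "average" share. The sketch does not model what an actual `OutOfMemoryError` does to the other threads in a real executor JVM.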