Thank you, Tathagata. That explains it.

Fang, Yan
yanfang...@gmail.com
+1 (206) 849-4108
On Fri, Jul 11, 2014 at 7:21 PM, Tathagata Das <tathagata.das1...@gmail.com> wrote:

> Task slot is equivalent to core number. So one core can only run one task
> at a time.
>
> TD
>
> On Fri, Jul 11, 2014 at 1:57 PM, Yan Fang <yanfang...@gmail.com> wrote:
>
>> Hi Tathagata,
>>
>> Thank you. Is a task slot equivalent to a core? Or can one core actually
>> run multiple tasks at the same time?
>>
>> Best,
>>
>> Fang, Yan
>> yanfang...@gmail.com
>> +1 (206) 849-4108
>>
>> On Fri, Jul 11, 2014 at 1:45 PM, Tathagata Das <
>> tathagata.das1...@gmail.com> wrote:
>>
>>> The same executor can be used for both receiving and processing,
>>> irrespective of the deployment mode (YARN, Spark standalone, etc.). It
>>> boils down to the number of cores / task slots that the executor has.
>>> Each receiver is like a long-running task, so each of them occupies a
>>> slot. If there are free slots in the executor, then other tasks can be
>>> run on them.
>>>
>>> So if you find that the other tasks are not being run, check how many
>>> cores / task slots the executor has and whether there are more task
>>> slots than the number of input DStreams / receivers you are launching.
>>>
>>> @Praveen your answers were pretty much spot on, thanks for chipping in!
>>>
>>> On Fri, Jul 11, 2014 at 11:16 AM, Yan Fang <yanfang...@gmail.com> wrote:
>>>
>>>> Hi Praveen,
>>>>
>>>> Thank you for the answer. That's interesting, because when I bring up
>>>> only one executor for Spark Streaming, it seems that only the receiver
>>>> is working and no other tasks are happening, judging by the log and the
>>>> UI. Maybe that is just because the receiving task eats all the
>>>> resources, not because one executor can only run one receiver?
>>>>
>>>> Fang, Yan
>>>> yanfang...@gmail.com
>>>> +1 (206) 849-4108
>>>>
>>>> On Fri, Jul 11, 2014 at 6:06 AM, Praveen Seluka <psel...@qubole.com>
>>>> wrote:
>>>>
>>>>> Here are my answers. But I am just getting started with Spark
>>>>> Streaming, so please correct me if I am wrong.
>>>>> 1) Yes.
>>>>> 2) Receivers run on executors. It's actually a job that's submitted,
>>>>> where the number of tasks equals the number of receivers. An executor
>>>>> can run more than one task at the same time. Hence you could have more
>>>>> receivers than executors, but I think it's not recommended.
>>>>> 3) As said in 2, the executor where the receiver task is running can
>>>>> be used for map/reduce tasks. In yarn-cluster mode, the driver program
>>>>> is actually run as the application master (it lives in the first
>>>>> container that's launched), and this is not an executor - hence it is
>>>>> not used for other operations.
>>>>> 4) The driver runs in a separate container. I think the same executor
>>>>> can be used for the receiver and the processing task as well (I am not
>>>>> very sure about this part).
>>>>>
>>>>> On Fri, Jul 11, 2014 at 12:29 AM, Yan Fang <yanfang...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I am working to improve the parallelism of a Spark Streaming
>>>>>> application, but I have trouble understanding how the executors are
>>>>>> used and how the application is distributed.
>>>>>>
>>>>>> 1. In YARN, is one executor equal to one container?
>>>>>>
>>>>>> 2. I saw the statement that a streaming receiver runs on one worker
>>>>>> machine (*"note that each input DStream creates a single receiver
>>>>>> (running on a worker machine) that receives a single stream of
>>>>>> data"*). Does the "worker machine" mean the executor or the physical
>>>>>> machine? If I have more receivers than executors, will it still work?
>>>>>>
>>>>>> 3. Is the executor that holds the receiver also used for other
>>>>>> operations, such as map and reduce, or is it fully occupied by the
>>>>>> receiver? Similarly, if I run in yarn-cluster mode, is the executor
>>>>>> running the driver program used by other operations too?
>>>>>>
>>>>>> 4. So if I have a driver program (cluster mode) and a streaming
>>>>>> receiver, do I have to have at least 2 executors, because the program
>>>>>> and the streaming receiver have to be on different executors?
>>>>>>
>>>>>> Thank you. Sorry for having so many questions, but I do want to
>>>>>> understand how Spark Streaming is distributed in order to assign
>>>>>> reasonable resources. Thank you again.
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Fang, Yan
>>>>>> yanfang...@gmail.com
>>>>>> +1 (206) 849-4108
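For anyone reading this thread in the archives, here is a minimal Scala sketch of the pattern discussed above: several receivers created as separate input DStreams and unioned into one stream. The socket source, host, and ports are hypothetical placeholders; the point, per TD's reply, is that each receiver pins one core/task slot for its whole lifetime, so the executors must provide more total cores than there are receivers or no processing tasks will ever run.

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.StreamingContext._

object MultiReceiverSketch {
  def main(args: Array[String]): Unit = {
    // Hypothetical submit settings: with 2 receivers, something like
    //   spark-submit --master yarn-cluster --num-executors 2 --executor-cores 2 ...
    // gives 4 task slots: 2 are held by the receivers, 2 remain for processing.
    val conf = new SparkConf().setAppName("MultiReceiverSketch")
    val ssc = new StreamingContext(conf, Seconds(10))

    // Each input DStream creates one receiver, i.e. one long-running task
    // occupying one core on some executor.
    val numReceivers = 2
    val streams = (1 to numReceivers).map { i =>
      ssc.socketTextStream("localhost", 9998 + i) // hypothetical host/ports
    }

    // Union the per-receiver streams and process them on the remaining slots.
    val words = ssc.union(streams).flatMap(_.split(" "))
    words.map(word => (word, 1))
      .reduceByKey(_ + _)
      .print()

    ssc.start()
    ssc.awaitTermination()
  }
}

The exact executor and core counts depend on the workload; the only hard constraint implied by the thread is total cores > number of receivers.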