But I still have one question. I see that the number of tasks in a stage is 3. Where does this 3 come from, and how can I increase the parallelism?
Regard,
Junfeng Chen

On Tue, Apr 10, 2018 at 11:31 AM, Junfeng Chen <[email protected]> wrote:
> Yeah, I have increase the executor number and executor cores, and it runs
> normally now. The hdp spark 2 have only 2 executor and 1 executor cores by
> default.
>
>
> Regard,
> Junfeng Chen
>
> On Tue, Apr 10, 2018 at 10:19 AM, Saisai Shao <[email protected]>
> wrote:
>
>>> In yarn mode, only two executor are assigned to process the task, since
>>> one executor can process one task only, they need 6 min in total.
>>>
>>
>> This is not true. You should set --executor-cores/--num-executors to
>> increase the task parallelism for executor. To be fair, Spark application
>> should have same resources (cpu/memory) when comparing between local and
>> yarn mode.
>>
>> 2018-04-10 10:05 GMT+08:00 Junfeng Chen <[email protected]>:
>>
>>> I found the potential reason.
>>>
>>> In local mode, all tasks in one stage runs concurrently, while tasks in
>>> yarn mode runs in sequence.
>>>
>>> For example, in one stage, each task costs 3 mins. If in local mode,
>>> they will run together, and cost 3 min in total. In yarn mode, only two
>>> executor are assigned to process the task, since one executor can process
>>> one task only, they need 6 min in total.
>>>
>>>
>>> Regard,
>>> Junfeng Chen
>>>
>>> On Mon, Apr 9, 2018 at 2:12 PM, Jörn Franke <[email protected]>
>>> wrote:
>>>
>>>> Probably network / shuffling cost? Or broadcast variables? Can you
>>>> provide more details what you do and some timings?
>>>>
>>>> > On 9. Apr 2018, at 07:07, Junfeng Chen <[email protected]> wrote:
>>>> >
>>>> > I have wrote an spark streaming application reading kafka data and
>>>> > convert the json data to parquet and save to hdfs.
>>>> > What make me puzzled is, the processing time of app in yarn mode cost
>>>> > 20% to 50% more time than in local mode. My cluster have three nodes with
>>>> > three node managers, and all three hosts have same hardware, 40cores and
>>>> > 256GB memory. .
>>>> >
>>>> > Why? How to solve it?
>>>> >
>>>> > Regard,
>>>> > Junfeng Chen
>>>>
>>>
>>>
>>
>
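
To make the arithmetic in this thread concrete, here is a rough sketch of how a stage's wall-clock time could be estimated from the executor settings. It is only a simplified model, not how Spark actually schedules: it assumes every task takes the same time, each task occupies one core, and tasks run in full "waves". The function name stage_minutes is made up for illustration.

```python
import math


def stage_minutes(num_tasks, num_executors, executor_cores, task_minutes):
    """Estimate stage time under a simple 'waves of tasks' model.

    Assumptions (not taken from Spark itself): equal task durations,
    one core per task, and no scheduling or shuffle overhead.
    """
    # Number of tasks that can run at the same time.
    slots = num_executors * executor_cores
    # Tasks beyond the available slots have to wait for the next wave.
    waves = math.ceil(num_tasks / slots)
    return waves * task_minutes


# 3 tasks of 3 minutes each, 2 executors with 1 core each: two waves.
print(stage_minutes(3, 2, 1, 3))  # 6
# Same tasks with 3 executors (or 1 executor with 3 cores): one wave.
print(stage_minutes(3, 3, 1, 3))  # 3
```

Under this model, adding slots beyond the number of tasks in the stage does not shorten the stage any further, so with 3 tasks the estimate stays at 3 minutes however many executors or cores are added.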
