Re: The Processing loading of Spark streaming on YARN is not in balance

Saisai Shao Wed, 29 Apr 2015 23:50:45 -0700

From the chart you pasted, I guess you have only one receiver, with a
storage level that keeps two copies of each block, so most of your tasks
are located on two executors. You could use repartition to redistribute
the data more evenly across the executors. Adding more receivers is
another solution.
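
A minimal sketch of both suggestions together, assuming the receiver-based
KafkaUtils.createStream API from the spark-streaming-kafka module. The
ZooKeeper address, group id, topic name, receiver count, and partition
count are all hypothetical placeholders, not values from your job:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object BalancedKafkaStream {
  def main(args: Array[String]): Unit = {
    // Hypothetical placeholders; substitute the values for your cluster.
    val zkQuorum = "zk-host:2181"
    val groupId = "example-group"
    val topics = Map("example-topic" -> 1) // topic -> consumer threads per receiver
    val numReceivers = 3
    val numPartitions = 10

    val conf = new SparkConf().setAppName("BalancedKafkaStream")
    val ssc = new StreamingContext(conf, Seconds(2))

    // Suggestion 1: create several receivers instead of one, so that
    // received blocks can land on more than one pair of executors.
    val streams = (1 to numReceivers).map { _ =>
      KafkaUtils.createStream(ssc, zkQuorum, groupId, topics,
        StorageLevel.MEMORY_AND_DISK_SER_2)
    }
    val unified = ssc.union(streams)

    // Suggestion 2: shuffle each batch into more partitions before the
    // heavy processing, so tasks can be scheduled on all executors.
    val balanced = unified.repartition(numPartitions)

    // Stand-in for the real processing: count the message values per batch.
    balanced.map(_._2).count().print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Note that each receiver occupies one core for the lifetime of the job, and
receivers sharing a group id split the topic's partitions among themselves,
so the topic probably needs at least as many partitions as receivers for
every receiver to get data.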


2015-04-30 14:38 GMT+08:00 Kyle Lin <kylelin2...@gmail.com>:

> Hi all
>
> My environment info
> Hadoop release version: HDP 2.1
> Kafka: 0.8.1.2.1.4.0
> Spark: 1.1.0
>
> My question:
>     I ran a Spark Streaming program on YARN. The program reads data from
> Kafka and does some processing. But I found that there is always only ONE
> executor doing the processing. As the following table shows, I had 10
> worker executors, but only Executor No. 5 is running now, and all RDD
> blocks are on Executor No. 5. In my situation, the processing does not
> look distributed. How can I make all executors run together?
>
> Executor  Address        RDD     Memory Used       Disk   Active  Failed  Complete  Total  Task    Input    Shuffle    Shuffle
> ID                       Blocks                    Used   Tasks   Tasks   Tasks     Tasks  Time             Read       Write
> 1         slave03:38662  0       0.0 B / 265.4 MB  0.0 B  0       0       58        58     24.9 s  0.0 B    1197.0 B   440.5 KB
> 10        slave05:36992  0       0.0 B / 265.4 MB  0.0 B  0       0       44        44     30.9 s  0.0 B    0.0 B      501.7 KB
> 2         slave02:40250  0       0.0 B / 265.4 MB  0.0 B  0       0       30        30     19.6 s  0.0 B    855.0 B    1026.0 B
> 3         slave01:40882  0       0.0 B / 265.4 MB  0.0 B  0       0       28        28     20.7 s  0.0 B    1197.0 B   1026.0 B
> 4         slave04:57068  0       0.0 B / 265.4 MB  0.0 B  0       0       29        29     20.9 s  0.0 B    1368.0 B   1026.0 B
> 5         slave05:40191232.9 MB / 265.4 MB         0.0 B  1       0       3928      3929   5.1 m   2.7 MB   261.7 KB   564.4 KB
> 6         slave03:35515  0       0.0 B / 265.4 MB  0.0 B  1       0       47        48     23.1 s  0.0 B    513.0 B    400.4 KB
> 7         slave02:40325  0       0.0 B / 265.4 MB  0.0 B  0       0       30        30     20.2 s  0.0 B    855.0 B    1197.0 B
> 8         slave01:48609  0       0.0 B / 265.4 MB  0.0 B  0       0       28        28     20.5 s  0.0 B    1363.0 B   1026.0 B
> 9         slave04:33798  0       0.0 B / 265.4 MB  0.0 B  0       0       4712      4712   8.5 m   11.0 MB  1468.0 KB  1026.0 B
> <driver>  slave01:39172  0       0.0 B / 265.1 MB  0.0 B  0       0       0         0      0 ms    0.0 B    0.0 B      0.0 B
>
> Kyle
>
