Are you running on Mesos, YARN, or standalone? If you're on Mesos, are you
using coarse-grained or fine-grained mode?
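
To put a number on the lag, here is a rough Python sketch of the arithmetic
using the figures from your message below. The function names are made up for
illustration, and the 60-second observed time is an assumption: it is only the
lower bound implied by "over a minute".

```python
import math


def ideal_stage_seconds(num_tasks, task_seconds, executors, cores_per_executor):
    """Wall-clock time if every task starts as soon as a core is free."""
    slots = executors * cores_per_executor   # tasks that can run at once
    waves = math.ceil(num_tasks / slots)     # rounds needed to run all tasks
    return waves * task_seconds


def scheduling_overhead(observed_seconds, ideal_seconds):
    """Seconds, and fraction of the stage, not explained by task run time."""
    lag = observed_seconds - ideal_seconds
    return lag, lag / observed_seconds


# Figures quoted in the thread: 400 tasks of ~2 s on 100 executors x 4 cores.
ideal = ideal_stage_seconds(num_tasks=400, task_seconds=2.0,
                            executors=100, cores_per_executor=4)

# Assumed observed duration: "over a minute", so 60 s is a lower bound.
lag, fraction = scheduling_overhead(observed_seconds=60.0, ideal_seconds=ideal)

print(f"ideal: {ideal:.0f} s, unexplained: {lag:.0f} s ({fraction:.0%} of the stage)")
# prints: ideal: 2 s, unexplained: 58 s (97% of the stage)
```

With 400 slots for 400 tasks, everything should fit in a single 2-second wave,
so nearly all of the stage time is launch overhead rather than compute. That is
why the deployment mode matters: on Mesos, as far as I know, the mode is
selected with the spark.mesos.coarse property, and fine-grained mode launches
each Spark task as a separate Mesos task, which adds per-task launch overhead.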

On Thu, Aug 13, 2015 at 10:13 PM, Ara Vartanian <arav...@cs.wisc.edu> wrote:

> I’m observing an unusual situation where my step duration increases as I
> add further executors to my cluster. My algorithm is fully data
> parallelizable into a map phase, followed by a reduce step at the end that
> amounts to matrix addition. So I’ve kicked off a cluster of, say, 100 executors
> with 4 cores per executor and before running the algorithm I’ve
> repartitioned the RDD into 400 partitions. I can see in the Spark UI that
> each of the 400 (map) tasks takes about 2 seconds. However, the entire step
> is taking over a minute, and this is because the launch times of the tasks
> as reported in the Spark UI are staggered. For example, the first 100 might
> be launched in the same second, then another group 3 seconds later, and so
> forth (with the durations slowly expanding). With a task time of 2 seconds,
> this “launch lag” is dominating the computation time and only gets worse as
> I add nodes.
>
> Any insight on how I could go about diagnosing this would be greatly
> appreciated.
