Mohini,

We set that parameter before we played with the number of executors, and it didn't seem to help at all.
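For reference, here is roughly what we ran (spark-shell on Spark 1.6; the table names and join key below are placeholders, not our real schema):

    // In spark-shell (Spark 1.6), sqlContext is predefined.
    // Set the number of post-shuffle partitions before running the join:
    sqlContext.setConf("spark.sql.shuffle.partitions", "1000")

    // Confirm the setting actually took effect:
    println(sqlContext.getConf("spark.sql.shuffle.partitions"))

    // Placeholder tables standing in for our two 100M+ row tables:
    val left = sqlContext.table("table_a")
    val right = sqlContext.table("table_b")
    val joined = left.join(right, left("join_key") === right("join_key"))
    joined.count()  // the progress meter still showed 30,000+ total tasks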
Thanks,
KP

On Tue, Mar 14, 2017 at 3:37 PM, mohini kalamkar <mohini.kalam...@gmail.com> wrote:

> Hi,
>
> Try using this parameter: --conf spark.sql.shuffle.partitions=1000
>
> Thanks,
> Mohini
>
> On Tue, Mar 14, 2017 at 3:30 PM, kpeng1 <kpe...@gmail.com> wrote:
>
>> Hi All,
>>
>> I am currently on Spark 1.6, and I was doing a SQL join on two tables that
>> are over 100 million rows each when I noticed that it spawned 30,000+
>> tasks (this is what the progress meter shows). We tried coalesce,
>> repartition, and the shuffle partitions setting to bring the number of
>> tasks down, because we were getting timeouts due to the number of tasks
>> being spawned, but none of those operations seemed to reduce the task
>> count. The workaround we came up with was to set the number of executors
>> to 50 (--num-executors=50), which capped us at 200 active tasks, but the
>> total number of tasks remained the same. Does anyone know what is going
>> on? Is there an optimal number of executors? I was under the impression
>> that the default dynamic allocation would pick the optimal number of
>> executors for us and that this situation wouldn't happen. Is there
>> something I am missing?
>>
>> --
>> View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Setting-Optimal-Number-of-Spark-Executor-Instances-tp28493.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>>
>
> --
> Thanks & Regards,
> Mohini Kalamkar
> M: +1 310 567 9329
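A minimal sketch of the difference between the two knobs discussed in this thread (Spark 1.6 Scala, continuing from the sketch in the reply above; `joined` stands in for the join result): spark.sql.shuffle.partitions determines how many tasks the join's shuffle stage produces when the query is planned, whereas repartition and coalesce only change the partitioning of a DataFrame that already exists, so they do not shrink the task count of the stage that computes it.

    // Reduce-side parallelism for SQL joins and aggregations comes from
    // spark.sql.shuffle.partitions at the time the shuffle is planned:
    sqlContext.setConf("spark.sql.shuffle.partitions", "1000")

    // repartition forces a full shuffle into exactly 200 partitions;
    // coalesce merges existing partitions without a full shuffle:
    val reshuffled = joined.repartition(200)
    val merged = joined.coalesce(200)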