Thanks All!

Thanks Ayan!

I repartitioned the data into 20 partitions, so the job used all cores in the
cluster and finished in 3 minutes. It seems the data was skewed toward that
one partition.
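
A rough way to see why repartition(20) helped, as a toy model in plain Python rather than Spark code. It assumes that all tasks of a stage run in parallel, that task time is proportional to the amount of data in the partition, and that repartitioning spreads the data perfectly evenly. The helper names (stage_minutes, rebalance) are made up for illustration, and the task times are only the approximate figures quoted in this thread.

```python
# Toy model, not Spark: a stage finishes only when its slowest task finishes.

def stage_minutes(task_minutes):
    """Wall-clock time of a stage whose tasks all run in parallel."""
    return max(task_minutes)

def rebalance(task_minutes, num_partitions):
    """Spread the same total work evenly over num_partitions tasks (idealized)."""
    total = sum(task_minutes)
    return [total / num_partitions] * num_partitions

# Approximate figures from the thread: 8 tasks at about 7.5 min, 1 at about 30 min.
skewed = [7.5] * 8 + [30.0]
balanced = rebalance(skewed, 20)

print(stage_minutes(skewed))    # dominated by the single straggler task
print(stage_minutes(balanced))  # same total work, spread over 20 tasks
```

The model ignores shuffle cost and scheduling overhead, so it only shows the direction of the effect, not the 3 minutes actually observed.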



On Tue, Jul 14, 2015 at 8:05 PM, ayan guha <guha.a...@gmail.com> wrote:

> Hi
>
> As you can see, Spark has taken data locality into consideration and thus
> scheduled all tasks as node-local. Because Spark could run each task on a
> node where its data is present, it went ahead and scheduled the tasks there.
> That is actually good for reading. If you really want to fan out processing,
> you can do a repartition(n).
> Regarding slowness: as you can see, another task completed successfully
> in 6 mins on Executor ID 2, so it does not seem that the node itself is slow.
> It is possible that the computation for one node is skewed. You may want to
> switch on speculative execution to see whether the same task completes
> faster on another node. If yes, then it's a node issue; otherwise, it is
> most likely a data issue.
>
> On Tue, Jul 14, 2015 at 11:43 PM, shahid <sha...@trialx.com> wrote:
>
>> hi
>>
>> I have a 10-node cluster. I loaded the data onto HDFS, so the number of
>> partitions I get is 9. I am running a Spark application and it gets stuck
>> on one of the tasks. Looking at the UI, it seems the application is not
>> using all nodes to do the calculations. Attached is a screenshot of the
>> tasks; it seems tasks are put on each node more than once. Looking at the
>> tasks, 8 of them complete in under 7-8 minutes and one takes around 30
>> minutes, which delays the results.
>> <http://apache-spark-user-list.1001560.n3.nabble.com/file/n23824/Screen_Shot_2015-07-13_at_9.png>
>>
>
>
> --
> Best Regards,
> Ayan Guha
>



-- 
with Regards
Shahid Ashraf
