Re: Some tasks are taking long time

2015-01-15 Thread Ajay Srivastava
Thanks RK. I can turn on speculative execution, but I am trying to find the actual reason for the delay, as it happens on any node. Any idea about the stack trace in my previous mail? Regards, Ajay On Thursday, January 15, 2015 8:02 PM, RK prk...@yahoo.com.INVALID wrote: If you don't want

Re: Some tasks are taking long time

2015-01-15 Thread Nicos
Ajay, Unless we are dealing with some synchronization/condition variable bug in Spark, try this per the tuning guide: Cache Size Tuning: One important configuration parameter for GC is the amount of memory that should be used for caching RDDs. By default, Spark uses 60% of the configured

Re: Some tasks are taking long time

2015-01-15 Thread Ajay Srivastava
Thanks Nicos. GC does not contribute much to the execution time of the task. I will debug it further today. Regards, Ajay On Thursday, January 15, 2015 11:55 PM, Nicos n...@hotmail.com wrote: Ajay, Unless we are dealing with some synchronization/conditional variable bug in Spark, try

Some tasks are taking long time

2015-01-15 Thread Ajay Srivastava
Hi, My Spark job is taking a long time. I see that some tasks take longer than others for the same amount of data and shuffle read/write. What could be the possible reasons for this? The thread dump sometimes shows that all the tasks in an executor are waiting with the following stack trace - Executor task

Re: Some tasks are taking long time

2015-01-15 Thread RK
If you don't want a few slow tasks to slow down the entire job, you can turn on speculation. Here are the speculation settings from Spark Configuration - Spark 1.2.0 Documentation.
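
For readers following along, here is a rough sketch of the speculation settings mentioned above and how they interact. The property names and defaults (spark.speculation, spark.speculation.quantile = 0.75, spark.speculation.multiplier = 1.5) are the ones listed in the Spark 1.2.0 configuration page. The helper functions speculation_conf_flags and speculation_candidates are hypothetical names made up for this illustration. The candidate check is only a simplified reading of the documented rule (a task is considered once enough tasks in the stage have finished and it has run longer than the multiplier times the median), not Spark's actual scheduler code.

```python
import math
from statistics import median

# Defaults as listed in the Spark 1.2.0 configuration page.
# spark.speculation itself defaults to false and must be switched on.
SPECULATION_QUANTILE = 0.75    # spark.speculation.quantile
SPECULATION_MULTIPLIER = 1.5   # spark.speculation.multiplier


def speculation_conf_flags(quantile=SPECULATION_QUANTILE,
                           multiplier=SPECULATION_MULTIPLIER):
    """Build spark-submit --conf arguments that enable speculation.

    Hypothetical helper: it only formats strings, it does not call Spark.
    """
    settings = [
        ("spark.speculation", "true"),
        ("spark.speculation.quantile", str(quantile)),
        ("spark.speculation.multiplier", str(multiplier)),
    ]
    flags = []
    for key, value in settings:
        flags.extend(["--conf", f"{key}={value}"])
    return flags


def speculation_candidates(finished_ms, running_ms,
                           quantile=SPECULATION_QUANTILE,
                           multiplier=SPECULATION_MULTIPLIER):
    """Return indices into running_ms of tasks that look like stragglers.

    Simplified model (an assumption, not Spark's implementation):
    nothing is flagged until at least floor(quantile * total) tasks have
    finished; after that, a running task is flagged when its elapsed time
    exceeds multiplier * median(finished durations).
    """
    total = len(finished_ms) + len(running_ms)
    min_finished = math.floor(quantile * total)
    if not finished_ms or len(finished_ms) < min_finished:
        return []
    threshold = multiplier * median(finished_ms)
    return [i for i, elapsed in enumerate(running_ms) if elapsed > threshold]


if __name__ == "__main__":
    # Eight tasks finished in about 100 ms; two are still running.
    finished = [100, 100, 100, 100, 100, 100, 100, 100]
    running = [120, 400]
    print(" ".join(speculation_conf_flags()))
    print(speculation_candidates(finished, running))
```

Note that speculation only masks slow tasks by re-launching them elsewhere. It does not explain why they are slow, which is the question Ajay is asking, so the same median comparison can also be used offline on task durations from the web UI to pick out which tasks to inspect.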