Re: Any ideas why a few tasks would stall

2014-12-05 Thread Andrew Or
Hi Steve et al.,

It is possible that there's just a lot of skew in your data, in which case
repartitioning is a good idea. Depending on how large your input data is
and how much skew you have, you may want to repartition to a larger number
of partitions. By the way, you can just call rdd.repartition(1000); this is
the same as rdd.coalesce(1000, shuffle = true). Note that
repartitioning is only a good idea if your straggler task is taking a long
time; otherwise it can be quite expensive, since it requires a full shuffle.
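
For example, in Scala (a rough sketch, assuming an existing RDD named rdd):

    // Both lines force a full shuffle and spread the data evenly
    // across 1000 partitions.
    val repartitioned = rdd.repartition(1000)
    val sameThing     = rdd.coalesce(1000, shuffle = true)

    // Without shuffle = true, coalesce only merges existing partitions
    // and cannot break up a single oversized one.
    val mergedOnly = rdd.coalesce(1000)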

Another possibility is that you might just have bad nodes in your cluster.
To mitigate stragglers, you can try enabling speculative execution by setting
spark.speculation to true. This re-runs any task that takes a long time to
complete on a different node, in parallel with the original.
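
For example (a minimal sketch; the app name is just a placeholder):

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("my-job")
      // Launch speculative copies of tasks that run much longer than their
      // peers; whichever copy finishes first wins, the other is killed.
      .set("spark.speculation", "true")
    val sc = new SparkContext(conf)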

-Andrew

2014-12-04 11:43 GMT-08:00 akhandeshi ami.khande...@gmail.com:

 This did not work for me, that is, rdd.coalesce(200, forceShuffle). Does
 anyone have ideas on how to distribute your data evenly and co-locate
 partitions of interest?



 --
 View this message in context:
 http://apache-spark-user-list.1001560.n3.nabble.com/Any-ideas-why-a-few-tasks-would-stall-tp20207p20387.html
 Sent from the Apache Spark User List mailing list archive at Nabble.com.





Re: Any ideas why a few tasks would stall

2014-12-04 Thread Ankit Soni
I ran into something similar before. 19/20 partitions would complete very
quickly, and 1 would take the bulk of time and shuffle reads & writes. This was
because the majority of partitions were empty, and 1 had all the data. Perhaps
something similar is going on here - I would suggest taking a look at how much
data each partition contains and trying to achieve a roughly even distribution
for best performance. In particular, if the RDDs are PairRDDs, partitions are
assigned based on the hash of the key, so an even distribution of values among
keys is required for an even split of data across partitions.
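
One quick way to check the distribution (an untested sketch; assumes the RDD in
question is called rdd):

    // Count how many records land in each partition; a heavily skewed result
    // (most partitions near zero, one huge) confirms the problem.
    val countsPerPartition = rdd
      .mapPartitionsWithIndex((idx, iter) => Iterator((idx, iter.size)))
      .collect()

    countsPerPartition.sortBy(_._1).foreach { case (idx, n) =>
      println(s"partition $idx: $n records")
    }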

On December 2, 2014 at 4:15:25 PM, Steve Lewis (lordjoe2...@gmail.com) wrote:

1) I can go there but none of the links are clickable
2) When I see something like 116/120 partitions succeeded in the Stages UI,
in the Storage UI I see the table below.
NOTE: RDD 27 has 116 partitions cached - 4 are not, and those are exactly the
number of machines which will not complete.
Also, RDD 27 does not show up in the Stages UI

RDD Name  Storage Level                      Cached Partitions  Fraction Cached  Size in Memory  Size in Tachyon  Size on Disk
2         Memory Deserialized 1x Replicated                  1             100%         11.8 MB            0.0 B         0.0 B
14        Memory Deserialized 1x Replicated                  1             100%        122.7 MB            0.0 B         0.0 B
7         Memory Deserialized 1x Replicated                120             100%        151.1 MB            0.0 B         0.0 B
1         Memory Deserialized 1x Replicated                  1             100%         65.6 MB            0.0 B         0.0 B
10        Memory Deserialized 1x Replicated                 24             100%        160.6 MB            0.0 B         0.0 B
27        Memory Deserialized 1x Replicated                116              97%

On Tue, Dec 2, 2014 at 3:43 PM, Sameer Farooqui same...@databricks.com wrote:
Have you tried taking thread dumps via the UI? There is a link to do so on the
Executors' page (typically at http://<driver IP>:4040/executors).

By visualizing the thread call stack of the executors with slow running tasks, 
you can see exactly what code is executing at an instant in time. If you sample 
the executor several times in a short time period, you can identify 'hot spots' 
or expensive sections in the user code.

On Tue, Dec 2, 2014 at 3:03 PM, Steve Lewis lordjoe2...@gmail.com wrote:
 I am working on a problem which will eventually involve many millions of
function calls. I have a small sample with several thousand calls working, but
when I try to scale up the amount of data, things stall. I use 120 partitions
and 116 finish in very little time. The remaining 4 seem to do all the work and
stall after a fixed number (about 1000) of calls, and even after hours make no
more progress.

This is my first large and complex job with Spark, and I would like any insight
on how to debug the issue or, even better, why it might exist. The cluster has
15 machines and I am setting executor memory at 16G.

Also, what other questions are relevant to solving the issue?




--
Steven M. Lewis PhD
4221 105th Ave NE
Kirkland, WA 98033
206-384-1340 (cell)
Skype lordjoe_com



Re: Any ideas why a few tasks would stall

2014-12-04 Thread Sameer Farooqui
Good point, Ankit.

Steve - You can click on the link for '27' in the first column to get a
breakdown of how much data is in each of those 116 cached partitions. But
really, you also want to understand how much data is in the 4 non-cached
partitions, as they may be huge. One thing you can try is calling
.repartition() on the RDD with something like 100 partitions and then caching
this new RDD. See if that spreads the load across the partitions more
evenly.
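
Roughly (an untested sketch; rdd27 is just a stand-in name for whatever
produces RDD 27):

    // Rebalance into ~100 partitions and cache the rebalanced RDD in place
    // of the original, unevenly distributed one.
    val rebalanced = rdd27.repartition(100).cache()
    rebalanced.count()  // materialize it so the Storage tab shows the new layout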

Let us know how it goes.

On Thu, Dec 4, 2014 at 12:16 AM, Ankit Soni ankitso...@gmail.com wrote:

 I ran into something similar before. 19/20 partitions would complete very
 quickly, and 1 would take the bulk of time and shuffle reads & writes. This
 was because the majority of partitions were empty, and 1 had all the data.
 Perhaps something similar is going on here - I would suggest taking a look
 at how much data each partition contains and trying to achieve a roughly even
 distribution for best performance. In particular, if the RDDs are PairRDDs,
 partitions are assigned based on the hash of the key, so an even distribution
 of values among keys is required for an even split of data across partitions.

 On December 2, 2014 at 4:15:25 PM, Steve Lewis (lordjoe2...@gmail.com)
 wrote:

 1) I can go there but none of the links are clickable
 2) When I see something like 116/120 partitions succeeded in the Stages UI,
 in the Storage UI I see the table below.
 NOTE: RDD 27 has 116 partitions cached - 4 are not, and those are exactly the
 number of machines which will not complete.
 Also, RDD 27 does not show up in the Stages UI

 RDD Name  Storage Level                      Cached Partitions  Fraction Cached  Size in Memory  Size in Tachyon  Size on Disk
 2         Memory Deserialized 1x Replicated                  1             100%         11.8 MB            0.0 B         0.0 B
 14        Memory Deserialized 1x Replicated                  1             100%        122.7 MB            0.0 B         0.0 B
 7         Memory Deserialized 1x Replicated                120             100%        151.1 MB            0.0 B         0.0 B
 1         Memory Deserialized 1x Replicated                  1             100%         65.6 MB            0.0 B         0.0 B
 10        Memory Deserialized 1x Replicated                 24             100%        160.6 MB            0.0 B         0.0 B
 27        Memory Deserialized 1x Replicated                116              97%

 On Tue, Dec 2, 2014 at 3:43 PM, Sameer Farooqui same...@databricks.com
 wrote:

 Have you tried taking thread dumps via the UI? There is a link to do so
 on the Executors' page (typically at http://<driver IP>:4040/executors).

 By visualizing the thread call stack of the executors with slow running
 tasks, you can see exactly what code is executing at an instant in time. If
 you sample the executor several times in a short time period, you can
 identify 'hot spots' or expensive sections in the user code.

 On Tue, Dec 2, 2014 at 3:03 PM, Steve Lewis lordjoe2...@gmail.com
 wrote:

   I am working on a problem which will eventually involve many millions
 of function calls. I have a small sample with several thousand calls
 working, but when I try to scale up the amount of data, things stall. I use
 120 partitions and 116 finish in very little time. The remaining 4 seem to
 do all the work and stall after a fixed number (about 1000) of calls, and
 even after hours make no more progress.

 This is my first large and complex job with Spark, and I would like any
 insight on how to debug the issue or, even better, why it might exist. The
 cluster has 15 machines and I am setting executor memory at 16G.

 Also, what other questions are relevant to solving the issue?





 --
 Steven M. Lewis PhD
 4221 105th Ave NE
 Kirkland, WA 98033
 206-384-1340 (cell)
 Skype lordjoe_com




Re: Any ideas why a few tasks would stall

2014-12-04 Thread Steve Lewis
Thanks - I found the same thing. Calling

    boolean forceShuffle = true;
    myRDD = myRDD.coalesce(120, forceShuffle);

worked - there were already 120 partitions, but forcing a shuffle distributes
the work.

I believe there is a bug in my code causing memory to accumulate as
partitions grow in size.
With a job over ten times larger, I ran into other issues when raising the
number of partitions to 10,000 - namely, too many open files.

On Thu, Dec 4, 2014 at 8:32 AM, Sameer Farooqui same...@databricks.com
wrote:

 Good point, Ankit.

 Steve - You can click on the link for '27' in the first column to get a
 breakdown of how much data is in each of those 116 cached partitions. But
 really, you also want to understand how much data is in the 4 non-cached
 partitions, as they may be huge. One thing you can try is calling
 .repartition() on the RDD with something like 100 partitions and then caching
 this new RDD. See if that spreads the load across the partitions more
 evenly.

 Let us know how it goes.

 On Thu, Dec 4, 2014 at 12:16 AM, Ankit Soni ankitso...@gmail.com wrote:

 I ran into something similar before. 19/20 partitions would complete very
 quickly, and 1 would take the bulk of time and shuffle reads & writes. This
 was because the majority of partitions were empty, and 1 had all the data.
 Perhaps something similar is going on here - I would suggest taking a look
 at how much data each partition contains and trying to achieve a roughly even
 distribution for best performance. In particular, if the RDDs are PairRDDs,
 partitions are assigned based on the hash of the key, so an even distribution
 of values among keys is required for an even split of data across partitions.

 On December 2, 2014 at 4:15:25 PM, Steve Lewis (lordjoe2...@gmail.com)
 wrote:

 1) I can go there but none of the links are clickable
 2) When I see something like 116/120 partitions succeeded in the Stages UI,
 in the Storage UI I see the table below.
 NOTE: RDD 27 has 116 partitions cached - 4 are not, and those are exactly the
 number of machines which will not complete.
 Also, RDD 27 does not show up in the Stages UI

 RDD Name  Storage Level                      Cached Partitions  Fraction Cached  Size in Memory  Size in Tachyon  Size on Disk
 2         Memory Deserialized 1x Replicated                  1             100%         11.8 MB            0.0 B         0.0 B
 14        Memory Deserialized 1x Replicated                  1             100%        122.7 MB            0.0 B         0.0 B
 7         Memory Deserialized 1x Replicated                120             100%        151.1 MB            0.0 B         0.0 B
 1         Memory Deserialized 1x Replicated                  1             100%         65.6 MB            0.0 B         0.0 B
 10        Memory Deserialized 1x Replicated                 24             100%        160.6 MB            0.0 B         0.0 B
 27        Memory Deserialized 1x Replicated                116              97%

 On Tue, Dec 2, 2014 at 3:43 PM, Sameer Farooqui same...@databricks.com
 wrote:

 Have you tried taking thread dumps via the UI? There is a link to do so
 on the Executors' page (typically at http://<driver IP>:4040/executors).

 By visualizing the thread call stack of the executors with slow running
 tasks, you can see exactly what code is executing at an instant in time. If
 you sample the executor several times in a short time period, you can
 identify 'hot spots' or expensive sections in the user code.

 On Tue, Dec 2, 2014 at 3:03 PM, Steve Lewis lordjoe2...@gmail.com
 wrote:

   I am working on a problem which will eventually involve many millions
 of function calls. I have a small sample with several thousand calls
 working, but when I try to scale up the amount of data, things stall. I use
 120 partitions and 116 finish in very little time. The remaining 4 seem to
 do all the work and stall after a fixed number (about 1000) of calls, and
 even after hours make no more progress.

 This is my first large and complex job with Spark, and I would like any
 insight on how to debug the issue or, even better, why it might exist. The
 cluster has 15 machines and I am setting executor memory at 16G.

 Also, what other questions are relevant to solving the issue?





 --
 Steven M. Lewis PhD
 4221 105th Ave NE
 Kirkland, WA 98033
 206-384-1340 (cell)
 Skype lordjoe_com





-- 
Steven M. Lewis PhD
4221 105th Ave NE
Kirkland, WA 98033
206-384-1340 (cell)
Skype lordjoe_com


Re: Any ideas why a few tasks would stall

2014-12-04 Thread akhandeshi
This did not work for me, that is, rdd.coalesce(200, forceShuffle). Does
anyone have ideas on how to distribute your data evenly and co-locate
partitions of interest?
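
For what it's worth, on a pair RDD the usual lever for both is an explicit
partitioner (a sketch; keyedRdd is a hypothetical RDD of key/value pairs):

    import org.apache.spark.HashPartitioner
    import org.apache.spark.SparkContext._  // pair-RDD implicits on Spark 1.x

    // Records sharing a key hash to the same partition, so related records
    // end up co-located; the partition count controls the granularity.
    val partitioned = keyedRdd.partitionBy(new HashPartitioner(200)).cache()

A custom Partitioner can stand in for HashPartitioner if the "partitions of
interest" need to be grouped by something other than the key's hash.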



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Any-ideas-why-a-few-tasks-would-stall-tp20207p20387.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.




Re: Any ideas why a few tasks would stall

2014-12-02 Thread Sameer Farooqui
Have you tried taking thread dumps via the UI? There is a link to do so on
the Executors' page (typically at http://<driver IP>:4040/executors).

By visualizing the thread call stack of the executors with slow running
tasks, you can see exactly what code is executing at an instant in time. If
you sample the executor several times in a short time period, you can
identify 'hot spots' or expensive sections in the user code.

On Tue, Dec 2, 2014 at 3:03 PM, Steve Lewis lordjoe2...@gmail.com wrote:

  I am working on a problem which will eventually involve many millions of
 function calls. I have a small sample with several thousand calls working,
 but when I try to scale up the amount of data, things stall. I use 120
 partitions and 116 finish in very little time. The remaining 4 seem to do
 all the work and stall after a fixed number (about 1000) of calls, and even
 after hours make no more progress.

 This is my first large and complex job with Spark, and I would like any
 insight on how to debug the issue or, even better, why it might exist. The
 cluster has 15 machines and I am setting executor memory at 16G.

 Also, what other questions are relevant to solving the issue?



Re: Any ideas why a few tasks would stall

2014-12-02 Thread Steve Lewis
1) I can go there but none of the links are clickable
2) When I see something like 116/120 partitions succeeded in the Stages UI,
in the Storage UI I see the table below.
NOTE: RDD 27 has 116 partitions cached - 4 are not, and those are exactly the
number of machines which will not complete.
Also, RDD 27 does not show up in the Stages UI

RDD Name  Storage Level                      Cached Partitions  Fraction Cached  Size in Memory  Size in Tachyon  Size on Disk
2         Memory Deserialized 1x Replicated                  1             100%         11.8 MB            0.0 B         0.0 B
14        Memory Deserialized 1x Replicated                  1             100%        122.7 MB            0.0 B         0.0 B
7         Memory Deserialized 1x Replicated                120             100%        151.1 MB            0.0 B         0.0 B
1         Memory Deserialized 1x Replicated                  1             100%         65.6 MB            0.0 B         0.0 B
10        Memory Deserialized 1x Replicated                 24             100%        160.6 MB            0.0 B         0.0 B
27        Memory Deserialized 1x Replicated                116              97%

On Tue, Dec 2, 2014 at 3:43 PM, Sameer Farooqui same...@databricks.com
wrote:

 Have you tried taking thread dumps via the UI? There is a link to do so on
 the Executors' page (typically at http://<driver IP>:4040/executors).

 By visualizing the thread call stack of the executors with slow running
 tasks, you can see exactly what code is executing at an instant in time. If
 you sample the executor several times in a short time period, you can
 identify 'hot spots' or expensive sections in the user code.

 On Tue, Dec 2, 2014 at 3:03 PM, Steve Lewis lordjoe2...@gmail.com wrote:

  I am working on a problem which will eventually involve many millions of
 function calls. I have a small sample with several thousand calls working,
 but when I try to scale up the amount of data, things stall. I use 120
 partitions and 116 finish in very little time. The remaining 4 seem to do
 all the work and stall after a fixed number (about 1000) of calls, and even
 after hours make no more progress.

 This is my first large and complex job with Spark, and I would like any
 insight on how to debug the issue or, even better, why it might exist. The
 cluster has 15 machines and I am setting executor memory at 16G.

 Also, what other questions are relevant to solving the issue?





-- 
Steven M. Lewis PhD
4221 105th Ave NE
Kirkland, WA 98033
206-384-1340 (cell)
Skype lordjoe_com