Is it possible to jstack the executors and see where they are hanging?
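
For example, on one of the nodes that still shows an active task (assuming
the usual Spark-on-YARN process names; adjust user/paths for your EMR setup):

  jps -l | grep CoarseGrainedExecutorBackend   # find the executor JVM's pid
  jstack <pid>                                 # add -F if the JVM doesn't respond

A few dumps taken several seconds apart should show whether the remaining
task threads are blocked, waiting on the network, or spinning.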

On Thu, Mar 26, 2015 at 2:02 PM, Jon Chase <jon.ch...@gmail.com> wrote:

> Spark 1.3.0 on YARN (Amazon EMR), cluster of 10 m3.2xlarge nodes (8 vCPUs,
> 30 GB RAM each), executor memory 20GB, driver memory 10GB
>
> I'm using Spark SQL, mainly via spark-shell, to query 15GB of data spread
> out over roughly 2,000 Parquet files and my queries frequently hang. Simple
> queries like "select count(*) from ..." on the entire data set work ok.
> Slightly more demanding ones with group by's and some aggregate functions
> (percentile_approx, avg, etc.) work ok as well, as long as I have some
> criteria in my where clause to keep the number of rows down.
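>
> For concreteness, the kind of query that hangs is shaped roughly like this
> (table and column names below are made up):
>
>   sqlContext.sql("""
>     SELECT customer_id, avg(amount) AS avg_amount, count(*) AS cnt
>     FROM events
>     WHERE event_date >= '2015-01-01'
>     GROUP BY customer_id
>     ORDER BY avg_amount DESC
>     LIMIT 100
>   """).collect()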
>
> Once I hit some limit on query complexity and rows processed, my queries
> start to hang.  I've left them for up to an hour without seeing any
> progress.  No OOMs either - the job is just stuck.
>
> I've tried setting spark.sql.shuffle.partitions to 400 and even 800, but
> with the same results: usually near the end of the tasks (like 780 of 800
> complete), progress just stops:
>
> 15/03/26 20:53:29 INFO scheduler.TaskSetManager: Finished task 788.0 in stage 1.0 (TID 1618) in 800 ms on ip-10-209-22-211.eu-west-1.compute.internal (748/800)
> 15/03/26 20:53:29 INFO scheduler.TaskSetManager: Finished task 793.0 in stage 1.0 (TID 1623) in 622 ms on ip-10-105-12-41.eu-west-1.compute.internal (749/800)
> 15/03/26 20:53:29 INFO scheduler.TaskSetManager: Finished task 797.0 in stage 1.0 (TID 1627) in 616 ms on ip-10-90-2-201.eu-west-1.compute.internal (750/800)
> 15/03/26 20:53:29 INFO scheduler.TaskSetManager: Finished task 799.0 in stage 1.0 (TID 1629) in 611 ms on ip-10-90-2-201.eu-west-1.compute.internal (751/800)
> 15/03/26 20:53:29 INFO scheduler.TaskSetManager: Finished task 795.0 in stage 1.0 (TID 1625) in 669 ms on ip-10-105-12-41.eu-west-1.compute.internal (752/800)
>
> ^^^^^^^ this is where it stays forever
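>
> For reference, I set the shuffle partitions in spark-shell along these
> lines before running the query (exact form may vary):
>
>   sqlContext.setConf("spark.sql.shuffle.partitions", "800")
>   // or: sqlContext.sql("SET spark.sql.shuffle.partitions=800")
>
> which matches the 800 tasks above.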
>
> Looking at the Spark UI, several of the executors still list active
> tasks.  I do see that the Shuffle Read for executors that don't have any
> tasks remaining is around 100MB, whereas it's more like 10MB for the
> executors that still have tasks.
>
> The first stage, mapPartitions, always completes fine.  It's the second
> stage (takeOrdered) that hangs.
>
> I've had this issue in 1.2.0 and 1.2.1 as well as 1.3.0.  I've also
> encountered it when using JSON files (instead of Parquet).
>
> Thoughts?  I'm blocked on using Spark SQL because most of the queries I run
> are hitting this issue.
>
