Re: The difference between Spark SQL/DataFrame join and RDD join

2015-04-08 Thread Hao Ren
Hi Michael, In fact, I find that all workers hang while the SQL/DF join is running. So I picked the master and one of the workers. The jstack output is as follows: Master

Re: The difference between Spark SQL/DataFrame join and RDD join

2015-04-08 Thread Michael Armbrust
I think your thread dump for the master is actually just a thread dump for SBT that is waiting on a forked driver program. ...
java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on 0x7fed624ff528 (a java.lang.UNIXProcess)
    at

Re: The difference between Spark SQL/DataFrame join and RDD join

2015-04-07 Thread Michael Armbrust
The joins here are totally different implementations, but it is worrisome that you are seeing the SQL join hang. Can you provide more information about the hang? A jstack of the driver and of a worker that is processing a task would be very useful. On Tue, Apr 7, 2015 at 8:33 AM, Hao Ren