jackielihf created SPARK-2274: --------------------------------- Summary: spark SQL query hang up sometime Key: SPARK-2274 URL: https://issues.apache.org/jira/browse/SPARK-2274 Project: Spark Issue Type: Question Environment: spark 1.0.0 Reporter: jackielihf
when I run spark SQL query, it hang up sometimes: 1) simple SQL query works, such as "select * from a left out join b on a.id=b.id" 2) BUT if it has more joins, such as "select * from a left out join b on a.id=b.id left out join c on a.id=c.id...", spark shell seem to hang up. spark shell prints: scala> hc.hql("select A.id,A.tit,A.sub_tit,B.abst,B.cont,A.aut,A.com_name,A.med_name,A.pub_dt,A.upd_time,A.ent_time,A.info_lvl,A.is_pic,A.lnk_addr,A.is_ann,A.info_open_lvl,A.keyw_name,C.typ_code as type,D.evt_dir,D.evt_st,E.secu_id,E.typ_codeii,E.exch_code,E.trd_code,E.secu_sht from txt_nws_bas_update A left outer join txt_nws_bas_txt B on A.id=B.orig_id left outer join txt_nws_typ C on A.id=C.orig_id left outer join txt_nws_secu D on A.id=D.orig_id left outer join bas_secu_info E on D.secu_id=E.secu_id where D.secu_id is not null limit 5").foreach(println) 14/06/25 13:32:25 INFO ParseDriver: Parsing command: select A.id,A.tit,A.sub_tit,B.abst,B.cont,A.aut,A.com_name,A.med_name,A.pub_dt,A.upd_time,A.ent_time,A.info_lvl,A.is_pic,A.lnk_addr,A.is_ann,A.info_open_lvl,A.keyw_name,C.typ_code as type,D.evt_dir,D.evt_st,E.secu_id,E.typ_codeii,E.exch_code,E.trd_code,E.secu_sht from txt_nws_bas_update A left outer join txt_nws_bas_txt B on A.id=B.orig_id left outer join txt_nws_typ C on A.id=C.orig_id left outer join txt_nws_secu D on A.id=D.orig_id left outer join bas_secu_info E on D.secu_id=E.secu_id where D.secu_id is not null limit 5 14/06/25 13:32:25 INFO ParseDriver: Parse Completed 14/06/25 13:32:25 INFO Analyzer: Max iterations (2) reached for batch MultiInstanceRelations 14/06/25 13:32:25 INFO Analyzer: Max iterations (2) reached for batch CaseInsensitiveAttributeReferences 14/06/25 13:32:27 INFO MemoryStore: ensureFreeSpace(220923) called with curMem=0, maxMem=311387750 14/06/25 13:32:27 INFO MemoryStore: Block broadcast_0 stored as values to memory (estimated size 215.7 KB, free 296.8 MB) 14/06/25 13:32:27 INFO MemoryStore: ensureFreeSpace(220971) called with curMem=220923, maxMem=311387750 14/06/25 13:32:27 INFO MemoryStore: Block broadcast_1 stored as values to memory (estimated size 215.8 KB, free 296.5 MB) 14/06/25 13:32:28 INFO MemoryStore: ensureFreeSpace(220971) called with curMem=441894, maxMem=311387750 14/06/25 13:32:28 INFO MemoryStore: Block broadcast_2 stored as values to memory (estimated size 215.8 KB, free 296.3 MB) 14/06/25 13:32:28 INFO MemoryStore: ensureFreeSpace(220971) called with curMem=662865, maxMem=311387750 14/06/25 13:32:28 INFO MemoryStore: Block broadcast_3 stored as values to memory (estimated size 215.8 KB, free 296.1 MB) 14/06/25 13:32:28 INFO MemoryStore: ensureFreeSpace(220971) called with curMem=883836, maxMem=311387750 14/06/25 13:32:28 INFO MemoryStore: Block broadcast_4 stored as values to memory (estimated size 215.8 KB, free 295.9 MB) 14/06/25 13:32:29 INFO SQLContext$$anon$1: Max iterations (2) reached for batch Add exchange 14/06/25 13:32:29 INFO SQLContext$$anon$1: Max iterations (2) reached for batch Prepare Expressions 14/06/25 13:32:30 INFO FileInputFormat: Total input paths to process : 1 14/06/25 13:32:30 INFO SparkContext: Starting job: collect at joins.scala:184 14/06/25 13:32:30 INFO DAGScheduler: Got job 0 (collect at joins.scala:184) with 2 output partitions (allowLocal=false) 14/06/25 13:32:30 INFO DAGScheduler: Final stage: Stage 0(collect at joins.scala:184) 14/06/25 13:32:30 INFO DAGScheduler: Parents of final stage: List() 14/06/25 13:32:30 INFO DAGScheduler: Missing parents: List() 14/06/25 13:32:30 INFO DAGScheduler: Submitting Stage 0 (MappedRDD[7] at map at joins.scala:184), which has no missing parents 14/06/25 13:32:30 INFO DAGScheduler: Submitting 2 missing tasks from Stage 0 (MappedRDD[7] at map at joins.scala:184) 14/06/25 13:32:30 INFO TaskSchedulerImpl: Adding task set 0.0 with 2 tasks 14/06/25 13:32:30 INFO TaskSetManager: Starting task 0.0:0 as TID 0 on executor 0: 192.168.56.100 (PROCESS_LOCAL) 14/06/25 13:32:30 INFO TaskSetManager: Serialized task 0.0:0 as 4088 bytes in 3 ms 14/06/25 13:32:30 INFO TaskSetManager: Starting task 0.0:1 as TID 1 on executor 1: 192.168.56.100 (PROCESS_LOCAL) 14/06/25 13:32:30 INFO TaskSetManager: Serialized task 0.0:1 as 4088 bytes in 2 ms 14/06/25 13:32:44 INFO BlockManagerInfo: Added taskresult_1 in memory on 192.168.56.100:47102 (size: 15.0 MB, free: 281.9 MB) 14/06/25 13:32:44 INFO SendingConnection: Initiating connection to [/192.168.56.100:47102] 14/06/25 13:32:44 INFO SendingConnection: Connected to [/192.168.56.100:47102], 1 messages pending 14/06/25 13:32:44 INFO BlockManagerInfo: Added taskresult_0 in memory on 192.168.56.100:41322 (size: 15.1 MB, free: 281.9 MB) 14/06/25 13:32:44 INFO SendingConnection: Initiating connection to [/192.168.56.100:41322] 14/06/25 13:32:44 INFO SendingConnection: Connected to [/192.168.56.100:41322], 1 messages pending 14/06/25 13:32:49 INFO ConnectionManager: Accepted connection from [192.168.56.100/192.168.56.100] 14/06/25 13:32:55 INFO BlockManagerInfo: Removed taskresult_1 on 192.168.56.100:47102 in memory (size: 15.0 MB, free: 297.0 MB) 14/06/25 13:32:55 INFO DAGScheduler: Completed ResultTask(0, 1) 14/06/25 13:32:55 INFO TaskSetManager: Finished TID 1 in 24749 ms on 192.168.56.100 (progress: 1/2) 14/06/25 13:33:10 INFO ConnectionManager: Accepted connection from [192.168.56.100/192.168.56.100] 14/06/25 13:33:15 INFO BlockManagerInfo: Removed taskresult_0 on 192.168.56.100:41322 in memory (size: 15.1 MB, free: 297.0 MB) 14/06/25 13:33:15 INFO DAGScheduler: Completed ResultTask(0, 0) 14/06/25 13:33:15 INFO TaskSetManager: Finished TID 0 in 45025 ms on 192.168.56.100 (progress: 2/2) 14/06/25 13:33:15 INFO DAGScheduler: Stage 0 (collect at joins.scala:184) finished in 45.665 s 14/06/25 13:33:16 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool 14/06/25 13:33:16 INFO SparkContext: Job finished: collect at joins.scala:184, took 45.910877269 s 14/06/25 13:33:20 INFO MemoryStore: ensureFreeSpace(133210496) called with curMem=1104807, maxMem=311387750 14/06/25 13:33:20 INFO MemoryStore: Block broadcast_5 stored as values to memory (estimated size 127.0 MB, free 168.9 MB) 14/06/25 13:33:22 INFO FileInputFormat: Total input paths to process : 1 14/06/25 13:33:22 INFO SparkContext: Starting job: collect at joins.scala:184 14/06/25 13:33:22 INFO DAGScheduler: Got job 1 (collect at joins.scala:184) with 2 output partitions (allowLocal=false) 14/06/25 13:33:22 INFO DAGScheduler: Final stage: Stage 1(collect at joins.scala:184) 14/06/25 13:33:22 INFO DAGScheduler: Parents of final stage: List() 14/06/25 13:33:22 INFO DAGScheduler: Missing parents: List() 14/06/25 13:33:22 INFO DAGScheduler: Submitting Stage 1 (MappedRDD[12] at map at joins.scala:184), which has no missing parents 14/06/25 13:33:22 INFO DAGScheduler: Submitting 2 missing tasks from Stage 1 (MappedRDD[12] at map at joins.scala:184) 14/06/25 13:33:22 INFO TaskSchedulerImpl: Adding task set 1.0 with 2 tasks 14/06/25 13:33:22 INFO TaskSetManager: Starting task 1.0:0 as TID 2 on executor 0: 192.168.56.100 (PROCESS_LOCAL) 14/06/25 13:33:22 INFO TaskSetManager: Serialized task 1.0:0 as 3705 bytes in 0 ms 14/06/25 13:33:22 INFO TaskSetManager: Starting task 1.0:1 as TID 3 on executor 1: 192.168.56.100 (PROCESS_LOCAL) 14/06/25 13:33:22 INFO TaskSetManager: Serialized task 1.0:1 as 3705 bytes in 0 ms 14/06/25 13:33:26 INFO DAGScheduler: Completed ResultTask(1, 1) 14/06/25 13:33:26 INFO TaskSetManager: Finished TID 3 in 4178 ms on 192.168.56.100 (progress: 1/2) 14/06/25 13:33:26 INFO DAGScheduler: Completed ResultTask(1, 0) 14/06/25 13:33:26 INFO DAGScheduler: Stage 1 (collect at joins.scala:184) finished in 4.365 s 14/06/25 13:33:26 INFO SparkContext: Job finished: collect at joins.scala:184, took 4.437216848 s 14/06/25 13:33:26 INFO TaskSetManager: Finished TID 2 in 4278 ms on 192.168.56.100 (progress: 2/2) 14/06/25 13:33:26 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool 14/06/25 13:33:27 INFO MemoryStore: ensureFreeSpace(31747896) called with curMem=134315303, maxMem=311387750 14/06/25 13:33:27 INFO MemoryStore: Block broadcast_6 stored as values to memory (estimated size 30.3 MB, free 138.6 MB) 14/06/25 13:33:28 INFO FileInputFormat: Total input paths to process : 1 14/06/25 13:33:28 INFO SparkContext: Starting job: collect at joins.scala:184 14/06/25 13:33:28 INFO DAGScheduler: Got job 2 (collect at joins.scala:184) with 2 output partitions (allowLocal=false) 14/06/25 13:33:28 INFO DAGScheduler: Final stage: Stage 2(collect at joins.scala:184) 14/06/25 13:33:28 INFO DAGScheduler: Parents of final stage: List() 14/06/25 13:33:28 INFO DAGScheduler: Missing parents: List() 14/06/25 13:33:28 INFO DAGScheduler: Submitting Stage 2 (MappedRDD[17] at map at joins.scala:184), which has no missing parents 14/06/25 13:33:28 INFO DAGScheduler: Submitting 2 missing tasks from Stage 2 (MappedRDD[17] at map at joins.scala:184) 14/06/25 13:33:28 INFO TaskSchedulerImpl: Adding task set 2.0 with 2 tasks 14/06/25 13:33:28 INFO TaskSetManager: Starting task 2.0:0 as TID 4 on executor 1: 192.168.56.100 (PROCESS_LOCAL) 14/06/25 13:33:28 INFO TaskSetManager: Serialized task 2.0:0 as 3627 bytes in 0 ms 14/06/25 13:33:28 INFO TaskSetManager: Starting task 2.0:1 as TID 5 on executor 0: 192.168.56.100 (PROCESS_LOCAL) 14/06/25 13:33:28 INFO TaskSetManager: Serialized task 2.0:1 as 3627 bytes in 0 ms 14/06/25 13:33:30 INFO DAGScheduler: Completed ResultTask(2, 0) 14/06/25 13:33:30 INFO TaskSetManager: Finished TID 4 in 2285 ms on 192.168.56.100 (progress: 1/2) 14/06/25 13:33:30 INFO DAGScheduler: Completed ResultTask(2, 1) 14/06/25 13:33:30 INFO DAGScheduler: Stage 2 (collect at joins.scala:184) finished in 2.681 s 14/06/25 13:33:30 INFO SparkContext: Job finished: collect at joins.scala:184, took 2.743363946 s 14/06/25 13:33:30 INFO TaskSetManager: Finished TID 5 in 2610 ms on 192.168.56.100 (progress: 2/2) 14/06/25 13:33:30 INFO TaskSchedulerImpl: Removed TaskSet 2.0, whose tasks have all completed, from pool 14/06/25 13:33:33 INFO MemoryStore: ensureFreeSpace(49064048) called with curMem=166063199, maxMem=311387750 14/06/25 13:33:33 INFO MemoryStore: Block broadcast_7 stored as values to memory (estimated size 46.8 MB, free 91.8 MB) 14/06/25 13:33:34 INFO FileInputFormat: Total input paths to process : 1 14/06/25 13:33:34 INFO SparkContext: Starting job: collect at joins.scala:184 14/06/25 13:33:34 INFO DAGScheduler: Got job 3 (collect at joins.scala:184) with 2 output partitions (allowLocal=false) 14/06/25 13:33:34 INFO DAGScheduler: Final stage: Stage 3(collect at joins.scala:184) 14/06/25 13:33:34 INFO DAGScheduler: Parents of final stage: List() 14/06/25 13:33:34 INFO DAGScheduler: Missing parents: List() 14/06/25 13:33:34 INFO DAGScheduler: Submitting Stage 3 (MappedRDD[22] at map at joins.scala:184), which has no missing parents 14/06/25 13:33:34 INFO DAGScheduler: Submitting 2 missing tasks from Stage 3 (MappedRDD[22] at map at joins.scala:184) 14/06/25 13:33:34 INFO TaskSchedulerImpl: Adding task set 3.0 with 2 tasks 14/06/25 13:33:34 INFO TaskSetManager: Starting task 3.0:0 as TID 6 on executor 0: 192.168.56.100 (PROCESS_LOCAL) 14/06/25 13:33:34 INFO TaskSetManager: Serialized task 3.0:0 as 3708 bytes in 1 ms 14/06/25 13:33:34 INFO TaskSetManager: Starting task 3.0:1 as TID 7 on executor 1: 192.168.56.100 (PROCESS_LOCAL) 14/06/25 13:33:34 INFO TaskSetManager: Serialized task 3.0:1 as 3708 bytes in 1 ms 14/06/25 13:33:35 INFO DAGScheduler: Completed ResultTask(3, 1) 14/06/25 13:33:35 INFO TaskSetManager: Finished TID 7 in 608 ms on 192.168.56.100 (progress: 1/2) 14/06/25 13:33:35 INFO DAGScheduler: Completed ResultTask(3, 0) 14/06/25 13:33:35 INFO DAGScheduler: Stage 3 (collect at joins.scala:184) finished in 0.636 s 14/06/25 13:33:35 INFO SparkContext: Job finished: collect at joins.scala:184, took 0.647360428 s 14/06/25 13:33:35 INFO TaskSetManager: Finished TID 6 in 627 ms on 192.168.56.100 (progress: 2/2) 14/06/25 13:33:35 INFO TaskSchedulerImpl: Removed TaskSet 3.0, whose tasks have all completed, from pool 14/06/25 13:33:35 INFO MemoryStore: ensureFreeSpace(4471280) called with curMem=215127247, maxMem=311387750 14/06/25 13:33:35 INFO MemoryStore: Block broadcast_8 stored as values to memory (estimated size 4.3 MB, free 87.5 MB) 14/06/25 13:33:35 INFO FileInputFormat: Total input paths to process : 1 14/06/25 13:33:35 INFO SparkContext: Starting job: count at joins.scala:216 14/06/25 13:33:35 INFO DAGScheduler: Got job 4 (count at joins.scala:216) with 2 output partitions (allowLocal=false) 14/06/25 13:33:35 INFO DAGScheduler: Final stage: Stage 4(count at joins.scala:216) 14/06/25 13:33:35 INFO DAGScheduler: Parents of final stage: List() 14/06/25 13:33:35 INFO DAGScheduler: Missing parents: List() 14/06/25 13:33:35 INFO DAGScheduler: Submitting Stage 4 (MappedRDD[28] at map at joins.scala:214), which has no missing parents 14/06/25 13:33:36 INFO DAGScheduler: Submitting 2 missing tasks from Stage 4 (MappedRDD[28] at map at joins.scala:214) 14/06/25 13:33:36 INFO TaskSchedulerImpl: Adding task set 4.0 with 2 tasks 14/06/25 13:33:36 INFO TaskSetManager: Starting task 4.0:0 as TID 8 on executor 0: 192.168.56.100 (PROCESS_LOCAL) 14/06/25 13:33:36 INFO TaskSetManager: Serialized task 4.0:0 as 4570 bytes in 0 ms 14/06/25 13:33:36 INFO TaskSetManager: Starting task 4.0:1 as TID 9 on executor 1: 192.168.56.100 (PROCESS_LOCAL) 14/06/25 13:33:36 INFO TaskSetManager: Serialized task 4.0:1 as 4570 bytes in 0 ms 14/06/25 13:33:47 INFO TaskSetManager: Finished TID 9 in 10869 ms on 192.168.56.100 (progress: 1/2) 14/06/25 13:33:47 INFO DAGScheduler: Completed ResultTask(4, 1) 14/06/25 13:33:47 INFO TaskSetManager: Finished TID 8 in 11240 ms on 192.168.56.100 (progress: 2/2) 14/06/25 13:33:47 INFO TaskSchedulerImpl: Removed TaskSet 4.0, whose tasks have all completed, from pool 14/06/25 13:33:47 INFO DAGScheduler: Completed ResultTask(4, 0) 14/06/25 13:33:47 INFO DAGScheduler: Stage 4 (count at joins.scala:216) finished in 11.191 s 14/06/25 13:33:47 INFO SparkContext: Job finished: count at joins.scala:216, took 12.351801444 s 14/06/25 13:33:47 INFO SparkContext: Starting job: reduce at joins.scala:219 14/06/25 13:33:47 INFO DAGScheduler: Got job 5 (reduce at joins.scala:219) with 2 output partitions (allowLocal=false) 14/06/25 13:33:47 INFO DAGScheduler: Final stage: Stage 5(reduce at joins.scala:219) 14/06/25 13:33:47 INFO DAGScheduler: Parents of final stage: List() 14/06/25 13:33:47 INFO DAGScheduler: Missing parents: List() 14/06/25 13:33:47 INFO DAGScheduler: Submitting Stage 5 (MappedRDD[29] at map at joins.scala:219), which has no missing parents 14/06/25 13:33:47 INFO DAGScheduler: Submitting 2 missing tasks from Stage 5 (MappedRDD[29] at map at joins.scala:219) 14/06/25 13:33:47 INFO TaskSchedulerImpl: Adding task set 5.0 with 2 tasks 14/06/25 13:33:47 INFO TaskSetManager: Starting task 5.0:0 as TID 10 on executor 1: 192.168.56.100 (PROCESS_LOCAL) 14/06/25 13:33:47 INFO TaskSetManager: Serialized task 5.0:0 as 4586 bytes in 0 ms 14/06/25 13:33:47 INFO TaskSetManager: Starting task 5.0:1 as TID 11 on executor 0: 192.168.56.100 (PROCESS_LOCAL) 14/06/25 13:33:47 INFO TaskSetManager: Serialized task 5.0:1 as 4586 bytes in 0 ms 14/06/25 13:33:52 INFO TaskSetManager: Finished TID 10 in 4929 ms on 192.168.56.100 (progress: 1/2) 14/06/25 13:33:52 INFO DAGScheduler: Completed ResultTask(5, 0) 14/06/25 13:33:52 INFO TaskSetManager: Finished TID 11 in 5285 ms on 192.168.56.100 (progress: 2/2) 14/06/25 13:33:52 INFO TaskSchedulerImpl: Removed TaskSet 5.0, whose tasks have all completed, from pool 14/06/25 13:33:52 INFO DAGScheduler: Completed ResultTask(5, 1) 14/06/25 13:33:52 INFO DAGScheduler: Stage 5 (reduce at joins.scala:219) finished in 5.236 s 14/06/25 13:33:52 INFO SparkContext: Job finished: reduce at joins.scala:219, took 5.306035846 s 14/06/25 13:33:53 INFO SparkContext: Starting job: count at joins.scala:216 14/06/25 13:33:53 INFO DAGScheduler: Got job 6 (count at joins.scala:216) with 4 output partitions (allowLocal=false) 14/06/25 13:33:53 INFO DAGScheduler: Final stage: Stage 6(count at joins.scala:216) 14/06/25 13:33:53 INFO DAGScheduler: Parents of final stage: List() 14/06/25 13:33:53 INFO DAGScheduler: Missing parents: List() 14/06/25 13:33:53 INFO DAGScheduler: Submitting Stage 6 (MappedRDD[34] at map at joins.scala:214), which has no missing parents 14/06/25 13:33:53 INFO DAGScheduler: Submitting 4 missing tasks from Stage 6 (MappedRDD[34] at map at joins.scala:214) 14/06/25 13:33:53 INFO TaskSchedulerImpl: Adding task set 6.0 with 4 tasks 枫 2014/6/25 15:09:08 14/06/25 13:33:53 INFO TaskSetManager: Starting task 6.0:0 as TID 12 on executor 1: 192.168.56.100 (PROCESS_LOCAL) 14/06/25 13:33:53 INFO TaskSetManager: Serialized task 6.0:0 as 17113 bytes in 2 ms 14/06/25 13:33:53 INFO TaskSetManager: Starting task 6.0:1 as TID 13 on executor 0: 192.168.56.100 (PROCESS_LOCAL) 14/06/25 13:33:53 INFO TaskSetManager: Serialized task 6.0:1 as 17113 bytes in 4 ms 14/06/25 13:47:25 INFO TaskSetManager: Starting task 6.0:2 as TID 14 on executor 1: 192.168.56.100 (PROCESS_LOCAL) 14/06/25 13:47:25 INFO TaskSetManager: Serialized task 6.0:2 as 5974 bytes in 101 ms 14/06/25 13:47:25 INFO TaskSetManager: Finished TID 12 in 812264 ms on 192.168.56.100 (progress: 1/4) 14/06/25 13:47:25 INFO DAGScheduler: Completed ResultTask(6, 0) 14/06/25 13:47:25 INFO TaskSetManager: Starting task 6.0:3 as TID 15 on executor 1: 192.168.56.100 (PROCESS_LOCAL) 14/06/25 13:47:25 INFO TaskSetManager: Serialized task 6.0:3 as 5974 bytes in 6 ms 14/06/25 13:47:25 INFO TaskSetManager: Finished TID 14 in 462 ms on 192.168.56.100 (progress: 2/4) 14/06/25 13:47:25 INFO DAGScheduler: Completed ResultTask(6, 2) 14/06/25 13:47:25 INFO DAGScheduler: Completed ResultTask(6, 3) 14/06/25 13:47:25 INFO TaskSetManager: Finished TID 15 in 81 ms on 192.168.56.100 (progress: 3/4) 14/06/25 14:11:07 INFO TaskSetManager: Finished TID 13 in 2234716 ms on 192.168.56.100 (progress: 4/4) 14/06/25 14:11:07 INFO TaskSchedulerImpl: Removed TaskSet 6.0, whose tasks have all completed, from pool 14/06/25 14:11:07 INFO DAGScheduler: Completed ResultTask(6, 1) 14/06/25 14:11:07 INFO DAGScheduler: Stage 6 (count at joins.scala:216) finished in 2234.742 s 14/06/25 14:11:07 INFO SparkContext: Job finished: count at joins.scala:216, took 2234.877469214 s 14/06/25 14:11:08 INFO SparkContext: Starting job: reduce at joins.scala:219 14/06/25 14:11:08 INFO DAGScheduler: Got job 7 (reduce at joins.scala:219) with 4 output partitions (allowLocal=false) 14/06/25 14:11:08 INFO DAGScheduler: Final stage: Stage 7(reduce at joins.scala:219) 14/06/25 14:11:08 INFO DAGScheduler: Parents of final stage: List() 14/06/25 14:11:08 INFO DAGScheduler: Missing parents: List() 14/06/25 14:11:08 INFO DAGScheduler: Submitting Stage 7 (MappedRDD[35] at map at joins.scala:219), which has no missing parents 14/06/25 14:11:08 INFO DAGScheduler: Submitting 4 missing tasks from Stage 7 (MappedRDD[35] at map at joins.scala:219) 14/06/25 14:11:08 INFO TaskSchedulerImpl: Adding task set 7.0 with 4 tasks 14/06/25 14:11:08 INFO TaskSetManager: Starting task 7.0:0 as TID 16 on executor 0: 192.168.56.100 (PROCESS_LOCAL) 14/06/25 14:11:08 INFO TaskSetManager: Serialized task 7.0:0 as 17130 bytes in 10 ms 14/06/25 14:11:08 INFO TaskSetManager: Starting task 7.0:1 as TID 17 on executor 1: 192.168.56.100 (PROCESS_LOCAL) 14/06/25 14:11:08 INFO TaskSetManager: Serialized task 7.0:1 as 17130 bytes in 4 ms -- This message was sent by Atlassian JIRA (v6.2#6252)