Hi,

Spark: 1.6.1
Java: 1.8_u40

I am running the training phase of a random forest. The same code works well with 20 trees (though accuracy is lower, about 68%). When I increased the number of trees to 200, the run ended with:

"DAGScheduler: Job 20 finished: collectAsMap at DecisionTree.scala:651, took 19.556700 s
Killed"

There is no WARN or ERROR on the console; the job simply stops at the end. Any idea how to resolve this? Should a timeout parameter be set to a longer value?

Regards

(Below is the log from the console.)

16/07/26 00:02:47 INFO DAGScheduler: looking for newly runnable stages
16/07/26 00:02:47 INFO DAGScheduler: running: Set()
16/07/26 00:02:47 INFO DAGScheduler: waiting: Set(ResultStage 32)
16/07/26 00:02:47 INFO DAGScheduler: failed: Set()
16/07/26 00:02:47 INFO DAGScheduler: Submitting ResultStage 32 (MapPartitionsRDD[75] at map at DecisionTree.scala:642), which has no missing parents
16/07/26 00:02:47 INFO MemoryStore: Block broadcast_48 stored as values in memory (estimated size 2.2 MB, free 18.2 MB)
16/07/26 00:02:47 INFO MemoryStore: Block broadcast_48_piece0 stored as bytes in memory (estimated size 436.9 KB, free 18.7 MB)
16/07/26 00:02:47 INFO BlockManagerInfo: Added broadcast_48_piece0 in memory on x.x.x.x:35450 (size: 436.9 KB, free: 45.8 GB)
16/07/26 00:02:47 INFO SparkContext: Created broadcast 48 from broadcast at DAGScheduler.scala:1006
16/07/26 00:02:47 INFO DAGScheduler: Submitting 4 missing tasks from ResultStage 32 (MapPartitionsRDD[75] at map at DecisionTree.scala:642)
16/07/26 00:02:47 INFO TaskSchedulerImpl: Adding task set 32.0 with 4 tasks
16/07/26 00:02:47 INFO TaskSetManager: Starting task 0.0 in stage 32.0 (TID 185, x.x.x.x, partition 0,NODE_LOCAL, 1956 bytes)
16/07/26 00:02:47 INFO TaskSetManager: Starting task 1.0 in stage 32.0 (TID 186, x.x.x.x, partition 1,NODE_LOCAL, 1956 bytes)
16/07/26 00:02:47 INFO TaskSetManager: Starting task 2.0 in stage 32.0 (TID 187, x.x.x.x, partition 2,NODE_LOCAL, 1956 bytes)
16/07/26 00:02:47 INFO TaskSetManager: Starting task 3.0 in stage 32.0 (TID 188, x.x.x.x, partition 3,NODE_LOCAL, 1956 bytes)
16/07/26 00:02:47 INFO BlockManagerInfo: Added broadcast_48_piece0 in memory on x.x.x.x:58784 (size: 436.9 KB, free: 5.1 GB)
16/07/26 00:02:47 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 12 to x.x.x.x:44434
16/07/26 00:02:47 INFO MapOutputTrackerMaster: Size of output statuses for shuffle 12 is 180 bytes
16/07/26 00:02:47 INFO BlockManagerInfo: Added broadcast_48_piece0 in memory on x.x.x.x:46186 (size: 436.9 KB, free: 2.2 GB)
16/07/26 00:02:47 INFO BlockManagerInfo: Added broadcast_48_piece0 in memory on x.x.x.x:50132 (size: 436.9 KB, free: 5.0 GB)
16/07/26 00:02:47 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 12 to x.x.x.x:47272
16/07/26 00:02:47 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 12 to x.x.x.x:46802
16/07/26 00:02:49 INFO TaskSetManager: Finished task 2.0 in stage 32.0 (TID 187) in 2265 ms on x.x.x.x (1/4)
16/07/26 00:02:49 INFO TaskSetManager: Finished task 1.0 in stage 32.0 (TID 186) in 2266 ms on x.x.x.x (2/4)
16/07/26 00:02:50 INFO TaskSetManager: Finished task 0.0 in stage 32.0 (TID 185) in 2794 ms on x.x.x.x (3/4)
16/07/26 00:02:50 INFO TaskSetManager: Finished task 3.0 in stage 32.0 (TID 188) in 3738 ms on x.x.x.x (4/4)
16/07/26 00:02:50 INFO TaskSchedulerImpl: Removed TaskSet 32.0, whose tasks have all completed, from pool
16/07/26 00:02:50 INFO DAGScheduler: ResultStage 32 (collectAsMap at DecisionTree.scala:651) finished in 3.738 s
16/07/26 00:02:50 INFO DAGScheduler: Job 19 finished: collectAsMap at DecisionTree.scala:651, took 19.493917 s
16/07/26 00:02:51 INFO MemoryStore: Block broadcast_49 stored as values in memory (estimated size 1053.9 KB, free 19.7 MB)
16/07/26 00:02:52 INFO MemoryStore: Block broadcast_49_piece0 stored as bytes in memory (estimated size 626.7 KB, free 20.3 MB)
16/07/26 00:02:52 INFO BlockManagerInfo: Added broadcast_49_piece0 in memory on x.x.x.x:35450 (size: 626.7 KB, free: 45.8 GB)
16/07/26 00:02:52 INFO SparkContext: Created broadcast 49 from broadcast at DecisionTree.scala:601
16/07/26 00:02:52 INFO SparkContext: Starting job: collectAsMap at DecisionTree.scala:651
16/07/26 00:02:52 INFO DAGScheduler: Registering RDD 76 (mapPartitions at DecisionTree.scala:622)
16/07/26 00:02:52 INFO DAGScheduler: Got job 20 (collectAsMap at DecisionTree.scala:651) with 4 output partitions
16/07/26 00:02:52 INFO DAGScheduler: Final stage: ResultStage 34 (collectAsMap at DecisionTree.scala:651)
16/07/26 00:02:52 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 33)
16/07/26 00:02:52 INFO DAGScheduler: Missing parents: List(ShuffleMapStage 33)
16/07/26 00:02:52 INFO DAGScheduler: Submitting ShuffleMapStage 33 (MapPartitionsRDD[76] at mapPartitions at DecisionTree.scala:622), which has no missing parents
16/07/26 00:02:52 INFO MemoryStore: Block broadcast_50 stored as values in memory (estimated size 10.0 MB, free 30.3 MB)
16/07/26 00:02:52 INFO MemoryStore: Block broadcast_50_piece0 stored as bytes in memory (estimated size 2.9 MB, free 33.2 MB)
16/07/26 00:02:52 INFO BlockManagerInfo: Added broadcast_50_piece0 in memory on x.x.x.x:35450 (size: 2.9 MB, free: 45.8 GB)
16/07/26 00:02:52 INFO SparkContext: Created broadcast 50 from broadcast at DAGScheduler.scala:1006
16/07/26 00:02:52 INFO DAGScheduler: Submitting 4 missing tasks from ShuffleMapStage 33 (MapPartitionsRDD[76] at mapPartitions at DecisionTree.scala:622)
16/07/26 00:02:52 INFO TaskSchedulerImpl: Adding task set 33.0 with 4 tasks
16/07/26 00:02:52 INFO TaskSetManager: Starting task 1.0 in stage 33.0 (TID 189, x.x.x.x, partition 1,PROCESS_LOCAL, 2333 bytes)
16/07/26 00:02:52 INFO TaskSetManager: Starting task 0.0 in stage 33.0 (TID 190, x.x.x.x, partition 0,PROCESS_LOCAL, 2333 bytes)
16/07/26 00:02:52 INFO TaskSetManager: Starting task 2.0 in stage 33.0 (TID 191, x.x.x.x, partition 2,PROCESS_LOCAL, 2333 bytes)
16/07/26 00:02:52 INFO TaskSetManager: Starting task 3.0 in stage 33.0 (TID 192, x.x.x.x, partition 3,PROCESS_LOCAL, 2333 bytes)
16/07/26 00:02:53 INFO BlockManagerInfo: Added broadcast_50_piece0 in memory on x.x.x.x:58784 (size: 2.9 MB, free: 5.0 GB)
16/07/26 00:02:53 INFO BlockManagerInfo: Added broadcast_49_piece0 in memory on x.x.x.x:58784 (size: 626.7 KB, free: 5.0 GB)
16/07/26 00:02:53 INFO BlockManagerInfo: Added broadcast_50_piece0 in memory on x.x.x.x:46186 (size: 2.9 MB, free: 2.2 GB)
16/07/26 00:02:53 INFO BlockManagerInfo: Added broadcast_50_piece0 in memory on x.x.x.x:50132 (size: 2.9 MB, free: 5.0 GB)
16/07/26 00:02:53 INFO BlockManagerInfo: Added broadcast_49_piece0 in memory on x.x.x.x:46186 (size: 626.7 KB, free: 2.2 GB)
16/07/26 00:02:53 INFO BlockManagerInfo: Added broadcast_49_piece0 in memory on x.x.x.x:50132 (size: 626.7 KB, free: 5.0 GB)
16/07/26 00:02:57 INFO TaskSetManager: Finished task 0.0 in stage 33.0 (TID 190) in 4212 ms on x.x.x.x (1/4)
16/07/26 00:02:57 INFO TaskSetManager: Finished task 1.0 in stage 33.0 (TID 189) in 4989 ms on x.x.x.x (2/4)
16/07/26 00:03:07 INFO TaskSetManager: Finished task 2.0 in stage 33.0 (TID 191) in 14934 ms on x.x.x.x (3/4)
16/07/26 00:03:07 INFO TaskSetManager: Finished task 3.0 in stage 33.0 (TID 192) in 15172 ms on x.x.x.x (4/4)
16/07/26 00:03:07 INFO TaskSchedulerImpl: Removed TaskSet 33.0, whose tasks have all completed, from pool
16/07/26 00:03:07 INFO DAGScheduler: ShuffleMapStage 33 (mapPartitions at DecisionTree.scala:622) finished in 15.173 s
16/07/26 00:03:07 INFO DAGScheduler: looking for newly runnable stages
16/07/26 00:03:07 INFO DAGScheduler: running: Set()
16/07/26 00:03:07 INFO DAGScheduler: waiting: Set(ResultStage 34)
16/07/26 00:03:07 INFO DAGScheduler: failed: Set()
16/07/26 00:03:07 INFO DAGScheduler: Submitting ResultStage 34 (MapPartitionsRDD[78] at map at DecisionTree.scala:642), which has no missing parents
16/07/26 00:03:08 INFO MemoryStore: Block broadcast_51 stored as values in memory (estimated size 2.2 MB, free 35.4 MB)
16/07/26 00:03:08 INFO MemoryStore: Block broadcast_51_piece0 stored as bytes in memory (estimated size 444.7 KB, free 35.8 MB)
16/07/26 00:03:08 INFO BlockManagerInfo: Added broadcast_51_piece0 in memory on x.x.x.x:35450 (size: 444.7 KB, free: 45.8 GB)
16/07/26 00:03:08 INFO SparkContext: Created broadcast 51 from broadcast at DAGScheduler.scala:1006
16/07/26 00:03:08 INFO DAGScheduler: Submitting 4 missing tasks from ResultStage 34 (MapPartitionsRDD[78] at map at DecisionTree.scala:642)
16/07/26 00:03:08 INFO TaskSchedulerImpl: Adding task set 34.0 with 4 tasks
16/07/26 00:03:08 INFO TaskSetManager: Starting task 0.0 in stage 34.0 (TID 193, x.x.x.x, partition 0,NODE_LOCAL, 1956 bytes)
16/07/26 00:03:08 INFO TaskSetManager: Starting task 1.0 in stage 34.0 (TID 194, x.x.x.x, partition 1,NODE_LOCAL, 1956 bytes)
16/07/26 00:03:08 INFO TaskSetManager: Starting task 2.0 in stage 34.0 (TID 195, x.x.x.x, partition 2,NODE_LOCAL, 1956 bytes)
16/07/26 00:03:08 INFO TaskSetManager: Starting task 3.0 in stage 34.0 (TID 196, x.x.x.x, partition 3,NODE_LOCAL, 1956 bytes)
16/07/26 00:03:08 INFO BlockManagerInfo: Added broadcast_51_piece0 in memory on x.x.x.x:58784 (size: 444.7 KB, free: 5.0 GB)
16/07/26 00:03:08 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 13 to x.x.x.x:44434
16/07/26 00:03:08 INFO MapOutputTrackerMaster: Size of output statuses for shuffle 13 is 180 bytes
16/07/26 00:03:08 INFO BlockManagerInfo: Added broadcast_51_piece0 in memory on x.x.x.x:46186 (size: 444.7 KB, free: 2.2 GB)
16/07/26 00:03:08 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 13 to x.x.x.x:47272
16/07/26 00:03:08 INFO BlockManagerInfo: Added broadcast_51_piece0 in memory on x.x.x.x:50132 (size: 444.7 KB, free: 5.0 GB)
16/07/26 00:03:08 INFO MapOutputTrackerMasterEndpoint: Asked to send map output locations for shuffle 13 to x.x.x.x:46802
16/07/26 00:03:10 INFO TaskSetManager: Finished task 1.0 in stage 34.0 (TID 194) in 2240 ms on x.x.x.x (1/4)
16/07/26 00:03:10 INFO TaskSetManager: Finished task 0.0 in stage 34.0 (TID 193) in 2749 ms on x.x.x.x (2/4)
16/07/26 00:03:11 INFO TaskSetManager: Finished task 2.0 in stage 34.0 (TID 195) in 3818 ms on x.x.x.x (3/4)
16/07/26 00:03:11 INFO TaskSetManager: Finished task 3.0 in stage 34.0 (TID 196) in 3901 ms on x.x.x.x (4/4)
16/07/26 00:03:11 INFO DAGScheduler: ResultStage 34 (collectAsMap at DecisionTree.scala:651) finished in 3.902 s
16/07/26 00:03:11 INFO TaskSchedulerImpl: Removed TaskSet 34.0, whose tasks have all completed, from pool
16/07/26 00:03:11 INFO DAGScheduler: Job 20 finished: collectAsMap at DecisionTree.scala:651, took 19.556700 s
Killed