I'm using Spark 1.5.1. When I turned on DEBUG, I didn't see anything that looks useful. Other than the INFO output, there is a ton of RPC-message-related logging, plus this bit:
16/01/13 05:53:43 DEBUG ClosureCleaner: +++ Cleaning closure <function1> (org.apache.spark.rdd.RDD$$anonfun$count$1) +++
16/01/13 05:53:43 DEBUG ClosureCleaner: + declared fields: 1
16/01/13 05:53:43 DEBUG ClosureCleaner: public static final long org.apache.spark.rdd.RDD$$anonfun$count$1.serialVersionUID
16/01/13 05:53:43 DEBUG ClosureCleaner: + declared methods: 2
16/01/13 05:53:43 DEBUG ClosureCleaner: public final java.lang.Object org.apache.spark.rdd.RDD$$anonfun$count$1.apply(java.lang.Object)
16/01/13 05:53:43 DEBUG ClosureCleaner: public final long org.apache.spark.rdd.RDD$$anonfun$count$1.apply(scala.collection.Iterator)
16/01/13 05:53:43 DEBUG ClosureCleaner: + inner classes: 0
16/01/13 05:53:43 DEBUG ClosureCleaner: + outer classes: 0
16/01/13 05:53:43 DEBUG ClosureCleaner: + outer objects: 0
16/01/13 05:53:43 DEBUG ClosureCleaner: + populating accessed fields because this is the starting closure
16/01/13 05:53:43 DEBUG ClosureCleaner: + fields accessed by starting closure: 0
16/01/13 05:53:43 DEBUG ClosureCleaner: + there are no enclosing objects!
16/01/13 05:53:43 DEBUG ClosureCleaner: +++ closure <function1> (org.apache.spark.rdd.RDD$$anonfun$count$1) is now cleaned +++
16/01/13 05:53:43 DEBUG ClosureCleaner: +++ Cleaning closure <function2> (org.apache.spark.SparkContext$$anonfun$runJob$5) +++
16/01/13 05:53:43 DEBUG ClosureCleaner: + declared fields: 2
16/01/13 05:53:43 DEBUG ClosureCleaner: public static final long org.apache.spark.SparkContext$$anonfun$runJob$5.serialVersionUID
16/01/13 05:53:43 DEBUG ClosureCleaner: private final scala.Function1 org.apache.spark.SparkContext$$anonfun$runJob$5.cleanedFunc$1
16/01/13 05:53:43 DEBUG ClosureCleaner: + declared methods: 2
16/01/13 05:53:43 DEBUG ClosureCleaner: public final java.lang.Object org.apache.spark.SparkContext$$anonfun$runJob$5.apply(java.lang.Object,java.lang.Object)
16/01/13 05:53:43 DEBUG ClosureCleaner: public final java.lang.Object org.apache.spark.SparkContext$$anonfun$runJob$5.apply(org.apache.spark.TaskContext,scala.collection.Iterator)
16/01/13 05:53:43 DEBUG ClosureCleaner: + inner classes: 0
16/01/13 05:53:43 DEBUG ClosureCleaner: + outer classes: 0
16/01/13 05:53:43 DEBUG ClosureCleaner: + outer objects: 0
16/01/13 05:53:43 DEBUG ClosureCleaner: + populating accessed fields because this is the starting closure
16/01/13 05:53:43 DEBUG ClosureCleaner: + fields accessed by starting closure: 0
16/01/13 05:53:43 DEBUG ClosureCleaner: + there are no enclosing objects!
16/01/13 05:53:43 DEBUG ClosureCleaner: +++ closure <function2> (org.apache.spark.SparkContext$$anonfun$runJob$5) is now cleaned +++
16/01/13 05:53:43 INFO SparkContext: Starting job: count at transposeAvroToAvroChunks.scala:129
16/01/13 05:53:43 INFO DAGScheduler: Got job 3 (count at transposeAvroToAvroChunks.scala:129) with 928 output partitions
16/01/13 05:53:43 INFO DAGScheduler: Final stage: ResultStage 3 (count at transposeAvroToAvroChunks.scala:129)

On Tue, Jan 12, 2016 at 6:41 PM, Ted Yu <yuzhih...@gmail.com> wrote:

> Which release of Spark are you using?
>
> Can you turn on DEBUG logging to see if there is more of a clue?
>
> Thanks
>
> On Tue, Jan 12, 2016 at 6:37 PM, AlexG <swift...@gmail.com> wrote:
>
>> I transpose a matrix (colChunkOfA), stored as a 200-by-54843210 array of
>> rows in Array[Array[Float]] format, into another matrix (rowChunk), also
>> stored row-wise as a 54843210-by-200 Array[Array[Float]], using the
>> following code:
>>
>> val rowChunk = new Array[Tuple2[Int,Array[Float]]](numCols)
>> val colIndices = (0 until colChunkOfA.length).toArray
>>
>> (0 until numCols).foreach( rowIdx => {
>>   rowChunk(rowIdx) = Tuple2(rowIdx, colIndices.map(colChunkOfA(_)(rowIdx)))
>> })
>>
>> This succeeds, but the following code, which attempts to turn rowChunk
>> into an RDD, fails silently: spark-submit just ends, and none of the
>> executor logs indicate any errors occurring.
>>
>> val parallelRowChunkRDD = sc.parallelize(rowChunk).cache
>> parallelRowChunkRDD.count
>>
>> What is the culprit here?
>>
>> Here is the log output starting from the count instruction:
>>
>> 16/01/13 02:23:38 INFO SparkContext: Starting job: count at transposeAvroToAvroChunks.scala:129
>> 16/01/13 02:23:38 INFO DAGScheduler: Got job 3 (count at transposeAvroToAvroChunks.scala:129) with 928 output partitions
>> 16/01/13 02:23:38 INFO DAGScheduler: Final stage: ResultStage 3 (count at transposeAvroToAvroChunks.scala:129)
>> 16/01/13 02:23:38 INFO DAGScheduler: Parents of final stage: List()
>> 16/01/13 02:23:38 INFO DAGScheduler: Missing parents: List()
>> 16/01/13 02:23:38 INFO DAGScheduler: Submitting ResultStage 3 (ParallelCollectionRDD[2448] at parallelize at transposeAvroToAvroChunks.scala:128), which has no missing parents
>> 16/01/13 02:23:38 INFO MemoryStore: ensureFreeSpace(1048) called with curMem=50917367, maxMem=127452201615
>> 16/01/13 02:23:38 INFO MemoryStore: Block broadcast_615 stored as values in memory (estimated size 1048.0 B, free 118.7 GB)
>> 16/01/13 02:23:38 INFO MemoryStore: ensureFreeSpace(740) called with curMem=50918415, maxMem=127452201615
>> 16/01/13 02:23:38 INFO MemoryStore: Block broadcast_615_piece0 stored as bytes in memory (estimated size 740.0 B, free 118.7 GB)
>> 16/01/13 02:23:38 INFO BlockManagerInfo: Added broadcast_615_piece0 in memory on 172.31.36.112:36581 (size: 740.0 B, free: 118.7 GB)
>> 16/01/13 02:23:38 INFO SparkContext: Created broadcast 615 from broadcast at DAGScheduler.scala:861
>> 16/01/13 02:23:38 INFO DAGScheduler: Submitting 928 missing tasks from ResultStage 3 (ParallelCollectionRDD[2448] at parallelize at transposeAvroToAvroChunks.scala:128)
>> 16/01/13 02:23:38 INFO TaskSchedulerImpl: Adding task set 3.0 with 928 tasks
>> 16/01/13 02:23:39 WARN TaskSetManager: Stage 3 contains a task of very large size (47027 KB). The maximum recommended task size is 100 KB.
>> 16/01/13 02:23:39 INFO TaskSetManager: Starting task 0.0 in stage 3.0 (TID 1219, 172.31.34.184, PROCESS_LOCAL, 48156290 bytes)
>> ...
>> 16/01/13 02:27:13 INFO TaskSetManager: Starting task 927.0 in stage 3.0 (TID 2146, 172.31.42.67, PROCESS_LOCAL, 48224789 bytes)
>> 16/01/13 02:27:17 INFO BlockManagerInfo: Removed broadcast_419_piece0 on 172.31.36.112:36581 in memory (size: 17.4 KB, free: 118.7 GB)
>> 16/01/13 02:27:21 INFO BlockManagerInfo: Removed broadcast_419_piece0 on 172.31.35.157:51059 in memory (size: 17.4 KB, free: 10.4 GB)
>> 16/01/13 02:27:21 INFO BlockManagerInfo: Removed broadcast_419_piece0 on 172.31.47.118:34888 in memory (size: 17.4 KB, free: 10.4 GB)
>> 16/01/13 02:27:22 INFO BlockManagerInfo: Removed broadcast_419_piece0 on 172.31.38.42:48582 in memory (size: 17.4 KB, free: 10.4 GB)
>> 16/01/13 02:27:38 INFO BlockManagerInfo: Added broadcast_615_piece0 in memory on 172.31.41.68:59281 (size: 740.0 B, free: 10.4 GB)
>> 16/01/13 02:27:55 INFO BlockManagerInfo: Added broadcast_615_piece0 in memory on 172.31.47.118:59575 (size: 740.0 B, free: 10.4 GB)
>> 16/01/13 02:28:47 INFO BlockManagerInfo: Added broadcast_615_piece0 in memory on 172.31.40.24:55643 (size: 740.0 B, free: 10.4 GB)
>> 16/01/13 02:28:49 INFO BlockManagerInfo: Added broadcast_615_piece0 in memory on 172.31.47.118:53671 (size: 740.0 B, free: 10.4 GB)
>>
>> This is the end of the log, so it looks like all 928 tasks got started,
>> but presumably they ran into an error somewhere while running. Nothing
>> shows up in the executor logs.
>>
>> --
>> View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/failure-to-parallelize-an-RDD-tp25950.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>> For additional commands, e-mail: user-h...@spark.apache.org
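
[Editor's note] The most telling line in the quoted log is the WARN: "Stage 3 contains a task of very large size (47027 KB)." With sc.parallelize, the driver serializes slices of the local collection directly into the task binaries, so here 928 tasks of roughly 47 MB each (about 43 GB total) must be shipped from the driver. One common pattern for avoiding oversized tasks is to parallelize only small indices and build the heavy rows on the executors from a broadcast variable. The sketch below illustrates that pattern using the thread's names (sc, colChunkOfA, numCols); the explicit partition count of 928 is an assumption, and note that broadcasting this particular chunk (200 x 54,843,210 floats, roughly 44 GB) would itself be impractical, so this is a sketch of the technique, not a drop-in fix for data of this size.

```scala
// Sketch only: assumes a live SparkContext `sc` and the original
// colChunkOfA: Array[Array[Float]] from the thread. Broadcasting ships
// the data once per executor instead of once per task.
val colChunkBroadcast = sc.broadcast(colChunkOfA)

// Parallelize just the row indices (tiny tasks), then construct each
// transposed row on the executors from the broadcast column chunk.
val parallelRowChunkRDD = sc.parallelize(0 until numCols, 928).map { rowIdx =>
  val chunk = colChunkBroadcast.value
  (rowIdx, Array.tabulate(chunk.length)(colIdx => chunk(colIdx)(rowIdx)))
}.cache()

parallelRowChunkRDD.count()
```

For data this large, a better fit than either approach would be to keep the matrix in an RDD end to end (e.g. read the column chunks as an RDD and transpose with a shuffle), so the full matrix never sits on the driver at all.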