Hi,

I'm attempting to use the distributed matrix data structure BlockMatrix
(Spark 1.5.0, Scala) and am running into an error when adding two block
matrices together (full output attached below).

I'm constructing the two matrices by creating a collection of MatrixEntry
objects, building a CoordinateMatrix from them (with nRows and nCols
specified explicitly), and then calling the CoordinateMatrix routine
toBlockMatrix(rowsPerBlock, colsPerBlock). Both matrices use the same
rowsPerBlock and colsPerBlock values.
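
For concreteness, here is a minimal sketch of the construction and the
failing call, assuming sc is the usual SparkContext (e.g. in spark-shell);
the dimensions, block sizes, and entry values are placeholders, not my
actual data:

import org.apache.spark.mllib.linalg.distributed.{BlockMatrix, CoordinateMatrix, MatrixEntry}

// Placeholder dimensions and block sizes (same block sizes for both matrices)
val nRows = 10000L
val nCols = 10000L
val rowsPerBlock = 1024
val colsPerBlock = 1024

// Placeholder entries; the real ones come from my application
val entriesA = sc.parallelize(Seq(MatrixEntry(0, 0, 1.0), MatrixEntry(1, 2, 3.0)))
val entriesB = sc.parallelize(Seq(MatrixEntry(0, 0, 2.0), MatrixEntry(2, 1, 4.0)))

val a: BlockMatrix = new CoordinateMatrix(entriesA, nRows, nCols).toBlockMatrix(rowsPerBlock, colsPerBlock)
val b: BlockMatrix = new CoordinateMatrix(entriesB, nRows, nCols).toBlockMatrix(rowsPerBlock, colsPerBlock)

// This is the call that blows up
val sum = a.add(b)
sum.blocks.count() // force evaluation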

Unfortunately, when I call the BlockMatrix.add routine I get:


15/11/04 10:17:27 ERROR executor.Executor: Exception in task 0.0 in stage 11.0 (TID 30)
java.lang.IllegalArgumentException: requirement failed: The last value of colPtrs must equal the number of elements. values.length: 9164, colPtrs.last: 5118
    at scala.Predef$.require(Predef.scala:233)
    at org.apache.spark.mllib.linalg.SparseMatrix.<init>(Matrices.scala:373)
    at org.apache.spark.mllib.linalg.SparseMatrix.<init>(Matrices.scala:400)
    at org.apache.spark.mllib.linalg.Matrices$.fromBreeze(Matrices.scala:701)
    at org.apache.spark.mllib.linalg.distributed.BlockMatrix$$anonfun$5.apply(BlockMatrix.scala:321)
    at org.apache.spark.mllib.linalg.distributed.BlockMatrix$$anonfun$5.apply(BlockMatrix.scala:310)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
    at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:202)
    at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:56)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:64)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

15/11/04 10:17:27 INFO scheduler.TaskSetManager: Starting task 2.0 in stage 11.0 (TID 32, localhost, PROCESS_LOCAL, 2171 bytes)
15/11/04 10:17:27 INFO executor.Executor: Running task 2.0 in stage 11.0 (TID 32)

15/11/04 10:17:27 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 11.0 (TID 30, localhost): java.lang.IllegalArgumentException: requirement failed: The last value of colPtrs must equal the number of elements. values.length: 9164, colPtrs.last: 5118
    [same stack trace as above]


15/11/04 10:17:27 ERROR scheduler.TaskSetManager: Task 0 in stage 11.0 failed 1 times; aborting job
15/11/04 10:17:27 INFO storage.ShuffleBlockFetcherIterator: Getting 4 non-empty blocks out of 4 blocks
15/11/04 10:17:27 INFO storage.ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
15/11/04 10:17:27 INFO storage.ShuffleBlockFetcherIterator: Getting 2 non-empty blocks out of 4 blocks
15/11/04 10:17:27 INFO storage.ShuffleBlockFetcherIterator: Started 0 remote fetches in 1 ms
15/11/04 10:17:27 INFO scheduler.TaskSchedulerImpl: Cancelling stage 11
15/11/04 10:17:27 INFO executor.Executor: Executor is trying to kill task 1.0 in stage 11.0 (TID 31)
15/11/04 10:17:27 INFO scheduler.TaskSchedulerImpl: Stage 11 was cancelled
15/11/04 10:17:27 INFO executor.Executor: Executor is trying to kill task 2.0 in stage 11.0 (TID 32)
15/11/04 10:17:27 INFO scheduler.DAGScheduler: Stage 11 (map at kmv.scala:26) failed in 0.114 s
15/11/04 10:17:27 INFO executor.Executor: Executor killed task 2.0 in stage 11.0 (TID 32)
15/11/04 10:17:27 INFO scheduler.DAGScheduler: Job 2 failed: reduce at CoordinateMatrix.scala:143, took 6.046350 s

Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 11.0 failed 1 times, most recent failure: Lost task 0.0 in stage 11.0 (TID 30, localhost): java.lang.IllegalArgumentException: requirement failed: The last value of colPtrs must equal the number of elements. values.length: 9164, colPtrs.last: 5118
    [same stack trace as above]


Driver stacktrace:
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1204)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1193)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1192)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1192)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
    at scala.Option.foreach(Option.scala:236)
    at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:693)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1393)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1354)
    at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)

15/11/04 10:17:27 WARN scheduler.TaskSetManager: Lost task 2.0 in stage 11.0 (TID 32, localhost): TaskKilled (killed intentionally)
15/11/04 10:17:27 INFO executor.Executor: Executor killed task 1.0 in stage 11.0 (TID 31)
15/11/04 10:17:27 WARN scheduler.TaskSetManager: Lost task 1.0 in stage 11.0 (TID 31, localhost): TaskKilled (killed intentionally)
15/11/04 10:17:27 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 11.0, whose tasks have all completed, from pool
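
In case it helps frame the question: as I understand it, the require that
fails is checking the CSC invariant on SparseMatrix, namely that the last
entry of colPtrs equals the number of stored values. A tiny local example
of a SparseMatrix that satisfies the invariant (made-up values, just to
illustrate):

import org.apache.spark.mllib.linalg.SparseMatrix

// 3x3 CSC matrix with 4 stored values: colPtrs has numCols + 1 entries,
// and colPtrs.last (4) == values.length (4), which is exactly the
// requirement that fails above with 5118 vs 9164.
val m = new SparseMatrix(3, 3,
  Array(0, 2, 3, 4),            // colPtrs: per-column offsets into rowIndices/values
  Array(0, 2, 1, 0),            // rowIndices within each column
  Array(1.0, 2.0, 3.0, 4.0))    // values

So somewhere in BlockMatrix.add / Matrices.fromBreeze a matrix is being
produced whose values array is longer than its colPtrs accounts for.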

Thanks for any help!
-K
