Hi,

I'm using the distributed matrix data structure BlockMatrix (Spark 1.5.0,
Scala) and running into an issue when adding two block matrices together
(error log attached below).

I'm constructing the two matrices by creating a collection of MatrixEntry
objects, putting them into a CoordinateMatrix (explicitly specifying the
number of rows and columns), and then calling
CoordinateMatrix.toBlockMatrix(rowsPerBlock, colsPerBlock). Both matrices
use the same rowsPerBlock and colsPerBlock; a sketch of the construction
follows.
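
Roughly (entriesA/entriesB and the dimensions here are placeholders
standing in for my real data; sc is the SparkContext):

    import org.apache.spark.mllib.linalg.distributed.{BlockMatrix, CoordinateMatrix, MatrixEntry}
    import org.apache.spark.rdd.RDD

    // Placeholder entries; my real RDDs are built the same way.
    val entriesA: RDD[MatrixEntry] = sc.parallelize(Seq(
      MatrixEntry(0, 0, 1.0), MatrixEntry(1, 2, 2.0)))
    val entriesB: RDD[MatrixEntry] = sc.parallelize(Seq(
      MatrixEntry(0, 0, 3.0), MatrixEntry(1, 2, 4.0)))

    val nRows = 1000L        // same row/column counts for both matrices
    val nCols = 1000L
    val rowsPerBlock = 100   // identical block sizes for both matrices
    val colsPerBlock = 100

    val matA: BlockMatrix = new CoordinateMatrix(entriesA, nRows, nCols)
      .toBlockMatrix(rowsPerBlock, colsPerBlock)
    val matB: BlockMatrix = new CoordinateMatrix(entriesB, nRows, nCols)
      .toBlockMatrix(rowsPerBlock, colsPerBlock)

    val sum = matA.add(matB) // the exception below is thrown here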

Unfortunately, when I call BlockMatrix.add I get:


15/11/04 10:17:27 ERROR executor.Executor: Exception in task 0.0 in stage 11.0 (TID 30)
java.lang.IllegalArgumentException: requirement failed: The last value of colPtrs must equal the number of elements. values.length: 9164, colPtrs.last: 5118
        at scala.Predef$.require(Predef.scala:233)
        at org.apache.spark.mllib.linalg.SparseMatrix.<init>(Matrices.scala:373)
        at org.apache.spark.mllib.linalg.SparseMatrix.<init>(Matrices.scala:400)
        at org.apache.spark.mllib.linalg.Matrices$.fromBreeze(Matrices.scala:701)
        at org.apache.spark.mllib.linalg.distributed.BlockMatrix$$anonfun$5.apply(BlockMatrix.scala:321)
        at org.apache.spark.mllib.linalg.distributed.BlockMatrix$$anonfun$5.apply(BlockMatrix.scala:310)
        at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
        at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
        at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
        at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:202)
        at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:56)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
        at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
        at org.apache.spark.scheduler.Task.run(Task.scala:64)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

15/11/04 10:17:27 INFO scheduler.TaskSetManager: Starting task 2.0 in stage 11.0 (TID 32, localhost, PROCESS_LOCAL, 2171 bytes)
15/11/04 10:17:27 INFO executor.Executor: Running task 2.0 in stage 11.0 (TID 32)
15/11/04 10:17:27 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 11.0 (TID 30, localhost): java.lang.IllegalArgumentException: requirement failed: The last value of colPtrs must equal the number of elements. values.length: 9164, colPtrs.last: 5118

        [same stack trace as above]



15/11/04 10:17:27 ERROR scheduler.TaskSetManager: Task 0 in stage 11.0 failed 1 times; aborting job
15/11/04 10:17:27 INFO storage.ShuffleBlockFetcherIterator: Getting 4 non-empty blocks out of 4 blocks
15/11/04 10:17:27 INFO storage.ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
15/11/04 10:17:27 INFO storage.ShuffleBlockFetcherIterator: Getting 2 non-empty blocks out of 4 blocks
15/11/04 10:17:27 INFO storage.ShuffleBlockFetcherIterator: Started 0 remote fetches in 1 ms
15/11/04 10:17:27 INFO scheduler.TaskSchedulerImpl: Cancelling stage 11
15/11/04 10:17:27 INFO executor.Executor: Executor is trying to kill task 1.0 in stage 11.0 (TID 31)
15/11/04 10:17:27 INFO scheduler.TaskSchedulerImpl: Stage 11 was cancelled
15/11/04 10:17:27 INFO executor.Executor: Executor is trying to kill task 2.0 in stage 11.0 (TID 32)
15/11/04 10:17:27 INFO scheduler.DAGScheduler: Stage 11 (map at kmv.scala:26) failed in 0.114 s
15/11/04 10:17:27 INFO executor.Executor: Executor killed task 2.0 in stage 11.0 (TID 32)
15/11/04 10:17:27 INFO scheduler.DAGScheduler: Job 2 failed: reduce at CoordinateMatrix.scala:143, took 6.046350 s

Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 11.0 failed 1 times, most recent failure: Lost task 0.0 in stage 11.0 (TID 30, localhost): java.lang.IllegalArgumentException: requirement failed: The last value of colPtrs must equal the number of elements. values.length: 9164, colPtrs.last: 5118

        [same stack trace as above]



Driver stacktrace:
        at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1204)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1193)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1192)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
        at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1192)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:693)
        at scala.Option.foreach(Option.scala:236)
        at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:693)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1393)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1354)
        at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)

15/11/04 10:17:27 WARN scheduler.TaskSetManager: Lost task 2.0 in stage 11.0 (TID 32, localhost): TaskKilled (killed intentionally)
15/11/04 10:17:27 INFO executor.Executor: Executor killed task 1.0 in stage 11.0 (TID 31)
15/11/04 10:17:27 WARN scheduler.TaskSetManager: Lost task 1.0 in stage 11.0 (TID 31, localhost): TaskKilled (killed intentionally)
15/11/04 10:17:27 INFO scheduler.TaskSchedulerImpl: Removed TaskSet 11.0, whose tasks have all completed, from pool
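
In case it helps diagnose things, my reading of the failing requirement:
each block is a SparseMatrix in compressed sparse column (CSC) form, and
its constructor requires colPtrs.last == values.length, which here is
5118 vs. 9164. A tiny hand-built SparseMatrix that does satisfy the
invariant, just to show what the check expects:

    import org.apache.spark.mllib.linalg.SparseMatrix

    // 2x2 matrix [[1.0, 0.0], [0.0, 2.0]] in CSC form.
    // colPtrs = (0, 1, 2): column 0 holds values(0), column 1 holds values(1),
    // and colPtrs.last == values.length == 2, so the requirement holds.
    val m = new SparseMatrix(2, 2, Array(0, 1, 2), Array(0, 1), Array(1.0, 2.0))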


Thanks for any help!



