Hi Sabarish,

Works fine for me with less than those settings (30x1000 dense matrix, 1 GB driver, 1 GB executor):
bin/spark-shell --driver-memory 1G --executor-memory 1G

Then running the following finished without trouble, in a few seconds. Are you sure your driver is actually getting the RAM you think you gave it?

import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.linalg.distributed.RowMatrix

// Create a 30x1000 matrix of random dense vectors
val rows = sc.parallelize(1 to 30, 4).map { line =>
  val values = Array.tabulate(1000)(x => scala.math.random)
  Vectors.dense(values)
}.cache()
val mat = new RowMatrix(rows)

// Compute all column similarities exactly, with brute force
val exact = mat.columnSimilarities().entries.map(x => x.value).sum()

On Sun, Mar 1, 2015 at 3:31 PM, Sabarish Sasidharan <
sabarish.sasidha...@manthan.com> wrote:

> I am trying to compute column similarities on a 30x1000 RowMatrix of
> DenseVectors. The size of the input RDD is 3.1 MB and it is all in one
> partition. I am running on a single node with 15 GB, giving the driver
> 1 GB and the executor 9 GB. This is on single-node Hadoop. On the first
> attempt the BlockManager doesn't respond within the heartbeat interval.
> On the second attempt I see a "GC overhead limit exceeded" error, and it
> is almost always in RowMatrix.columnSimilaritiesDIMSUM ->
> mapPartitionsWithIndex (line 570):
>
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>     at org.apache.spark.mllib.linalg.distributed.RowMatrix$$anonfun$19$$anonfun$apply$2.apply(RowMatrix.scala:570)
>     at org.apache.spark.mllib.linalg.distributed.RowMatrix$$anonfun$19$$anonfun$apply$2.apply(RowMatrix.scala:528)
>     at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
>
> It really does seem to be running out of memory.
> I am seeing the following in the attempt log:
>
> Heap
>  PSYoungGen      total 2752512K, used 2359296K
>   eden space 2359296K, 100% used
>   from space  393216K,   0% used
>   to   space  393216K,   0% used
>  ParOldGen       total 6291456K, used 6291376K [0x0000000580000000, 0x0000000700000000, 0x0000000700000000)
>   object space 6291456K, 99% used
>  Metaspace       used 39225K, capacity 39558K, committed 39904K, reserved 1083392K
>   class space    used 5736K, capacity 5794K, committed 5888K, reserved 1048576K
>
> What could be going wrong?
>
> Regards
> Sab
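P.S. Two quick things worth checking in the same spark-shell session, as a sketch (here `mat` is the RowMatrix from the snippet above, and the 0.1 threshold is just an illustrative value, not a recommendation):

```scala
// 1. Confirm the heap the driver JVM actually got; it should be close
//    to the value you passed via --driver-memory. If it isn't, the
//    flag is not reaching the JVM.
println(s"Driver max heap: ${Runtime.getRuntime.maxMemory / (1024 * 1024)} MB")

// 2. Pass a positive threshold to columnSimilarities to use DIMSUM
//    sampling instead of brute force, which trades some accuracy for
//    much less memory and shuffle pressure. (0.1 is an arbitrary
//    example threshold.)
val approx = mat.columnSimilarities(0.1).entries.map(_.value).sum()
```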