I am trying to compute column similarities on a 30x1000 RowMatrix of
DenseVectors. The input RDD is 3.1MB and it's all in one partition. I am
running on a single node with 15G of memory, giving the driver 1G and the
executor 9G. This is a single-node Hadoop setup.
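
For reference, here is a minimal sketch of roughly what I am running (the
app name, the random fill, and the 0.1 threshold are placeholders for my
actual data and settings; the job is submitted with --driver-memory 1g and
--executor-memory 9g):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.linalg.distributed.RowMatrix

import scala.util.Random

object ColSimRepro {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("ColSimRepro")
    val sc = new SparkContext(conf)

    // 30 rows x 1000 columns of DenseVectors, all in a single partition
    val rows = sc.parallelize(
      Seq.fill(30)(Vectors.dense(Array.fill(1000)(Random.nextDouble()))),
      numSlices = 1)

    val mat = new RowMatrix(rows)

    // columnSimilarities goes through the DIMSUM sampling path
    // (columnSimilaritiesDIMSUM), which is where the OOM shows up
    val sims = mat.columnSimilarities(0.1)
    sims.entries.count()

    sc.stop()
  }
}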
On the first attempt the BlockManager doesn't respond within the heartbeat
interval. On the second attempt I am seeing a GC overhead limit exceeded
error, and it is almost always in RowMatrix.columnSimilaritiesDIMSUM ->
mapPartitionsWithIndex (line 570):
java.lang.OutOfMemoryError: GC overhead limit exceeded
        at org.apache.spark.mllib.linalg.distributed.RowMatrix$$anonfun$19$$anonfun$apply$2.apply(RowMatrix.scala:570)
        at org.apache.spark.mllib.linalg.distributed.RowMatrix$$anonfun$19$$anonfun$apply$2.apply(RowMatrix.scala:528)
        at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
It really does seem to be running out of memory; I am seeing the following
in the attempt log:
Heap
 PSYoungGen      total 2752512K, used 2359296K
  eden space 2359296K, 100% used
  from space  393216K,   0% used
  to   space  393216K,   0% used
 ParOldGen       total 6291456K, used 6291376K [0x0000000580000000, 0x0000000700000000, 0x0000000700000000)
  object space 6291456K, 99% used
 Metaspace       used 39225K, capacity 39558K, committed 39904K, reserved 1083392K
  class space    used 5736K, capacity 5794K, committed 5888K, reserved 1048576K
What could be going wrong?
Regards
Sab