I am still debugging this, but I believe that if m% of users have unusually large
columns and the RDD partitioner on RowMatrix is a HashPartitioner, then, because
the basic algorithm does no sampling, some partitions can end up with an
unusually large number of keys...
If my debugging confirms that, I will add a custom partitioner.
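The skew described above can be sketched without Spark at all: under plain hash partitioning, a handful of "heavy" keys land whole on single partitions, so those partitions carry most of the load. The user counts and partition count below are invented purely for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class SkewDemo {
    // Spark-style non-negative modulo of a key's hashCode
    // (mirrors what a HashPartitioner does).
    static int partitionOf(int key, int numPartitions) {
        int mod = Integer.hashCode(key) % numPartitions;
        return mod < 0 ? mod + numPartitions : mod;
    }

    public static void main(String[] args) {
        int numPartitions = 4;
        Map<Integer, Integer> sizes = new HashMap<>();
        // Hypothetical workload: 3 heavy users emit 1000 records each...
        for (int u = 0; u < 3; u++)
            for (int i = 0; i < 1000; i++)
                sizes.merge(partitionOf(u, numPartitions), 1, Integer::sum);
        // ...while 97 light users emit only 10 records each.
        for (int u = 3; u < 100; u++)
            for (int i = 0; i < 10; i++)
                sizes.merge(partitionOf(u, numPartitions), 1, Integer::sum);
        // The partitions holding the heavy keys dominate the total load.
        System.out.println(sizes);
    }
}
```

A custom partitioner (or pre-splitting the heavy keys) would spread those records instead of pinning each heavy user to one partition.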
It may be caused by a GC pause. Did you check the GC time in the Spark
UI? -Xiangrui
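Beyond the per-task GC time column in the Spark UI, GC pauses can be confirmed from executor GC logs. A hedged sketch: `spark.executor.extraJavaOptions` is a real Spark config key, and the flags are standard pre-Java-9 HotSpot options; adjust for your JVM version.

```shell
# Enable GC logging on executors to correlate heartbeat warnings
# with long collection pauses (illustrative; flags assume HotSpot < 9).
spark-submit \
  --conf "spark.executor.extraJavaOptions=-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps" \
  ...
```

Long full-GC pauses in the resulting logs that approach the heartbeat timeout would point to memory pressure rather than a genuinely lost executor.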
On Sun, Feb 15, 2015 at 8:10 PM, Debasish Das debasish.da...@gmail.com wrote:
Hi,
I am sometimes getting WARN from running Similarity calculation:
15/02/15 23:07:55 WARN BlockManagerMasterActor: Removing BlockManager
BlockManagerId(7, abc.com, 48419, 0) with no recent heart beats: 66435ms
exceeds 45000ms
Do I need to increase the default 45 s timeout to a larger value for cases like this?
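For reference, the 45 s default in that warning can be raised via configuration. A hedged sketch, assuming the Spark 1.x property name `spark.storage.blockManagerSlaveTimeoutMs` (verify against your Spark version's configuration docs; the 120 s value is only an example):

```shell
# Illustrative: raise the BlockManager heartbeat timeout to 120 s
# so slow-but-alive executors are not dropped during long GC pauses.
spark-submit \
  --conf spark.storage.blockManagerSlaveTimeoutMs=120000 \
  ...
```

Raising the timeout only masks the symptom, though; if the pauses come from GC or skewed partitions, fixing those is the more durable remedy.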