Hello,
I have a problem running a Hama program on a large dataset. My Hama cluster
has 3 computers: one is the BSPMaster and the other 2 are GroomServers. When
the graph has one million nodes, the PageRank job stops at "Current
Superstep Number:0". The error output is as follows:
attempt_201401141543_0015_000000_0: 14/01/14 17:07:55 ERROR bsp.BSPTask: Shutting down ping service.
attempt_201401141543_0015_000000_0: 14/01/14 17:07:55 FATAL bsp.GroomServer: Error running child
attempt_201401141543_0015_000000_0: java.lang.OutOfMemoryError: Java heap space
attempt_201401141543_0015_000000_0: at java.util.Arrays.copyOf(Arrays.java:2271)
attempt_201401141543_0015_000000_0: at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:113)
attempt_201401141543_0015_000000_0: at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
attempt_201401141543_0015_000000_0: at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:140)
attempt_201401141543_0015_000000_0: at java.io.DataOutputStream.write(DataOutputStream.java:107)
Should I add more computers to the cluster to solve this problem? Is there
any other solution?
I would also like to know how many peers a GroomServer can run. Does a peer
correspond to a node in graph mode? If so, with one million nodes in the
graph and 500 peers per GroomServer, I would need 2000 GroomServers, i.e.
2000 computers, which does not seem practical.
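To spell out the arithmetic behind that estimate, here is a small sketch. It
assumes one peer per graph node, which is exactly the point I am unsure
about. The class and method names are made up for illustration and are not
part of Hama.

```java
public class GroomEstimate {

    // Hypothetical helper (not a Hama API): number of GroomServers needed
    // IF every graph node required its own peer. That one-to-one mapping is
    // an assumption taken from my question, not confirmed Hama behaviour.
    public static long groomsNeeded(long graphNodes, long peersPerGroom) {
        if (graphNodes < 0 || peersPerGroom <= 0) {
            throw new IllegalArgumentException(
                "graphNodes must be >= 0 and peersPerGroom must be > 0");
        }
        // Ceiling division: a partially filled GroomServer still counts.
        return (graphNodes + peersPerGroom - 1) / peersPerGroom;
    }

    public static void main(String[] args) {
        long graphNodes = 1_000_000L; // size of my graph
        long peersPerGroom = 500L;    // figure used in my question
        System.out.println(groomsNeeded(graphNodes, peersPerGroom)); // prints 2000
    }
}
```

If a peer can instead handle many graph nodes, the number of machines would
depend on memory per peer rather than on this ratio.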
By the way, what is an appropriate value for "bsp.tasks.maximum" in the Hama
configuration? Is the number of computers in the cluster an appropriate value?
Thanks!
Best wishes.