Hi, I was able to run the shortest path algorithm on a small input, but after I extended the input data to 1 GB, we get the “java heap out of memory” runtime exception.
We only know that Giraph's default mechanism for assigning a vertex to a running Map instance is hash partitioning of the vertex. What we do not know is how this hash partitioning translates into preparing the large input data, so that V1 is assigned to Partition 1 and V2 is assigned to Partition 2. The Giraph runtime would have to agree with such an input data partitioning, so that when Partition 1 is assigned to Map 1 at runtime, a message that V1 sends to V2 is correctly routed to Map 2, which is assigned to process Partition 2.

Can somebody tell us how to prepare the input data (larger than 64 MB, the default partition size) as partitions?

Regards,
Arun
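
P.S. To make the question concrete, here is a minimal sketch of the kind of hash partitioning we have in mind. It is plain Java with no Giraph dependency and is not Giraph's actual code: the class name, `partitionFor`, `workerFor`, the modulo scheme, and the round-robin assignment of partitions to Map instances are all our own assumptions for illustration.

```java
public class HashPartitionSketch {

    // Hypothetical helper: pick a partition for a vertex by hashing its id.
    // Assumption: partition = hash(id) mod partitionCount.
    static int partitionFor(long vertexId, int partitionCount) {
        // floorMod keeps the result in [0, partitionCount) even for negative hashes.
        return Math.floorMod(Long.hashCode(vertexId), partitionCount);
    }

    // Hypothetical helper: assume partitions are handed to Map instances round-robin.
    static int workerFor(int partition, int workerCount) {
        return partition % workerCount;
    }

    public static void main(String[] args) {
        int partitionCount = 4; // assumed value, for illustration only
        int workerCount = 2;    // assumed value, for illustration only

        for (long vertexId = 1; vertexId <= 6; vertexId++) {
            int partition = partitionFor(vertexId, partitionCount);
            int worker = workerFor(partition, workerCount);
            System.out.println("vertex " + vertexId
                    + " -> partition " + partition
                    + " -> map " + worker);
        }
    }
}
```

If the runtime uses a scheme like this, the sender of a message can compute the target partition from the destination vertex id alone. What we cannot tell is whether the input files themselves must be laid out to match it.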