Hi Suijian,

Giraph has several PageRank implementations. I suggest that you use org.apache.giraph.examples.PageRankComputation, which automatically checks for convergence and correctly handles dangling vertices (vertices without any outlinks).

It relies on org.apache.giraph.examples.LongDoubleNullTextInputFormat which expects a very simple text file. The format is one line per vertex with the id of the vertex followed by the ids of adjacent vertices:

src_vertex_id dest_vertex_id_1 dest_vertex_id_2 ...

See org.apache.giraph.examples.PageRankComputationTest for an example of how to configure it.

It needs org.apache.giraph.examples.RandomWalkWorkerContext as the worker context and org.apache.giraph.examples.RandomWalkVertexMasterCompute as the master compute class.
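
To make the input format described above concrete, here is a minimal sketch of how one such line breaks down into a source vertex and its outgoing edges. AdjacencyLine is a hypothetical helper written only for illustration; it is not part of Giraph and is not how the input format is implemented. It assumes that ids are long integers (as the class name LongDoubleNullTextInputFormat suggests) and that fields are separated by spaces or tabs.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/**
 * Hypothetical helper (not part of Giraph): parses one line of the form
 * "src_vertex_id dest_vertex_id_1 dest_vertex_id_2 ...".
 * Assumes long ids separated by whitespace.
 */
final class AdjacencyLine {
    final long sourceId;
    final List<Long> targetIds;

    private AdjacencyLine(long sourceId, List<Long> targetIds) {
        this.sourceId = sourceId;
        this.targetIds = targetIds;
    }

    static AdjacencyLine parse(String line) {
        String[] tokens = line.trim().split("\\s+");
        if (tokens[0].isEmpty()) {
            throw new IllegalArgumentException("empty line");
        }
        // The first token is the source vertex id.
        long source = Long.parseLong(tokens[0]);
        // Every remaining token is the id of an adjacent (destination) vertex.
        List<Long> targets = new ArrayList<>();
        for (int i = 1; i < tokens.length; i++) {
            targets.add(Long.parseLong(tokens[i]));
        }
        return new AdjacencyLine(source, Collections.unmodifiableList(targets));
    }
}
```

With this sketch, AdjacencyLine.parse("1 2 3") yields source vertex 1 with edges to vertices 2 and 3, and a line holding only an id yields a vertex without outlinks. Explanatory annotations such as "(v1->v2, v1->v3)" are not numeric ids and would make this parser fail, so under these assumptions they would have to be removed from the file first.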

Best,
Sebastian




On 02/26/2014 09:09 PM, Suijian Zhou wrote:
Hi,
   I want to load a graph in the following format (common in
social network graphs) and compute its PageRank:

Src_vertex_id_1 Dest_vertex_id_2 Dest_vertex_id_3 (v1->v2, v1->v3)
Src_vertex_id_2 Dest_vertex_id_4 Dest_vertex_id_5 Dest_vertex_id_6 (v2->v4, v2->v5, v2->v6)
.....

Do I have to convert the above input format into the following so that it
is compatible with Giraph?

[Src_vertex_id_1, 1, [[Dest_vertex_id_2,0],[Dest_vertex_id_3,0]]]
[Src_vertex_id_2, 1, [[Dest_vertex_id_4,0],[Dest_vertex_id_5,0],[Dest_vertex_id_6,0]]]
......

I.e., should I set the initial vertex values to 1 and the edge values to 0? Thanks!

   Best Regards,
   Suijian

