Have a look at IntNullReverseTextEdgeInputFormat <https://giraph.apache.org/apidocs/org/apache/giraph/io/formats/IntNullReverseTextEdgeInputFormat.html>. It automatically creates reverse edges, but it expects the file format
<source_id, target_id> on each line. If you need to convert it to use longs you can change the code pretty easily. Best, Matthew On Tue, Mar 10, 2015 at 5:37 AM, Young Han <young....@uwaterloo.ca> wrote: > The input is assumed to be the vertex followed by a set of *directed* > edges. So, in your example, leaving out E2 means that the final graph will > not have the directed edge from V2 to V1. To get an undirected edge, you > need a pair of directed edges. > > Internally, Giraph stores the out-edges of each vertex as an adjacency > list at that vertex. So, for example, your undirected graph becomes a > vertex object V1 with an adjacency list {V2} and a vertex object V2 with an > adjacency list {V1}. The directed graph would be a vertex V1 with adjacency > list {V2} and a vertex V2 with an empty adjacency list {}. > > There's no easy way for Giraph to infer that V2's adjacency list should > contain V1, because V2 does not track its in-edges. To get around this, you > can either (1) use an undirected input file with both pairs of edges > present; (2) have, in your algorithms, all vertices broadcast their ids to > their out-edge neighbours and then perform mutations to add the missing > edges in the first two superstep; or (3) modify the code in > org.apache.giraph.io.* (in giraph-core) to cache and add missing edges > (i.e., add a new "type" of input format). I'm fairly certain that there > doesn't already exist an "assume undirected graph" input reader, but I'm > not too familiar with the code paths and options there so I could be wrong. > > Young > > On Mon, Mar 9, 2015 at 11:54 PM, G.W. <gwindel...@gmail.com> wrote: > >> Hi Giraph Mailing List, >> >> I am writing about an undirected graph I am trying to move to Giraph. I >> have a question about the assumption Giraph makes when processing an input. >> >> Let V1 and V2, two vertices connected with a common edge. E1 defines an >> edge from V1 to V2. E2 defines an edge from V2 to V1. >> >> Simply put, these are defined in an input file as: >> V1, E1 >> V2, E2 >> >> This is working fine, I can process the graph accordingly. >> >> I was trying to see what would happen if I was to simplify the input >> file: >> V1, E1 >> V2 >> >> When would come the time that V2 is processed in a superstep, Giraph >> would not suggest E1 as an outgoing edge. My questions is why, knowing >> that E1 defines an edge from V1 to V2. The graph being undirected (although >> there is no provision for that in my Giraph computation), shouldn't Giraph >> assume that V2 is connected to V1? >> >> Down the road, the idea would be to streamline the input format, hence my >> question. >> >> Thanks! >> >> >> >