[ https://issues.apache.org/jira/browse/GIRAPH-141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alessandro Presta updated GIRAPH-141: ------------------------------------- Attachment: GIRAPH-141.patch This patch introduces support for multigraphs in Giraph, by: - changing Vertex#initialize() and Vertex#setEdges() to take a collection of edges instead of a map - adding two vertex implementations (MultiGraphEdgeListVertex and MultiGraphRepresentativeVertex) where addEdge() only appends the edge, instead of checking for duplicates This is not only useful for representing multigraphs, but also for implementing addEdge() efficiently (critical for edge-based input when the out-degree can be really large). Other minor changes: - changed MutableVertex#addEdge() to get an Edge instead of an id and value, mainly for consistency with other code - added PseudoRandomEdgeInputFormat and extended PageRankBenchmark to accept edge-based input - various fixes on the way I ran several benchmarks to compare this patch with the current version, and edge-based input with vertex-based input: Graph: 10M vertices, 1B edges, 10 workers EdgeListVertex + VertexInputFormat, current version: Total (milliseconds) 118,377 Input superstep (milliseconds) 44,041 Superstep 0 (milliseconds) 20,970 Superstep 1 (milliseconds) 23,731 Superstep 2 (milliseconds) 24,269 EdgeListVertex + VertexInputFormat, patched version: Total (milliseconds) 116,298 Input superstep (milliseconds) 40,441 Superstep 0 (milliseconds) 22,444 Superstep 1 (milliseconds) 24,164 Superstep 2 (milliseconds) 25,036 MultiGraphEdgeListVertex + EdgeInputFormat, patched version: Total (milliseconds) 148,905 Input superstep (milliseconds) 72,425 Superstep 0 (milliseconds) 23,684 Superstep 1 (milliseconds) 25,580 Superstep 2 (milliseconds) 22,456 RepresentativeVertex + VertexInputFormat, current version: Total (milliseconds) 111,450 Input superstep (milliseconds) 39,301 Superstep 0 (milliseconds) 21,615 Superstep 1 (milliseconds) 22,213 Superstep 2 (milliseconds) 21,840 RepresentativeVertex + VertexInputFormat, patched version: Total (milliseconds) 106,512 Input superstep (milliseconds) 38,142 Superstep 0 (milliseconds) 20,812 Superstep 1 (milliseconds) 22,906 Superstep 2 (milliseconds) 21,250 MultiGraphRepresentativeVertex + EdgeInputFormat, patched version: Total (milliseconds) 143,831 Input superstep (milliseconds) 75,030 Superstep 0 (milliseconds) 20,661 Superstep 1 (milliseconds) 21,406 Superstep 2 (milliseconds) 22,456 > multigraph support in giraph > ---------------------------- > > Key: GIRAPH-141 > URL: https://issues.apache.org/jira/browse/GIRAPH-141 > Project: Giraph > Issue Type: Improvement > Components: graph > Reporter: André Kelpe > Assignee: Alessandro Presta > Attachments: GIRAPH-141.patch > > > The current vertex API only supports simple graphs, meaning that there can > only ever be one edge between two vertices. Many graphs like the road network > are in fact multigraphs, where many edges can connect two vertices at the > same time. > Support for this could be added by introducing an Iterator<EdgeWritable> > getEdgeValue() or a similar construct. Maybe introducing a slim object like a > Connector between the edge and the vertex is also a good idea, so that you > could do something like: > {code} > for (final Connector<EdgeWritable, VertexWritable> conn: getEdgeValues(){ > final EdgeWritable edge = conn.getEdge(); > final VertexWritable otherVertex = conn.getOther(); > doInterestingStuff(otherVertex); > doMoreInterestingStuff(edge); > } > {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira