[ https://issues.apache.org/jira/browse/GIRAPH-141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alessandro Presta updated GIRAPH-141:
-------------------------------------
Attachment: GIRAPH-141.patch
This patch introduces support for multigraphs in Giraph by:
- changing Vertex#initialize() and Vertex#setEdges() to take a collection of
  edges instead of a map
- adding two vertex implementations (MultiGraphEdgeListVertex and
  MultiGraphRepresentativeVertex) whose addEdge() simply appends the edge
  instead of checking for duplicates
Besides representing multigraphs, this also makes addEdge() efficient, which
is critical for edge-based input when the out-degree can be very large.
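To illustrate the difference between the two addEdge() behaviours, here is a
minimal sketch. It is not the code from the patch: SimpleEdge,
DedupEdgeList and AppendOnlyEdgeList are hypothetical stand-ins, and the
duplicate check is assumed to be a linear scan over a list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an edge: a target vertex id plus an edge value.
final class SimpleEdge {
    final long targetId;
    final double value;

    SimpleEdge(long targetId, double value) {
        this.targetId = targetId;
        this.value = value;
    }
}

// Simple-graph behaviour (assumed): scan the existing edges and reject a
// second edge to the same target. Each call costs O(out-degree).
final class DedupEdgeList {
    private final List<SimpleEdge> edges = new ArrayList<>();

    boolean addEdge(SimpleEdge edge) {
        for (SimpleEdge existing : edges) {
            if (existing.targetId == edge.targetId) {
                return false; // duplicate target, edge not added
            }
        }
        return edges.add(edge);
    }

    int size() {
        return edges.size();
    }
}

// Multigraph behaviour: append without looking at existing edges.
// Each call is amortized O(1), and parallel edges are kept.
final class AppendOnlyEdgeList {
    private final List<SimpleEdge> edges = new ArrayList<>();

    boolean addEdge(SimpleEdge edge) {
        return edges.add(edge);
    }

    int size() {
        return edges.size();
    }
}

final class AddEdgeSketch {
    public static void main(String[] args) {
        DedupEdgeList simple = new DedupEdgeList();
        AppendOnlyEdgeList multi = new AppendOnlyEdgeList();
        // Two parallel edges to vertex 2 with different values.
        simple.addEdge(new SimpleEdge(2L, 1.0));
        simple.addEdge(new SimpleEdge(2L, 5.0));
        multi.addEdge(new SimpleEdge(2L, 1.0));
        multi.addEdge(new SimpleEdge(2L, 5.0));
        System.out.println(simple.size()); // prints 1
        System.out.println(multi.size());  // prints 2
    }
}
```

With a scan-based check, loading d edges for one vertex costs on the order of
d^2 comparisons, whereas the append-only variant stays linear in d.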
Other minor changes:
- changed MutableVertex#addEdge() to take an Edge instead of an id and a value,
  mainly for consistency with other code
- added PseudoRandomEdgeInputFormat and extended PageRankBenchmark to accept
  edge-based input
- various fixes along the way
I ran several benchmarks to compare the patched version with the current one,
and edge-based input with vertex-based input:
Graph: 10M vertices, 1B edges, 10 workers

EdgeListVertex + VertexInputFormat, current version:
  Total (milliseconds)             118,377
  Input superstep (milliseconds)    44,041
  Superstep 0 (milliseconds)        20,970
  Superstep 1 (milliseconds)        23,731
  Superstep 2 (milliseconds)        24,269

EdgeListVertex + VertexInputFormat, patched version:
  Total (milliseconds)             116,298
  Input superstep (milliseconds)    40,441
  Superstep 0 (milliseconds)        22,444
  Superstep 1 (milliseconds)        24,164
  Superstep 2 (milliseconds)        25,036

MultiGraphEdgeListVertex + EdgeInputFormat, patched version:
  Total (milliseconds)             148,905
  Input superstep (milliseconds)    72,425
  Superstep 0 (milliseconds)        23,684
  Superstep 1 (milliseconds)        25,580
  Superstep 2 (milliseconds)        22,456

RepresentativeVertex + VertexInputFormat, current version:
  Total (milliseconds)             111,450
  Input superstep (milliseconds)    39,301
  Superstep 0 (milliseconds)        21,615
  Superstep 1 (milliseconds)        22,213
  Superstep 2 (milliseconds)        21,840

RepresentativeVertex + VertexInputFormat, patched version:
  Total (milliseconds)             106,512
  Input superstep (milliseconds)    38,142
  Superstep 0 (milliseconds)        20,812
  Superstep 1 (milliseconds)        22,906
  Superstep 2 (milliseconds)        21,250

MultiGraphRepresentativeVertex + EdgeInputFormat, patched version:
  Total (milliseconds)             143,831
  Input superstep (milliseconds)    75,030
  Superstep 0 (milliseconds)        20,661
  Superstep 1 (milliseconds)        21,406
  Superstep 2 (milliseconds)        22,456
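
One rough way to read these numbers is to compare the input superstep of the
edge-based runs with that of the vertex-based runs on the patched version. The
sketch below only does that arithmetic on the figures listed above; the class
and method names are made up for illustration.

```java
import java.util.Locale;

final class InputOverheadSketch {
    // Ratio of edge-based to vertex-based input superstep time.
    static double ratio(long edgeInputMs, long vertexInputMs) {
        return (double) edgeInputMs / vertexInputMs;
    }

    public static void main(String[] args) {
        // MultiGraphEdgeListVertex (72,425 ms) vs EdgeListVertex (40,441 ms)
        System.out.printf(Locale.ROOT, "EdgeListVertex: %.2fx%n",
            ratio(72_425, 40_441)); // prints "EdgeListVertex: 1.79x"
        // MultiGraphRepresentativeVertex (75,030 ms) vs RepresentativeVertex (38,142 ms)
        System.out.printf(Locale.ROOT, "RepresentativeVertex: %.2fx%n",
            ratio(75_030, 38_142)); // prints "RepresentativeVertex: 1.97x"
    }
}
```

On this single configuration, edge-based input takes roughly 1.8x to 2x as
long as vertex-based input in the input superstep, while the compute
supersteps are in the same range across all runs.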
> multigraph support in giraph
> ----------------------------
>
> Key: GIRAPH-141
> URL: https://issues.apache.org/jira/browse/GIRAPH-141
> Project: Giraph
> Issue Type: Improvement
> Components: graph
> Reporter: André Kelpe
> Assignee: Alessandro Presta
> Attachments: GIRAPH-141.patch
>
>
> The current vertex API only supports simple graphs, meaning that there can
> only ever be one edge between two vertices. Many graphs, such as road
> networks, are in fact multigraphs, where several edges can connect the same
> two vertices at the same time.
> Support for this could be added by introducing an Iterator<EdgeWritable>
> getEdgeValue() or a similar construct. Introducing a slim object, such as a
> Connector between the edge and the vertex, might also be a good idea, so
> that you could do something like:
> {code}
> for (final Connector<EdgeWritable, VertexWritable> conn : getEdgeValues()) {
>   final EdgeWritable edge = conn.getEdge();
>   final VertexWritable otherVertex = conn.getOther();
>   doInterestingStuff(otherVertex);
>   doMoreInterestingStuff(edge);
> }
> {code}