Github user vasia commented on a diff in the pull request:

    https://github.com/apache/flink/pull/2984#discussion_r91908769

    --- Diff: docs/dev/libs/gelly/bipartite_graph.md ---
    @@ -0,0 +1,148 @@
    +---
    +title: Graph Generators
    +nav-parent_id: graphs
    +nav-pos: 6
    +---
    +<!--
    +Licensed to the Apache Software Foundation (ASF) under one
    +or more contributor license agreements. See the NOTICE file
    +distributed with this work for additional information
    +regarding copyright ownership. The ASF licenses this file
    +to you under the Apache License, Version 2.0 (the
    +"License"); you may not use this file except in compliance
    +with the License. You may obtain a copy of the License at
    +
    + http://www.apache.org/licenses/LICENSE-2.0
    +
    +Unless required by applicable law or agreed to in writing,
    +software distributed under the License is distributed on an
    +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    +KIND, either express or implied. See the License for the
    +specific language governing permissions and limitations
    +under the License.
    +-->
    +
    +* This will be replaced by the TOC
    +{:toc}
    +
    +Bipartite Graph
    +---------------
    +
    +A bipartite graph (also called a two-mode graph) is a type of graph where vertices are separated into two disjoint sets. These sets are usually called top and bottom vertices. A single edge in this graph can only connect vertices from opposite sets (i.e. bottom vertex to top vertex) and cannot connect to vertices in the same set.
    +
    +Theses graphs have wide application in practice and can be a more natural choice for particular domains. For example to represent authorship of scientific papers top vertices can represent scientific papers while bottom nodes will represent authors. Naturally a node between a top and a bottom nodes would represent an authorship of a particular scientific paper. Another common example for applications of bipartite graphs is a relationships between actors and movies. In this case an edge represents that a particular actor played in a movie.
    +
    +Bipartite graph are used instead of regular graphs (one-mode) for the following practical [reasons](http://www.complexnetworks.fr/wp-content/uploads/2011/01/socnet07.pdf):
    + * They preserve more information about a connection between vertices. For example instead of a single link between two researchers in a graph that represents that they authored a paper together a bipartite graph preserve the information about what papers they authored
    + * Bipartite graph can encode the same information more compactly than one-mode graphs
    +
    +
    +
    +Graph Representation
    +--------------------
    +
    +A `BipartiteGraph` is represented by:
    + * `DataSet` of top nodes
    + * `DataSet` of bottom nodes
    + * `DataSet` of edges between top and bottom nodes
    +
    +As in the `Graph` class nodes are represented by the `Vertex` type and the same rules applies to its types and values.
    +
    +The graph edges are represented by the `BipartiteEdge` type. An `BipartiteEdge` is defined by a top ID (the ID of the top `Vertex`), a bottom ID (the ID of the bottom `Vertex`) and an optional value. The main difference between the `Edge` and `BipartiteEdge` is that IDs of nodes it links can be of different types. Edges with no value have a `NullValue` value type.
    +
    +<div class="codetabs" markdown="1">
    +<div data-lang="java" markdown="1">
    +{% highlight java %}
    +BipartiteEdge<Long, String, Double> e = new BipartiteEdge<Long, String, Double>(1L, "id1", 0.5);
    +
    +Double weight = e.getValue(); // weight = 0.5
    +{% endhighlight %}
    +</div>
    +
    +<div data-lang="scala" markdown="1">
    +{% highlight scala %}
    +// TODO: Should be added when Scala interface is implemented
    +{% endhighlight %}
    +</div>
    +</div>
    +{% top %}
    +
    +
    +Graph Creation
    +--------------
    +
    +You can create a `BipartiteGraph` in the following ways:
    +
    +* from a `DataSet` of top vertices, a `DataSet` of bottom vertices and a `DataSet` of edges:
    +
    +<div class="codetabs" markdown="1">
    +<div data-lang="java" markdown="1">
    +{% highlight java %}
    +ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    +
    +DataSet<Vertex<String, Long>> topVertices = ...
    +
    +DataSet<Vertex<String, Long>> bottomVertices = ...
    +
    +DataSet<Edge<String, String, Double>> edges = ...
    +
    +Graph<String, String, Long, Long, Double> graph = BipartiteGraph.fromDataSet(topVertices, bottomVertices, edges, env);
    +{% endhighlight %}
    +</div>
    +
    +<div data-lang="scala" markdown="1">
    +{% highlight scala %}
    +// TODO: Should be added when Scala interface is implemented
    +{% endhighlight %}
    +</div>
    +</div>
    +
    +
    +Graph Transformations
    +---------------------
    +
    +
    +* <strong>Projection</strong>: Projection is a common operation for bipartite graphs that converts a bipartite graph into a regular graph. There are two types of projections: top and bottom projections. Top projection preserves only top nodes in the result graph and create a link between them in a new graph only if there is an intermediate bottom node both top nodes connect to in the original graph. Bottom projection is the opposite to top projection, i.e. only preserves bottom nodes and connects a pair of node if they are connected in the original graph.
    --- End diff --

    Can you add a figure to illustrate a top and a bottom projection?
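
Until a figure is available, the two projections can be illustrated with a small, self-contained Java sketch. It does not use the Gelly API: `ProjectionSketch` and `project` are hypothetical names introduced here, and edges are plain `{topId, bottomId}` string pairs modelled on the papers/authors example in the quoted docs. The sketch assumes that two vertices on the kept side are linked whenever they share at least one neighbor on the other side, which is how the diff describes the top projection.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only; not part of Flink or Gelly.
class ProjectionSketch {

    /**
     * Projects a bipartite edge list onto one side.
     * Each edge is a two-element array {topId, bottomId}.
     * keep = 0 keeps top vertices (top projection),
     * keep = 1 keeps bottom vertices (bottom projection).
     * Returns undirected links formatted as "a-b" with a < b.
     */
    static Set<String> project(List<String[]> edges, int keep) {
        int via = 1 - keep; // the side that disappears in the projection

        // Group the kept-side vertices by the opposite-side vertex they touch.
        Map<String, List<String>> byShared = new LinkedHashMap<>();
        for (String[] e : edges) {
            byShared.computeIfAbsent(e[via], k -> new ArrayList<>()).add(e[keep]);
        }

        // Link every pair of kept-side vertices that share a neighbor.
        Set<String> links = new TreeSet<>();
        for (List<String> group : byShared.values()) {
            for (String a : group) {
                for (String b : group) {
                    if (a.compareTo(b) < 0) {
                        links.add(a + "-" + b);
                    }
                }
            }
        }
        return links;
    }

    public static void main(String[] args) {
        // Top vertices are papers, bottom vertices are authors.
        List<String[]> edges = Arrays.asList(
            new String[] {"paper1", "alice"},
            new String[] {"paper1", "bob"},
            new String[] {"paper2", "bob"},
            new String[] {"paper2", "carol"});

        System.out.println("top: " + project(edges, 0));
        System.out.println("bottom: " + project(edges, 1));
    }
}
```

With these four edges, the top projection links the two papers through their shared author `bob`, while the bottom projection links `alice` with `bob` and `bob` with `carol`, because each pair co-authored a paper.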