Hello all,

We are about to finalise following design. Please review it in the google
doc below. And let us know your inputs.

https://docs.google.com/document/d/1SChonW4bXNiqU2Pz9PPjuKDuKrBGmESrPcJXamj4dlI/edit

-- Saumitra S. Shahapure

On Thu, Oct 22, 2015 at 1:00 PM, Saumitra Shahapure <
saumitra.offic...@gmail.com> wrote:

> Hi Vasia,
>
> Here
> <https://docs.google.com/document/d/1SChonW4bXNiqU2Pz9PPjuKDuKrBGmESrPcJXamj4dlI/edit?usp=sharing>
> is the Google doc.
>
> -- Saumitra S. Shahapure
>
> On Thu, Oct 22, 2015 at 11:10 AM, Vasiliki Kalavri <
> vasilikikala...@gmail.com> wrote:
>
>> Hi Saumitra,
>>
>> could you maybe add your ideas above in a google doc and share the link,
>> so
>> we can easily comment and/or edit?
>>
>> Thank you!
>> -Vasia.
>>
>> On 22 October 2015 at 10:53, Andra Lungu <lungu.an...@gmail.com> wrote:
>>
>> > Hi Saumitra,
>> >
>> > As you already noticed, the first version (with duplicates) is highly
>> > inefficient and consumes a lot of memory. So, I suggest we drop it for
>> now.
>> > The version with the label makes a lot of modifications on the base
>> Graph
>> > class, and this, in my opinion would make it more difficult to grasp.
>> > Bipartite Graphs are not as common as regular graphs. And then, when you
>> > would add an Isomorphic Graph class (stupid example), you'll need to
>> strip
>> > stuff out again.
>> >
>> > I believe we should go with a BipartiteGraph class which extends Graph
>> and
>> > which has a DataSet of topVertices and a DataSet of bottomVertices.
>> Apart
>> > from getTopVertices() methods, we would still need functions such as
>> > getNumberOfTopVertices(). For the time being, just support the
>> funstiones
>> > mentiones as well as the basics: fromCollection, fromDataSet, etc, get*,
>> > join*, map*, filter*.
>> >
>> > Wait for a few more opinions and dig in. I believe we should discuss
>> this
>> > even further and propose Jiras for examples of algos for bipartite
>> graphs.
>> >
>> > Once we have the design specs well figured out, and if I find some time
>> > I'll gladly chime in :)
>> >
>> > Cheers!
>> > Andra
>> >
>> > On Wed, Oct 21, 2015 at 8:45 PM, Saumitra Shahapure <
>> > saumitra.offic...@gmail.com> wrote:
>> >
>> > > In FLINK-2254 <https://issues.apache.org/jira/browse/FLINK-2254> , we
>> > > want to extend Graph API of Gelly by adding support for bipartite
>> graphs
>> > > too. In the long term, the support for newer graph types can be added
>> in
>> > > the same way. Also specialised algorithms for special types of graphs
>> > > should be implemented.
>> > >
>> > > For bipartite graph, a new class BipartiteGraph can be implemented
>> which
>> > > extends Graph. Other graph algorithms which are written for Graph can
>> be
>> > > used for BipartiteGraph too, because of inheritance.
>> > > Initialisation functions like  public static <K, VV, EV> Graph<K, VV,
>> EV>
>> > > fromCollection(Collection<Vertex<K, VV>> vertices,Collection<Edge<K,
>> EV>>
>> > > edges, ExecutionEnvironment context) need to be duplicated as public
>> > static
>> > > <K, VV, EV> BipartiteGraph<K, VV, EV>
>> fromCollection(Collection<Vertex<K,
>> > > VV>> topVertices,Collection<Vertex<K, VV>> bottomVertices,
>> > > Collection<Edge<K, EV>> edges, ExecutionEnvironment context)
>> > > Graph validation does not need to happen implicitly. user can call
>> > > BipartiteGraph.validate() in case he wants to check sanity of data.
>> > > Vertex modes is primary requirement. BipartiteGraph should have
>> > functions,
>> > > getTopVertices() and getBottomVertices(). There are three ways to
>> > implement
>> > > it:
>> > > Simplest and least efficient way is to maintain vertices variable in
>> > Graph
>> > > in addition to two more Datasets topVertices, bottomVertices. Benefit
>> of
>> > > this approach is that Graph class would not need any modification at
>> all.
>> > > this.vertices variable access inside Graph class would work correctly.
>> > > Disadvantage is that multiple copies of vertex dataset are created and
>> > > maintained. So this is inefficient in terms of memory.
>> > > Another way is to keep topVertices and bottomVertices variables in
>> > > BipartiteGraph. vertices variable in Graph would stay empty.
>> > getVertices()
>> > > function of Graph would be overloaded by BipartiteGraph and
>> reimplemented
>> > > as union of of topVertices and bottomVertices. Since, vertices is a
>> > private
>> > > variable, external view of vertices stays unaffected. All functions of
>> > > Graph class which use vertices local variable (e.g.
>> > getVerticesAsTuple2())
>> > > need to use getVertices() instead of this.vertices. Disadvantage of
>> this
>> > > method is Graph.vertices variable would have invalid value throughout
>> for
>> > > BipartiteGraph.
>> > > Another way is to ‘label’ vertices with an integer. So public class
>> > > BipartiteGraph<K,VV,EV> extends Graph<K, Tuple2<VV,Integer>, EV>
>> would be
>> > > the signature. Initialisers would tag the vertices according to their
>> > mode.
>> > > getVertices() should be overridden to strip-off the tag values so that
>> > > users would get consistent view of vertex dataset for Graph and
>> > > BipartiteGraph. getTopVertices/getBottomVertices would be filter
>> > functions
>> > > on vertex tags.
>> > > I personally feel method 2 to be most elegant.
>> > > Functions like addVertices(vertexList) are not relevant without
>> > specifying
>> > > whether they are top or bottom. A possibility could be to add them to
>> > > topVertices by default. And have overloaded function
>> > > addVertices(vertexList, mode) to add them to specific partition.
>> > > Specialised algorithms for Bipartite graphs can implement new
>> > > BipartiteGraphAlgorithm interface. It would be similar to
>> GraphAlgorithm.
>> > > In future, newer types of graphs can be similarly derived from Graph
>> and
>> > > type hierarchy of Graph can be created. Class hierarchy would be
>> better
>> > > here rather than interface hierarchy, because the functions which are
>> > > applicable to all derived classes can be implemented in the base
>> class.
>> > > As a part of future refactoring, Graph transformation functions should
>> > > rather be a new class implementing GraphAlgorithm rather than
>> function in
>> > > Graph class. This may make implementing complex graph types easier.
>> > > PS. Do I need to add it in wiki too? I can copy the contents to wiki
>> if
>> > > you say so,
>> > > -- Saumitra Shahapure
>> > >
>> > >
>> >
>>
>
>

Reply via email to