Thanks that worked! I wonder what will be the performance difference if I implement this with Stateful Functions. Does anyone knows recent works/papers on similar approach?
Best Kaan > On 16 Apr 2020, at 10:00, Yun Gao <yungao...@aliyun.com> wrote: > > > Hi Kaan, > > For the first issue, I think the two implementation should have difference > and the first should be slower, but I think which one to use should be depend > on your algorithm if it could compute incrementally only with the changed > edges. However, as far as I know I think most graph algorithm does not > satisfy this property, therefore I think you might have to use the first one. > > For the second issue, I think you might use Graph.getVertices() and > graph.getEdges() to get the underlying vertices and edges dataset of the > graph, then you could do any operations with the two datasets, like join the > vertices dataset with the second edge list, and finally create a new Graph > with new Graph(updated_vertices, edges, env). > > Best, > Yun > > ------------------------------------------------------------------ > From:Kaan Sancak <kaans...@gmail.com> > Send Time:2020 Apr. 16 (Thu.) 17:17 > To:Till Rohrmann <trohrm...@apache.org> > Cc:Tzu-Li (Gordon) Tai <tzuli...@apache.org>; user <user@flink.apache.org> > Subject:Re: Question about Writing Incremental Graph Algorithms using Apache > Flink Gelly > > If the vertex type is POJO what happens during the union of the graph? Is > there a persistent approach, or can we define a function handle such > occasions? > > Would there be a performance difference between two cases: > > 1) > Graph graph = … // From edges list > > graph = graph.runScatterGatherIteration(); > > Graph secondGraph = … // From second edge list > > graph = graph.union(secondGraph).runScatterGatherIteration() > > 2) > > Graph graph = … // From edges list > > graph = graph.runScatterGatherIteration(); > > graph.addEdges(second_edge_list) > > graph = graph.runScatterGatherIteration(); > > > Before starting the second scatter-gather, I want to set/reset some fields of > the vertex value of the vertices that are effected by edge > additions/deletions (or union). It would be good to have a callback function > that touches the end-points of the edges that are added/deleted. > > Best > Kaan > > > > On Apr 15, 2020, at 11:07 AM, Till Rohrmann <trohrm...@apache.org> wrote: > > Hi Kaan, > > I think what you are proposing is something like this: > > Graph<Long, Double, Double> graph = ... // get first batch > > Graph<Long, Double, Double> graphAfterFirstSG = > graph.runScatterGatherIteration(); > > Graph<Long, Double, Double> secondBatch = ... // get second batch > > // Adjust the result of SG iteration with secondBatch > > Graph<Long, Double, Double> updatedGraph = > graphAfterFirstSG.union/difference(secondBatch)); > > updatedGraph.runScatterGatherIteration(); > > Then I believe this should work. > > Cheers, > Till > > On Wed, Apr 15, 2020 at 1:14 AM Kaan Sancak <kaans...@gmail.com> wrote: > Thanks for the useful information! It seems like a good and fun idea to > experiment. I will definitely give it a try. > > I have a very close upcoming deadline and I have already implemented the > Scatter-Gather iteration algorithm. > > I have another question on whether we can chain Scatter-Gather or > Vertex-Centric iterations. > Let’s say that we have an initial batch/dataset, we run a Scatter-Gather and > obtain graph. > Using another batch we added/deleted vertices to the graph we obtained. > Now we run another Scatter-Gather on the modified graph. > > This is no streaming but a naive way to simulate batch updates that are > happening concurrently. > Do you think it is a feasible way to do this way? > > Best > Kaan > > On Apr 13, 2020, at 11:16 PM, Tzu-Li (Gordon) Tai <tzuli...@apache.org> wrote: > > Hi, > > As you mentioned, Gelly Graph's are backed by Flink DataSets, and therefore > work primarily on static graphs. I don't think it'll be possible to > implement incremental algorithms described in your SO question. > > Have you tried looking at Stateful Functions, a recent new API added to > Flink? > It supports arbitrary messaging between functions, which may allow you to > build what you have in mind. > Take a look at Seth's an Igal's comments here [1], where there seems to be a > similar incremental graph-processing use case for sessionization. > > Cheers, > Gordon > > [1] > http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Complex-graph-based-sessionization-potential-use-for-stateful-functions-td34000.html#a34017 > > > > -- > Sent from: > http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/ > >