Hi Kaan,

   For the first issue, I think the two implementation should have difference 
and the first should be slower, but I think which one to use should be depend 
on your algorithm if it could compute incrementally only with the changed 
edges. However, as far as I know I think most graph algorithm does not satisfy 
this property, therefore I think you might have to use the first one. 

    For the second issue, I think you might use Graph.getVertices() and 
graph.getEdges() to get the underlying vertices and edges dataset of the graph, 
then you could do any operations with the two datasets, like join the vertices 
dataset with the second edge list, and finally create a new Graph with new 
Graph(updated_vertices, edges, env).

Best,
 Yun


------------------------------------------------------------------
From:Kaan Sancak <kaans...@gmail.com>
Send Time:2020 Apr. 16 (Thu.) 17:17
To:Till Rohrmann <trohrm...@apache.org>
Cc:Tzu-Li (Gordon) Tai <tzuli...@apache.org>; user <user@flink.apache.org>
Subject:Re: Question about Writing Incremental Graph Algorithms using Apache 
Flink Gelly

If the vertex type is POJO what happens during the union of the graph? Is there 
a persistent approach, or can we define a function handle such occasions?

Would there be a performance difference between two cases:

1)
 Graph graph = … // From edges list

 graph = graph.runScatterGatherIteration();

 Graph secondGraph = … // From second edge list

 graph = graph.union(secondGraph).runScatterGatherIteration()

2)
  
Graph graph = … // From edges list

 graph = graph.runScatterGatherIteration();

 graph.addEdges(second_edge_list)

 graph = graph.runScatterGatherIteration();


Before starting the second scatter-gather, I want to set/reset some fields of 
the vertex value of the vertices that are effected by edge additions/deletions 
(or union). It would be good to have a callback function that touches the 
end-points of the edges that are added/deleted.

Best
Kaan




On Apr 15, 2020, at 11:07 AM, Till Rohrmann <trohrm...@apache.org> wrote:
Hi Kaan,

I think what you are proposing is something like this:

Graph<Long, Double, Double> graph = ... // get first batch

Graph<Long, Double, Double> graphAfterFirstSG = 
graph.runScatterGatherIteration();

Graph<Long, Double, Double> secondBatch = ... // get second batch

// Adjust the result of SG iteration with secondBatch

Graph<Long, Double, Double> updatedGraph = 
graphAfterFirstSG.union/difference(secondBatch));

updatedGraph.runScatterGatherIteration();

Then I believe this should work.

Cheers,
Till
On Wed, Apr 15, 2020 at 1:14 AM Kaan Sancak <kaans...@gmail.com> wrote:
Thanks for the useful information! It seems like a good and fun idea to 
experiment. I will definitely give it a try.

I have a very close upcoming deadline and I have already implemented the 
Scatter-Gather iteration algorithm.

I have another question on whether we can chain Scatter-Gather or 
Vertex-Centric iterations.
Let’s say that we have an initial batch/dataset, we run a Scatter-Gather and 
obtain graph.
Using another batch we added/deleted vertices to the graph we obtained. 
Now we run another Scatter-Gather on the modified graph.

This is no streaming but a naive way to simulate batch updates that are 
happening concurrently.
Do you think it is a feasible way to do this way? 

Best
Kaan

On Apr 13, 2020, at 11:16 PM, Tzu-Li (Gordon) Tai <tzuli...@apache.org> wrote:
Hi,

As you mentioned, Gelly Graph's are backed by Flink DataSets, and therefore
work primarily on static graphs. I don't think it'll be possible to
implement incremental algorithms described in your SO question.

Have you tried looking at Stateful Functions, a recent new API added to
Flink?
It supports arbitrary messaging between functions, which may allow you to
build what you have in mind.
Take a look at Seth's an Igal's comments here [1], where there seems to be a
similar incremental graph-processing use case for sessionization.

Cheers,
Gordon

[1]
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Complex-graph-based-sessionization-potential-use-for-stateful-functions-td34000.html#a34017



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/


Reply via email to