Good to know :)

On 25 November 2015 at 21:44, Stefanos Antaris <antaris.stefa...@gmail.com> wrote:

> Hi,
>
> It works fine using this approach.
>
> Thanks,
> Stefanos
>
> On 25 Nov 2015, at 20:32, Vasiliki Kalavri <vasilikikala...@gmail.com> wrote:
>
> Hey,
>
> you can preprocess your data, create the vertices and store them to a
> file, like you would store any other Flink DataSet, e.g. with writeAsText.
>
> Then, you can create the graph by reading 2 datasets, like this:
>
> DataSet<Vertex> vertices = env.readTextFile("/path/to/vertices/")... // or your custom reading logic
> DataSet<Edge> edges = ...
>
> Graph graph = Graph.fromDataSet(vertices, edges, env);
>
> Is this what you're looking for?
>
> Also, note that if you have a very large graph, you should avoid using
> collect() and fromCollection().
>
> -Vasia.
>
> On 25 November 2015 at 18:03, Stefanos Antaris <antaris.stefa...@gmail.com> wrote:
>
>> Hi Vasia,
>>
>> my graph object is the following:
>>
>> Graph<MyPojoNode, NullValue, Integer> graph = Graph.fromCollection(edgeList.collect(), env);
>>
>> The vertex is a POJO, not the value. So the problem is: how could I store
>> and retrieve the vertex list?
>>
>> Thanks,
>> Stefanos
>>
>> On 25 Nov 2015, at 18:16, Vasiliki Kalavri <vasilikikala...@gmail.com> wrote:
>>
>> Hi Stefane,
>>
>> let me know if I understand the problem correctly. The vertex values are
>> POJOs that you're somehow inferring from the edge list and this value
>> creation is what takes a lot of time? Since a graph is just a set of 2
>> datasets (vertices and edges), you could store the values to disk and have
>> a custom input format to read them into datasets. Would that work for you?
>>
>> -Vasia.
>>
>> On 25 November 2015 at 15:09, Stefanos Antaris <antaris.stefa...@gmail.com> wrote:
>>
>>> Hi to all,
>>>
>>> I am working on a project with Gelly and I need to create a graph with
>>> billions of nodes. Although I have the edge list, the node in the Graph
>>> needs to be a POJO object, the construction of which takes a long time
>>> before the final graph can be created. Is it possible to store the Graph
>>> object as a file and retrieve it whenever I want to run an experiment?
>>>
>>> Thanks,
>>> Stefanos
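
For readers of the archive: the "custom reading logic" suggested in the thread could look roughly like the sketch below. It only covers turning a vertex POJO into one text line and back, using plain Java with no Flink dependency. The fields `id` and `label`, the tab-separated layout, and the `VertexLineCodec` helper are assumptions made for illustration; the thread never shows what `MyPojoNode` actually contains.

```java
/** Hypothetical stand-in for the MyPojoNode mentioned in the thread; its real fields are not shown there. */
final class MyPojoNode {
    public long id;       // assumed field
    public String label;  // assumed field; must not contain a tab or a newline

    public MyPojoNode() {}  // no-arg constructor, as Flink's POJO rules expect

    public MyPojoNode(long id, String label) {
        this.id = id;
        this.label = label;
    }
}

/** Hypothetical helper: one vertex per line, encoded as "<id>\t<label>". */
final class VertexLineCodec {
    private VertexLineCodec() {}

    /** Encodes a node as a single tab-separated line. */
    static String toLine(MyPojoNode node) {
        return node.id + "\t" + node.label;
    }

    /** Parses a line produced by toLine back into a node. */
    static MyPojoNode fromLine(String line) {
        int tab = line.indexOf('\t');
        if (tab < 0) {
            throw new IllegalArgumentException("missing tab separator: " + line);
        }
        long id = Long.parseLong(line.substring(0, tab));
        String label = line.substring(tab + 1);
        return new MyPojoNode(id, label);
    }

    public static void main(String[] args) {
        // Round trip: encode a node, then decode it again.
        MyPojoNode original = new MyPojoNode(42L, "hub");
        MyPojoNode restored = fromLine(toLine(original));
        System.out.println(restored.id + " -> " + restored.label);
    }
}
```

In a Flink job, something like `toLine` would presumably run in a map before `writeAsText`, and `fromLine` in a map over the result of `env.readTextFile(...)`, with each parsed node then wrapped in a Gelly `Vertex` before calling `Graph.fromDataSet(vertices, edges, env)` as described above. That way the expensive POJO construction happens once, and later experiments only pay for parsing.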