Fantastic! Thanks a lot!

> On Jul 14, 2014, at 4:42 AM, Tiago de Paula Peixoto <[email protected]> wrote:
> 
>> On 07/14/2014 07:11 AM, Helen Lampesis wrote:
>> Dear graphErs,
>> 
>> I am about to start a project working on a graph with about 50M nodes
>> and 1B edges, and I want your opinion regarding the feasibility of this
>> endeavor with graph_tool.
>> 
>> Can you please share your experience with graphs of comparable size?
>> 
>> I mostly want to calculate centrality measures, and I will need to
>> apply (several) filters to isolate nodes with particular attributes.
>> 
>> On top of that, I am running on a supercomputer (so memory is NOT an
>> issue), and if I am lucky they have installed/enabled the parallel
>> version of the library.
> 
> You should be able to tackle graphs of this size if you have enough
> memory. For centrality calculations, graph-tool has pure C++ parallel
> code, so you should see good performance.
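> 
> As a minimal sketch, you can check whether your installation has OpenMP
> enabled, and set the number of threads, directly from Python (32 threads
> is just an example here; setting the OMP_NUM_THREADS environment variable
> before starting Python works as well):
> 
>    In [1]: from graph_tool.all import *
>    In [2]: openmp_enabled()            # should return True on an OpenMP build
>    In [3]: openmp_set_num_threads(32)  # e.g. one thread per core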
> 
> Graph filtering can also be done without involving Python loops, so it
> should also scale well.
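> 
> As a minimal sketch, given some graph g with a hypothetical numeric vertex
> property "age" (a stand-in for whatever attributes you actually have):
> 
>    In [1]: age = g.vertex_properties["age"]   # hypothetical attribute
>    In [2]: keep = g.new_vertex_property("bool")
>    In [3]: keep.a = age.a > 30                # vectorized in numpy, no loops
>    In [4]: u = GraphView(g, vfilt=keep)       # O(1) view; nothing is copied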
> 
> Just as an illustration, for the graph size you suggested:
> 
>    In [1]: from graph_tool.all import *
>    In [2]: from numpy.random import poisson
>    In [3]: g = random_graph(50000000, lambda: poisson(40), random=False,
>                             directed=False)
>    In [4]: %time pagerank(g)
>    CPU times: user 3min 26s, sys: 44 s, total: 4min 10s
>    Wall time: 11.3 s
> 
> So, pagerank takes about 11 seconds on a machine with 32 cores (it would
> have taken around 3-4 minutes in a single thread), and the graph itself
> takes about 50 GB of RAM to store.
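> 
> And the same calls work unchanged on a filtered view, so the centrality
> can be restricted to a subgraph (re-using the hypothetical view u from the
> filtering sketch above):
> 
>    In [5]: pr = pagerank(u)   # only unfiltered vertices/edges are visited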
> 
> Best,
> Tiago
> 
> -- 
> Tiago de Paula Peixoto <[email protected]>
> 