Hi,

> Anyone try out the PageRank function on Gremlin?
> 
> https://github.com/tinkerpop/gremlin/wiki/Working-with-JUNG-Algorithms/0506c193f30abe0bc18d40d7a08c9257d9311b13
> 
> How does it perform with just under 100k nodes on a sparse graph (3000
> relationship max, average of 100)?
> 
> I've been doing my pagerank via the power method in rb-gsl and while it's
> fine for around 10k items,  it's sucking all the memory on my server when
> trying to do 92k items.

Blueprints ( https://github.com/tinkerpop/blueprints/wiki/JUNG-Ouplementation )
implements the JUNG graph interface and thus turns any Blueprints-enabled
graph database into a JUNG graph. Unfortunately, JUNG was engineered for
in-memory use, so you will run into memory issues on very large graphs. For
example, if you run PageRank on a graph with 1 million+ vertices, the rank
vector (the eigenvector being computed) has 1 million+ entries. JUNG doesn't
serialize this vector to disk for you; it does it all in memory. And if you
don't have the memory to hold a 1 million+ entry vector (i.e.
Map<Vertex,Double>), then, well....

So, in short, be wary of running memory-intensive algorithms with JUNG (i.e.
understand the intermediate data structures that the various supported graph
algorithms generate). For algorithms that are not memory intensive, like
shortest path, it should meet your needs. In the future, TinkerPop will be
filling out Furnace (http://furnace.tinkerpop.com), and this package will
provide memory-conscious implementations of classic and non-classical graph
algorithms.
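
To make the point about intermediate data structures concrete, here is a
minimal sketch of PageRank by power iteration over a sparse adjacency list.
It is not JUNG's, Blueprints', or Furnace's implementation; the class name
SparsePageRank, the method compute, and the int[][] outEdges layout are all
assumptions made up for illustration, and it presumes vertices have already
been mapped to indices 0..n-1. The only per-vertex state is two double
arrays, rather than a Map<Vertex,Double> of boxed objects.

```java
import java.util.Arrays;

// Hypothetical helper for illustration; not part of JUNG, Blueprints, or Gremlin.
final class SparsePageRank {

    private SparsePageRank() {}

    /**
     * Power-iteration PageRank on a graph stored as adjacency arrays.
     *
     * @param outEdges   outEdges[v] holds the indices of the vertices v links to
     * @param damping    damping factor (0.85 is a common choice)
     * @param iterations number of power-iteration steps to run
     * @return one rank per vertex; the ranks sum to 1
     */
    static double[] compute(int[][] outEdges, double damping, int iterations) {
        int n = outEdges.length;
        double[] rank = new double[n];
        double[] next = new double[n];
        Arrays.fill(rank, 1.0 / n);

        for (int iter = 0; iter < iterations; iter++) {
            Arrays.fill(next, 0.0);
            double danglingMass = 0.0;

            // Push each vertex's rank evenly along its outgoing edges.
            for (int v = 0; v < n; v++) {
                int[] targets = outEdges[v];
                if (targets.length == 0) {
                    // No out-edges: this rank is spread over all vertices below.
                    danglingMass += rank[v];
                    continue;
                }
                double share = rank[v] / targets.length;
                for (int t : targets) {
                    next[t] += share;
                }
            }

            // Teleport term plus the evenly redistributed dangling mass.
            double base = (1.0 - damping) / n + damping * danglingMass / n;
            for (int v = 0; v < n; v++) {
                next[v] = base + damping * next[v];
            }

            // Swap buffers so no new arrays are allocated per iteration.
            double[] tmp = rank;
            rank = next;
            next = tmp;
        }
        return rank;
    }
}
```

With this layout, the two rank arrays for 92k vertices come to roughly
92,000 * 8 bytes each (under 1 MB apiece), and the edge arrays to about 4
bytes per edge plus per-array overhead. Whether that fits your server
depends on the JVM and heap settings, so treat the numbers as a rough
estimate rather than a measurement.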

HTH,
Marko.

http://markorodriguez.com
_______________________________________________
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user
