Hi,

> Anyone try out the PageRank function on Gremlin?
>
> https://github.com/tinkerpop/gremlin/wiki/Working-with-JUNG-Algorithms/0506c193f30abe0bc18d40d7a08c9257d9311b13
>
> How does it perform with just under 100k nodes on a sparse graph (3000
> relationship max, average of 100)?
>
> I've been doing my pagerank via the power method in rb-gsl and while it's
> fine for around 10k items, it's sucking all the memory on my server when
> trying to do 92k items.

Blueprints (https://github.com/tinkerpop/blueprints/wiki/JUNG-Ouplementation) implements the JUNG graph interface and thus makes any Blueprints-enabled graph database into a JUNG graph. Unfortunately, JUNG was engineered from the perspective of in-memory use. As such, you will run into memory issues on very large graphs. For example, if you have a 1 million+ vertex graph and you are running PageRank on it, then your eigenvector has 1 million+ entries. JUNG isn't serializing this vector to disk for you; it's doing it all in memory. And if you don't have the memory to support a 1 million+ entry vector (i.e. Map<Vertex,Double>), then, well...

So, in short, be wary of running memory-intensive algorithms with JUNG (i.e. understand the intermediate data structures generated by the various supported graph algorithms). For non-memory-intensive algorithms like shortest path, it should meet your needs.

In the future, TinkerPop will be filling out Furnace (http://furnace.tinkerpop.com), and this package will provide memory-conscious implementations of classic and non-classical graph algorithms.

HTH,
Marko.

http://markorodriguez.com
_______________________________________________
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user
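
To make the point about intermediate data structures concrete, here is a minimal sketch of power-iteration PageRank over adjacency lists, written in Python. It is not the JUNG or Gremlin implementation; the function name `pagerank`, its parameters, and the `out_edges` input format are all hypothetical and chosen for illustration. The working state is one float per vertex (the analogue of Map<Vertex,Double>) plus the edge lists themselves, so memory grows with vertices plus edges. By contrast, and assuming the power method in question was run on a dense n x n matrix, 92k items at 8 bytes per entry would need roughly 68 GB.

```python
def pagerank(out_edges, damping=0.85, tol=1e-10, max_iter=100):
    """Power-iteration PageRank over adjacency lists (illustrative sketch).

    out_edges: dict mapping each vertex to a list of its successors.
    Returns a dict mapping each vertex to its rank; ranks sum to 1.
    """
    # Collect every vertex, including ones that only appear as targets.
    vertices = set(out_edges)
    for targets in out_edges.values():
        vertices.update(targets)
    n = len(vertices)
    if n == 0:
        return {}

    # One float per vertex: this is the only per-vertex state kept.
    rank = dict.fromkeys(vertices, 1.0 / n)
    for _ in range(max_iter):
        # Rank held by vertices without out-edges is spread uniformly.
        dangling = sum(rank[v] for v in vertices if not out_edges.get(v))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = dict.fromkeys(vertices, base)
        # Each vertex passes an equal share of its rank along every out-edge.
        for v, targets in out_edges.items():
            if targets:
                share = damping * rank[v] / len(targets)
                for t in targets:
                    new_rank[t] += share
        # Stop once the total change between iterations is negligible.
        delta = sum(abs(new_rank[v] - rank[v]) for v in vertices)
        rank = new_rank
        if delta < tol:
            break
    return rank
```

Calling `pagerank({"a": ["c"], "b": ["c"], "c": ["a"]})` ranks `c` highest and `b` lowest, since `b` has no incoming edges. For a graph that does not fit in memory even in this form, the rank map and edge lists would have to be streamed from disk, which is the kind of memory-conscious approach the reply alludes to.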