Here are some interesting stats to consider. First, I split my nodes into two groups: one node with 1.4M children and the other with 3.4M children. While I do see some improvement once the cache is warm, the traversal doesn't seem to scale linearly; i.e., the larger super-node has 2.4x more children but takes 17x longer to traverse, comparing the second (warm) run of each query.
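For reference, here is a quick back-of-the-envelope check of those ratios, using the counts and the second (warm) timing of each query from the shell output below. The variable names are only for illustration, and the per-relationship cost is a rough average that assumes the whole run time is spent on the traversal.

```python
# Relationship counts and warm-run timings (ms), copied from the shell output.
small_count, small_ms = 1_468_486, 19_763
large_count, large_ms = 3_472_174, 337_975

# How much bigger the larger super-node is, and how much slower it runs.
child_ratio = large_count / small_count
time_ratio = large_ms / small_ms

# Rough average cost per relationship, in microseconds.
small_us_per_rel = small_ms * 1000 / small_count
large_us_per_rel = large_ms * 1000 / large_count

print(f"children ratio: {child_ratio:.1f}x")
print(f"time ratio:     {time_ratio:.1f}x")
print(f"per-relationship cost: {small_us_per_rel:.1f} us vs {large_us_per_rel:.1f} us")
```

If the traversal scaled linearly, the two per-relationship costs would be about equal; here they differ by several times.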

neo4j-sh (0)$ start n=(1) match (n)-[r]-(x) return count(r)
+----------+
| count(r) |
+----------+
| 1468486  |
+----------+
1 rows, 25724 ms

neo4j-sh (0)$ start n=(1) match (n)-[r]-(x) return count(r)
+----------+
| count(r) |
+----------+
| 1468486  |
+----------+
1 rows, 19763 ms

neo4j-sh (0)$ start n=(2) match (n)-[r]-(x) return count(r)
+----------+
| count(r) |
+----------+
| 3472174  |
+----------+
1 rows, 565448 ms

neo4j-sh (0)$ start n=(2) match (n)-[r]-(x) return count(r)
+----------+
| count(r) |
+----------+
| 3472174  |
+----------+
1 rows, 337975 ms

Any ideas on this?

Andrew

On 07/06/2011 09:55 AM, Peter Neubauer wrote:
> Andrew,
> if you upgrade to 1.4.M06, your shell should be able to do Cypher in
> order to count the relationships of a node, not returning them:
>
> start n=(1) match (n)-[r]-(x) return count(r)
>
> and try that several times to see if cold caches are initially slowing
> down things.
>
> or something along these lines. In the LS and Neoclipse the output and
> visualization will be slow for that amount of data.
>
> Cheers,
>
> /peter neubauer
>
> GTalk: neubauer.peter
> Skype peter.neubauer
> Phone +46 704 106975
> LinkedIn http://www.linkedin.com/in/neubauer
> Twitter http://twitter.com/peterneubauer
>
> http://www.neo4j.org - Your high performance graph database.
> http://startupbootcamp.org/ - Öresund - Innovation happens HERE.
> http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party.
>
>
>
> On Wed, Jul 6, 2011 at 4:15 PM, Andrew White <li...@andrewewhite.net> wrote:
>> I have a graph with roughly 10M nodes. Some of these nodes are highly
>> connected to other nodes. For example I may have a single node with 1M+
>> relationships. A good analogy is a population that has a "lives-in"
>> relationship to a state. Now the problem...
>>
>> Both neoclipse or neo4j-shell are terribly slow when working with these
>> nodes. In the shell I would expect a `cd <node-id>` to be very fast,
>> much like selecting via a rowid in a standard DB. Instead, I usually see
>> several seconds delay. Doing a `ls` takes so long that I usually have to
>> just kill the process. In fact `ls` never outputs anything which is odd
>> since I would expect it to "stream" the output as it found it. I have
>> very similar performance issues with neoclipse.
>>
>> I am using Neo4j 1.3 embedded on Ubuntu 10.04 with 4GB of RAM.
>> Disclaimer, I am new to Neo4j.
>>
>> Thanks,
>> Andrew
>> _______________________________________________
>> Neo4j mailing list
>> User@lists.neo4j.org
>> https://lists.neo4j.org/mailman/listinfo/user
>>