OK, I found out what's taking the time. It's iterating over the result set of a traverser:

    // visit each Route node, and add it to the array
    Traverser routes = graphDb.getReferenceNode().traverse(
        Traverser.Order.BREADTH_FIRST,
        StopEvaluator.DEPTH_ONE,
        ReturnableEvaluator.ALL_BUT_START_NODE,
        Relationships.ROUTE,
        Direction.OUTGOING);

    for (Node node : routes) {
        // do stuff
    }

The 'for' loop takes ages. There are probably 2m nodes being returned by that traverser at the moment, and that's only a very small subset of the data I want to add to the database.

Is there any way to tinker with the Neo4j properties or anything to improve performance here?

Thanks

----- Original Message ----
> From: Mattias Persson <matt...@neotechnology.com>
> To: Neo4j user discussions <user@lists.neo4j.org>
> Sent: Sat, July 24, 2010 10:23:02 PM
> Subject: Re: [Neo4j] Batch inserter shutdown taking forever
>
> 2010/7/21 Tim Jones <bogol...@ymail.com>
>
> > Hi,
> >
> > I'm using a BatchInserter and a LuceneIndexBatchInserter to insert >5m
> > nodes and >5m relationships into a graph in one go. The insertion seems
> > to work, but shutting down takes forever - it's been 2 hours now.
> >
> > At first, the JVM gave me garbage collection exception, so I've set the
> > heap to 2gb.
> >
> > 'top' tells me that the application is still running:
> >
> >   PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
> >  9994 tim   17   0 2620m 2.3g 238m S 99.5 39.1 115:48.84 java
> >
> > but checking the filesystem by running 'ls -l' a few times doesn't
> > indicate that files are being updated.
> >
> > Is this normal? Is there a way to improve performance?
> >
>
> No, it sounds quite weird. Any chance to have a look at your code?
>
> >
> > I'm loading all my data in one go to ease creating the db - it's simpler
> > to create it from scratch each time instead of updating an existing
> > database - so ideally I don't want to break this job down into multiple
> > smaller jobs (actually, this would be OK if performance was good, but I
> > ran into problems inserting data and retrieving existing nodes).
> >
>
> What kind of problems? could you supply code and description of your
> problems?

Problems doing something similar in relational dbs. Also, the API recommends optimising the batch search index before using it for lookups. I just decided not to take this approach.

> >
> > Thanks,
> > Tim
> >
> > _______________________________________________
> > Neo4j mailing list
> > User@lists.neo4j.org
> > https://lists.neo4j.org/mailman/listinfo/user
> >
>
> --
> Mattias Persson, [matt...@neotechnology.com]
> Hacker, Neo Technology
> www.neotechnology.com