Since you're doing a depth-1 traversal, please use something like this
instead:
for ( Relationship rel : graphDb.getReferenceNode().getRelationships(
        Relationships.ROUTE, Direction.OUTGOING ) )
{
    Node node = rel.getEndNode();
    // Do stuff
}
A traverser holds on to more memory than a simple call to getRelationships().
Another thing: are you doing any write operations in that for-loop of yours?
Also, do you shut down the batch inserter and start a new
EmbeddedGraphDatabase to traverse on, or how do you get hold of the
graphDb?
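For reference, the pattern I mean looks roughly like this. A minimal sketch, assuming the Neo4j 1.x batch inserter API (BatchInserterImpl, EmbeddedGraphDatabase); the store path "target/neodb" is a hypothetical example:

```java
import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.kernel.EmbeddedGraphDatabase;
import org.neo4j.kernel.impl.batchinsert.BatchInserter;
import org.neo4j.kernel.impl.batchinsert.BatchInserterImpl;

public class BatchThenTraverse
{
    public static void main( String[] args )
    {
        // Phase 1: bulk insertion. The batch inserter writes directly to
        // the store files, bypassing transactions.
        BatchInserter inserter = new BatchInserterImpl( "target/neodb" );
        try
        {
            // ... createNode() / createRelationship() calls ...
        }
        finally
        {
            // Must be shut down cleanly before anything else opens the
            // store; this is where buffered data gets flushed to disk.
            inserter.shutdown();
        }

        // Phase 2: only after the inserter is closed, open a normal
        // embedded database on the same store directory for traversals.
        GraphDatabaseService graphDb = new EmbeddedGraphDatabase( "target/neodb" );
        try
        {
            // ... traverse / read ...
        }
        finally
        {
            graphDb.shutdown();
        }
    }
}
```

The point of the two phases is that the batch inserter and an EmbeddedGraphDatabase cannot safely hold the same store open at the same time.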
2010/7/26 Tim Jones bogol...@ymail.com
OK, I found out what's taking the time. It's iterating over the result set
of a traverser:
// visit each Route node, and add it to the array
Traverser routes = graphDb.getReferenceNode().traverse(
        Traverser.Order.BREADTH_FIRST,
        StopEvaluator.DEPTH_ONE,
        ReturnableEvaluator.ALL_BUT_START_NODE,
        Relationships.ROUTE, Direction.OUTGOING );
for ( Node node : routes )
{
    // do stuff
}
The 'for' loop takes ages. There are probably 2m nodes being returned by
that traverser at the moment, and that's only a very small subset of the
data I want to add to the database.
Is there any way to tinker with the Neo4j properties or anything else to
improve performance here?
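(For context, the kind of tuning I mean: Neo4j 1.x reads memory-mapped buffer sizes for the store files from a properties file passed at startup. The keys below are the standard neostore settings; the values are purely illustrative and would need sizing against the actual store files:

```
neostore.nodestore.db.mapped_memory=200M
neostore.relationshipstore.db.mapped_memory=1G
neostore.propertystore.db.mapped_memory=500M
neostore.propertystore.db.strings.mapped_memory=500M
```

Whether this helps depends on whether the traversal is I/O-bound rather than CPU-bound.)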
Thanks
----- Original Message -----
From: Mattias Persson matt...@neotechnology.com
To: Neo4j user discussions user@lists.neo4j.org
Sent: Sat, July 24, 2010 10:23:02 PM
Subject: Re: [Neo4j] Batch inserter shutdown taking forever
2010/7/21 Tim Jones bogol...@ymail.com
Hi,
I'm using a BatchInserter and a LuceneIndexBatchInserter to insert 5m
nodes and 5m relationships into a graph in one go. The insertion seems to
work, but shutting down takes forever - it's been 2 hours now.
At first, the JVM gave me a garbage collection exception, so I've set the
heap to 2 GB.
'top' tells me that the application is still running:
PID   USER  PR  NI  VIRT   RES   SHR   S  %CPU  %MEM  TIME+      COMMAND
9994  tim   17  0   2620m  2.3g  238m  S  99.5  39.1  115:48.84  java
but checking the filesystem by running 'ls -l' a few times doesn't
indicate that files are being updated.
Is this normal? Is there a way to improve performance?
No, it sounds quite weird. Any chance to have a look at your code?
I'm loading all my data in one go to ease creating the db - it's simpler
to create it from scratch each time instead of updating an existing
database - so ideally I don't want to break this job down into multiple
smaller jobs (actually, this would be OK if performance was good, but I
ran into problems inserting data and retrieving existing nodes).
What kind of problems? Could you supply code and a description of your
problems?
Problems I've had doing something similar in relational dbs. Also, the API
recommends optimising the batch search index before using it for lookups,
and I just decided not to take this approach.
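For what it's worth, the recommended pattern is cheap to adopt. A minimal sketch, assuming the Neo4j 1.x LuceneIndexBatchInserterImpl API; the key "name" and value are hypothetical examples:

```java
import org.neo4j.index.lucene.LuceneIndexBatchInserter;
import org.neo4j.index.lucene.LuceneIndexBatchInserterImpl;
import org.neo4j.kernel.impl.batchinsert.BatchInserter;
import org.neo4j.kernel.impl.batchinsert.BatchInserterImpl;

public class IndexedBatchInsert
{
    public static void main( String[] args )
    {
        BatchInserter inserter = new BatchInserterImpl( "target/neodb" );
        LuceneIndexBatchInserter index =
                new LuceneIndexBatchInserterImpl( inserter );

        long node = inserter.createNode( null );
        index.index( node, "name", "Route 66" );
        // ... index all other nodes ...

        // Optimize ONCE, after all index() calls and before any lookups,
        // so each getSingleNode()/getNodes() query doesn't pay the cost
        // of searching an unoptimized Lucene index.
        index.optimize();

        // ... lookups during the rest of the insertion ...

        // Shut down the index before the inserter.
        index.shutdown();
        inserter.shutdown();
    }
}
```

Skipping optimize() and then doing millions of lookups against the batch index is a plausible source of the slowness described above.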
Thanks,
Tim
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user
--
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com