The problem you're seeing is GC thrashing: most of the CPU time is spent
running GC because most of the heap is occupied by live objects. A
traverser keeps track of which nodes it has visited, and for a big
traversal that can be a problem. A better solution for you here would be
to call:

   startNode.getRelationships

directly instead, since iterating over relationships that way doesn't
keep that visited-node state in memory.
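
To illustrate the difference, here is a minimal self-contained sketch. It
is a model only, not Neo4j code: the class TraversalMemorySketch, the
incomingUsers helper and the HashSet used as the "visited" bookkeeping are
all assumptions made up for this example, and the real traverser's
internals may differ. It just shows that remembering every visited node
costs memory proportional to the result size, while plain iteration keeps
nothing per node.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Set;

// Model only: user ids stand in for the nodes reached over HAS_COUNTRY.
class TraversalMemorySketch {

    // Lazily yields ids 1..userCount, so the source itself holds no
    // per-user state (hypothetical stand-in for reading relationships).
    static Iterable<Long> incomingUsers(final long userCount) {
        return new Iterable<Long>() {
            public Iterator<Long> iterator() {
                return new Iterator<Long>() {
                    private long next = 1;

                    public boolean hasNext() {
                        return next <= userCount;
                    }

                    public Long next() {
                        if (!hasNext()) {
                            throw new NoSuchElementException();
                        }
                        return next++;
                    }

                    public void remove() {
                        throw new UnsupportedOperationException();
                    }
                };
            }
        };
    }

    // Traverser-style (assumed): every id is remembered in a visited set.
    // Returns { nodes returned, entries still held when done }.
    static long[] traverserStyle(Iterable<Long> users) {
        Set<Long> visited = new HashSet<Long>();
        long returned = 0;
        for (Long id : users) {
            if (visited.add(id)) {
                returned++;
            }
        }
        return new long[] { returned, visited.size() };
    }

    // Direct iteration: each id is handled and then dropped.
    // Returns { nodes returned, entries still held when done }.
    static long[] directIteration(Iterable<Long> users) {
        long returned = 0;
        for (Long id : users) {
            returned++;
        }
        return new long[] { returned, 0 };
    }

    public static void main(String[] args) {
        long[] t = traverserStyle(incomingUsers(100000));
        long[] d = directIteration(incomingUsers(100000));
        System.out.println("traverser-style: returned=" + t[0] + " held=" + t[1]);
        System.out.println("direct:          returned=" + d[0] + " held=" + d[1]);
    }
}
```

Against the real API the direct loop would run over the relationships of
startNode (for your case presumably restricted to HAS_COUNTRY and
Direction.INCOMING) and take the user node from each relationship.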

We also just created a new traversal framework which deals with this
issue, among other things.

2010/4/29, Bhuvan <[email protected]>:
> Hello,
>
> We are trying to explore Neo4j for a huge number of graph nodes and
> relations.
> Let's say there are about 6 million users across the world and 6 million
> user address elements like postal-code/city/state/country etc.
> Now I am trying to get all users in a given country which has about 3
> million users. What I found is that traverser returned about 0.6 million
> nodes quickly and thereafter it slows down as shown below:
> --------------------------
>  INFO  [2010-04-28 20:15:13,082] [test.TraversalTest] - Starting...
>  INFO  [2010-04-28 20:15:39,030] [test.TraversalTest] - 100,000
>  INFO  [2010-04-28 20:15:41,734] [test.TraversalTest] - 200,000
>  INFO  [2010-04-28 20:15:44,022] [test.TraversalTest] - 300,000
>  INFO  [2010-04-28 20:15:51,353] [test.TraversalTest] - 400,000
>  INFO  [2010-04-28 20:15:53,433] [test.TraversalTest] - 500,000
>  INFO  [2010-04-28 20:15:55,721] [test.TraversalTest] - 600,000
>
>  INFO  [2010-04-28 20:20:54,433] [test.TraversalTest] - 700,000
>  INFO  [2010-04-28 20:25:32,407] [test.TraversalTest] - 800,000
>  INFO  [2010-04-28 20:30:33,274] [test.TraversalTest] - 900,000
>  INFO  [2010-04-28 20:35:26,405] [test.TraversalTest] - 1,000,000
>  INFO  [2010-04-28 20:39:17,099] [test.TraversalTest] - 1,100,000
>  INFO  [2010-04-28 20:42:52,856] [test.TraversalTest] - 1,200,000
>  INFO  [2010-04-28 20:46:57,318] [test.TraversalTest] - 1,300,000
>  INFO  [2010-04-28 20:50:58,397] [test.TraversalTest] - 1,400,000
>  INFO  [2010-04-28 20:54:53,570] [test.TraversalTest] - 1,500,000
> --------------------------
> The number at the end of each line above is the returned node count,
> printed from the for-loop after every 100,000 nodes.
> I used the following traverser:
>
> Traverser traverser = startNode.traverse(Traverser.Order.BREADTH_FIRST,
>                         StopEvaluator.DEPTH_ONE,
>                         ReturnableEvaluator.ALL_BUT_START_NODE,
>                         TestRelationshipType.HAS_COUNTRY,
>                         Direction.INCOMING);
>
> where startNode above is the country node to which users are related
> by the HAS_COUNTRY relation.
>
> My question is why it slows down in returning nodes after a while and if
> there is something which can be done to avoid it?
>
> Thanks
> Bhuvan
>
>
> _______________________________________________
> Neo mailing list
> [email protected]
> https://lists.neo4j.org/mailman/listinfo/user
>


-- 
Mattias Persson, [[email protected]]
Hacker, Neo Technology
www.neotechnology.com