Re: [Neo4j] Expected performance question

2011-11-09 Thread Andres Taylor
On Wed, Nov 9, 2011 at 9:28 AM, Hans Birkeland wrote:

> Thanks for the quick reply! :)
>
> This is the query: start n=node(159178) match n-[*1..4]->x return count(*)


Good to know.

The reason you are getting duplicates is that the same node can show up
in x multiple times - there might be a lot of paths from n to that
particular node, and count(*) counts one row per path. If you are
interested in how many nodes can be reached with that query, you should
instead do: RETURN COUNT(DISTINCT x)

This query is even heavier than the original, though... :)
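
To make the path-versus-node distinction concrete, here is a minimal sketch in Python (not Neo4j code). The tiny diamond-shaped graph and the helper name path_end_nodes are made up for illustration, and the sketch assumes a relationship is used at most once per path. It enumerates every outgoing path of length 1..4 and compares the number of paths with the number of distinct end nodes.

```python
def path_end_nodes(graph, start, max_depth):
    """Yield the end node of every outgoing path of 1..max_depth relationships."""
    def walk(node, depth, used):
        if depth == max_depth:
            return
        for i, nxt in enumerate(graph.get(node, [])):
            edge = (node, i)
            if edge in used:
                continue  # assumption: a relationship appears at most once per path
            yield nxt
            yield from walk(nxt, depth + 1, used | {edge})
    yield from walk(start, 0, frozenset())


# Hypothetical graph: two routes lead from "n" to "d", so "d" and "e"
# are each the end of two different paths.
graph = {
    "n": ["a", "b"],
    "a": ["d"],
    "b": ["d"],
    "d": ["e"],
}

ends = list(path_end_nodes(graph, "n", 4))
path_count = len(ends)           # analogous to count(*)
distinct_count = len(set(ends))  # analogous to count(distinct x)
print(path_count, distinct_count)
```

Even in this small graph the path count is higher than the node count, and the gap widens quickly as depth and fan-out increase. Note that the sketch still walks every path before de-duplicating.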

Andrés
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] Expected performance question

2011-11-09 Thread Michael Hunger
Oh, and Cypher supports streaming (as long as you don't sort/aggregate).

So that won't take up your memory. I would love it if you could try that
using plain Java code, iterating over the node results, and then provide us
with your performance numbers.

Please make sure to run the test more than once, as the first run will be
impacted by loading the Scala libs and filling the Neo4j caches.
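
As a rough, language-neutral sketch of that measurement pattern (written in Python for brevity, not the Java test itself): run the same query several times, consume the results one row at a time instead of collecting them, and report the first (cold) run separately from the later (warm) runs. The names time_runs and fake_query are hypothetical, and fake_query is only a stand-in for a real streaming result.

```python
import time


def time_runs(run_query, runs=5):
    """Time several executions of run_query; return (cold_time, warm_times)."""
    timings = []
    for _ in range(runs):
        started = time.perf_counter()
        rows_seen = 0
        for _row in run_query():  # stream: rows are never collected in a list
            rows_seen += 1
        timings.append(time.perf_counter() - started)
    return timings[0], timings[1:]


def fake_query():
    """Hypothetical stand-in for a lazily streamed query result."""
    for i in range(100_000):
        yield i


cold, warm = time_runs(fake_query)
print(f"cold run: {cold:.4f}s")
print("warm runs:", [f"{t:.4f}s" for t in warm])
```

Against a real database the first timing would include class loading and cache filling, so the warm runs are the numbers worth comparing.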

Thanks a lot

Michael

On 09.11.2011 at 09:50, Hans Birkeland wrote:

> Ah sorry, I should have clarified - in the final application we will want to
> retrieve the nodes.  The reason we were just returning count(*) for the
> tests is that returning a large number of nodes in the web console proved
> less than ideal. :)


Re: [Neo4j] Expected performance question

2011-11-09 Thread Hans Birkeland
Ah sorry, I should have clarified - in the final application we will want to
retrieve the nodes.  The reason we were just returning count(*) for the
tests is that returning a large number of nodes in the web console proved
less than ideal. :)

--
View this message in context: 
http://neo4j-community-discussions.438527.n3.nabble.com/Expected-performance-question-tp3492892p3492961.html
Sent from the Neo4j Community Discussions mailing list archive at Nabble.com.


Re: [Neo4j] Expected performance question

2011-11-09 Thread Mattias Persson
Also, support for these heavy nodes with many relationships on them is
being worked on, so that you can get the relationship count per
type/direction immediately instead of iterating over them, as well as
loading only the relevant relationships instead of all of them.

In the meantime it would be better to keep a property per node containing
the number of relationships, kept up to date by your code when you
create/delete relationships. You could perhaps write that as a
TransactionEventHandler (if you're down at the Java level).
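
The bookkeeping idea can be sketched as follows. This is a self-contained Python illustration, not the Neo4j API: the class CountingGraph and its method names are hypothetical. In Neo4j the counter would instead be a node property updated in the same transaction that creates or deletes the relationship.

```python
class CountingGraph:
    """Hypothetical in-memory graph that keeps a relationship counter per node."""

    def __init__(self):
        self.relationships = set()  # (start, rel_type, end) triples
        self.rel_count = {}         # node -> number of attached relationships

    def create_relationship(self, start, rel_type, end):
        rel = (start, rel_type, end)
        if rel in self.relationships:
            return  # already present, counters stay unchanged
        self.relationships.add(rel)
        for node in (start, end):
            self.rel_count[node] = self.rel_count.get(node, 0) + 1

    def delete_relationship(self, start, rel_type, end):
        rel = (start, rel_type, end)
        if rel not in self.relationships:
            return  # nothing to delete, counters stay unchanged
        self.relationships.remove(rel)
        for node in (start, end):
            self.rel_count[node] -= 1


g = CountingGraph()
g.create_relationship("a", "DEPENDS_ON", "b")
g.create_relationship("a", "DEPENDS_ON", "c")
g.delete_relationship("a", "DEPENDS_ON", "b")
print(g.rel_count)  # reading a count is a lookup, no iteration over relationships
```

Updating the counter in the same place that creates or deletes the relationship keeps the two from drifting apart. A counter per type and direction would work the same way with a richer key.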

2011/11/9 Hans Birkeland 

> Thanks for the quick reply! :)
>
> This is the query: start n=node(159178) match n-[*1..4]->x return count(*)
>
> Hans



-- 
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com


Re: [Neo4j] Expected performance question

2011-11-09 Thread Hans Birkeland
Thanks for the quick reply! :)

This is the query: start n=node(159178) match n-[*1..4]->x return count(*)

Hans

--
View this message in context: 
http://neo4j-community-discussions.438527.n3.nabble.com/Expected-performance-question-tp3492892p3492924.html
Sent from the Neo4j Community Discussions mailing list archive at Nabble.com.


Re: [Neo4j] Expected performance question

2011-11-09 Thread Andres Taylor
Hi Hans,

First off - we have so far spent very little time making Cypher
awesome from a performance standpoint. Most of the energy has been put into
growing the syntax and expanding the feature set of Cypher.

The next version will hopefully involve a lot of performance work, and that
will change things.

Having said that - I'd love to know more about your query. What does it
look like?

Andrés

On Wed, Nov 9, 2011 at 9:08 AM, Hans Birkeland wrote:

> Hi,
>
> My team has been experimenting with using neo4j for dependency tracking.
> For one of our scenarios we have run into performance issues and are
> wondering if this is expected given the data, or if we are doing something
> wrong.
>
> Our test data consists of ~370k nodes and 4.1M relationships.  Some nodes
> have a large number of relationships while many have just a few.
>
> Starting with a relatively central node, using cypher through the web
> console we query for a count of all nodes with incoming relationships from
> the start node.  With increasing numbers of intermediate nodes we get the
> following results:
>
> For nodes 1 relationship away (~5k nodes), 170 ms.
> For nodes up to 2 relationships away (~120k nodes), 2 seconds.
> For nodes up to 3 relationships away (~670k nodes, there must obviously be
> duplicates here :), 10 seconds.
> Trying to go beyond 3 maxes out available memory, takes over one core and
> takes > 30 minutes (after which we kill it).
>
> We have been testing with version 1.4.1, 1.4.2 and 1.5.M02 on a machine
> with 4 cores and 12 gig RAM. Increasing the max Java heap size to 6, 8,
> or 10 gig did not have any noticeable effect.
>
> Any feedback or hints on settings we could try to tweak would be welcome.
>
>
> --
> View this message in context:
> http://neo4j-community-discussions.438527.n3.nabble.com/Expected-performance-question-tp3492892p3492892.html
> Sent from the Neo4j Community Discussions mailing list archive at
> Nabble.com.