John,

no problem :)

You pointed out both problems: 
- cold caches
- lots of rels on the one side

There are some performance implications of loading millions of rels from disk. 
We'll be working on improving those in 1.5.

The easiest way to solve that is to switch start and end-node which you already 
did. It is much easier for you _with domain knowledge_ about the graph than for 
the algorithm.

Especially if you have > 1 hop paths.

There might be ways in improving the algorithm, e.g. by iterating both sides at 
the same time, which would lead to the end with the fewer relationships being 
exhausted and resolved first. 

How large is your graph at all? And how is it structured, e.g. how many rels do 
the other 9 nodes have that your second node points to?

What are both times in a second or third run (cached) ?

Relationships are stored as a chained/linked list in the disk format so it is 
not possible to know how many there are per node.

Cheers

Michael

Am 23.07.2011 um 04:42 schrieb John cyuczieekc:

> Hey guys, me bugging you again :)
> 
> (This whole thing is kind of based on the lack of being able to get the
> number of relationships a node has)
> 
> If I have two nodes, and the first one has 1 million outgoing relationships
> of the type X to 1 million unique/different nodes,
> and the second node has 10 incoming relationships of type X (same type) of
> which one is from the first node,
> then using GraphAlgoFactory.shortestPath  (or suggest a better way?)
> How can I tell neo4j to iterate the search on the second node's incoming
> rels simply because it has 10 relationships instead of 1 million, in order
> to check if each relationship is in the form of firstNode-->secondNode ?
> 
> For the case when first node has 100,000 relationships and second node has
> 10,
> it takes *1.7 seconds* for shortestPath to find the only one link between
> them using:
> 
> final PathFinder<Path> finder = GraphAlgoFactory.shortestPath(
> Traversal.expanderForTypes( rel, Direction.OUTGOING  ), 1 );
> final Path foundPath = finder.findSinglePath( *one, two* );
> 
> I can put Direction.*BOTH *and get the same amount of time
> *Path from one to two: (one)-->(two) timedelta=1,862,726,634 ns*
> 
> *BUT*, get this: if I swap the nodes:
> finder.findSinglePath(* two, one*);
> and i use either Direction.INCOMING or Direction.*BOTH  *(which makes sense
> for the second node ,right) then I get *20ms* the time until it finishes...
> *Path from one to two: (two)<--(one) timedelta=20,830,111 ns*
> 
> (both cases are without data being priorly cached)
> 
> I was expecting it to act like this: (but only when using Direction.BOTH)
>  see which node has the least number of relationships and iterate on those,
> but this would work if findSinglePath would be made for depth 1 (aka
> particular case), but as I read "Tries to find a single path between startand
> end nodes." then it makes sense to me why it works like it does... that is,
> iterate on relationships from start node, rather than from end node... but
> I'm not sure if it would *not *make sense to iterate on the end node instead
> of start node, when knowing that end node has less relationships, for make
> the search faster (well at least if depth is one) - I didn't look into how
> neo4j actually does stuff yet :D
> 
> anyway, it's fairly clear to me that I could make a simple wrapper method to
> make this kind of search faster, *IF* I had the ability to know how many
> relationships each node has, so I can call findSinglePath  with the first
> param being the node with the least relationship count :) But as I
> understood it, it's not possible to find how many rels a node has... gimme
> feat! :)) [by not possible I mean, without having to iterate thru all and
> count them, which would make the use case here obsolete]
> 
> PS: clearly all the text I wrote here would benefit from being represented
> by a graph, just think about all those grouping with autohiding the ie. "[]"
> and all kinds of stuff... heh
> _______________________________________________
> Neo4j mailing list
> User@lists.neo4j.org
> https://lists.neo4j.org/mailman/listinfo/user

_______________________________________________
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user

Reply via email to