Michael,

The issue I am referring to does not pertain to traversing many relationships
at once, but to the impact that many relationships of one type have on
relationships of another type on the same node.

Example:

A topic class has 2 million outgoing relationships of type "HAS_INSTANCE" and
has 3 outgoing relationships of type "SUB_CLASS_OF".

Fetching the 3 relationships of type "SUB_CLASS_OF" takes a very long time,
I presume due to the presence of the 2 million other relationships.
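
To make that presumption concrete, here is a minimal sketch in Python. It is
a toy model, not Neo4j's actual relationship store: ChainedNode,
PartitionedNode and build are hypothetical names, and the assumption is that
all relationships of a node sit in one sequence that has to be scanned and
filtered by type. The second class shows the alternative layout asked about
earlier in this thread, with relationships grouped per type and direction.
The instance count is scaled down from 2 million to 200,000 to keep it quick.

```python
from collections import defaultdict


class ChainedNode:
    """Assumed layout: one sequence holding every relationship of the node."""

    def __init__(self):
        self.relationships = []  # entries are (rel_type, direction, other_node)

    def add(self, rel_type, direction, other):
        self.relationships.append((rel_type, direction, other))

    def get(self, rel_type, direction):
        """Return (matching nodes, number of records visited)."""
        visited = 0
        matches = []
        for t, d, other in self.relationships:
            visited += 1  # every record is touched, whatever its type
            if t == rel_type and d == direction:
                matches.append(other)
        return matches, visited


class PartitionedNode:
    """Hypothetical layout: relationships grouped per (type, direction)."""

    def __init__(self):
        self.groups = defaultdict(list)

    def add(self, rel_type, direction, other):
        self.groups[(rel_type, direction)].append(other)

    def get(self, rel_type, direction):
        """Return (matching nodes, number of records visited)."""
        matches = list(self.groups.get((rel_type, direction), []))
        return matches, len(matches)  # only the requested group is touched


def build(node, instances):
    """Fill a node with many HAS_INSTANCE and 3 SUB_CLASS_OF relationships."""
    for i in range(instances):
        node.add("HAS_INSTANCE", "OUT", i)
    for i in range(3):
        node.add("SUB_CLASS_OF", "OUT", "super_%d" % i)  # made-up targets
    return node


chained = build(ChainedNode(), 200_000)
partitioned = build(PartitionedNode(), 200_000)

print(chained.get("SUB_CLASS_OF", "OUT")[1])      # 200003 records visited
print(partitioned.get("SUB_CLASS_OF", "OUT")[1])  # 3 records visited
```

In this model both layouts return the same 3 relationships, but the chained
one pays for every "HAS_INSTANCE" record on the way. Whether the real store
behaves like the first class is exactly what I am asking.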

 

I have no need to ever fetch the "HAS_INSTANCE" relationships from
the topic node. That relationship is always traversed from the other direction.

I do want to know the class of a topic instance, leading to the topic class,
but I have no real interest in ever traversing all topic instances from the
topic class (at least not directly; I do want to know the most recent addition,
and that's what I use the Timeline index for).

Niels
 

> From: michael.hun...@neotechnology.com
> Date: Wed, 29 Jun 2011 17:50:08 +0200
> To: user@lists.neo4j.org
> Subject: Re: [Neo4j] traversing densely populated nodes
> 
> I think this is the same problem that Angelos is facing. We are currently 
> evaluating options to improve the performance on those highly connected 
> supernodes.
> 
> A traditional option is really to split them into groups, or even to shard 
> their relationships into a second layer.
> 
> We're looking into storage improvement options as well as modifications to 
> retrieval of that many relationships at once.
> 
> Cheers
> 
> Michael
> 
> On 29.06.2011 at 17:13, Niels Hoogeveen wrote:
> 
> > 
> > I achieve more or less the same result by placing the relationships in the 
> > Timeline index, which distributes the relationships over many nodes. 
> > There are workarounds for this issue, but I would really like to see a more 
> > transparent solution which doesn't require special interventions for 
> > special cases. 
> > I don't know the inner details of the relationship store and wonder if it 
> > is possible to partition relationships per node per relationship type per 
> > direction. It makes intuitive sense if there are many relationships of the 
> > same type and same direction that traversing those takes a lot of time. It 
> > doesn't make intuitive sense that relationships with another type and/or 
> > direction take a lot of time too.
> > Niels
> > 
> >> Date: Wed, 29 Jun 2011 16:36:57 +0200
> >> From: ntausc...@gmail.com
> >> To: user@lists.neo4j.org
> >> Subject: Re: [Neo4j] traversing densely populated nodes
> >> 
> >> I faced the same problem. Nodes with a lot of relationships are very
> >> slow to traverse (it takes a lot of time). I solved the problem by
> >> grouping the relationships using additional nodes. The dense node then
> >> has only a few relationships to different 'group' nodes. Each 'group'
> >> node then has again many relationships to other nodes.
> >> 
> >> Although this helps, it is a very ugly solution.
> >> 
> >> Best regards
> >> 
> >> Norbert Tausch
> >> 
> >> 
> >> On 29.06.2011 at 16:07, Niels Hoogeveen wrote:
> >>> Recently I have worked on loading the content of DbPedia into my database 
> >>> and have run into a performance issue.
> >>> My application has a meta-layer, inspired by the meta model component 
> >>> but rewritten in Scala.
> >>> All DbPedia resources are said to be an instance of "topic", 
> >>> creating a relationship from that resource node to the node that 
> >>> describes the topic class.
> >>> This makes the "topic class" node of course densely populated.
> >>> The "topic class" node has relationships other than "HAS_INSTANCE", 
> >>> for example "SUB_CLASS_OF", which states that the "topic class" node is a 
> >>> subclass of "typable". 
> >>> When trying to retrieve the "SUB_CLASS_OF" relationships of the "topic 
> >>> class" node performance degrades enormously. 
> >>> 
> >>> It looks (please correct me if I am wrong in my assumption) as if all 
> >>> relationships are being scanned 
> >>> to filter out the "SUB_CLASS_OF" relationships (of which there are very 
> >>> few, especially compared to the "HAS_INSTANCE" relationships).
> >>> I ended up placing all "HAS_INSTANCE" relationships into the Timeline 
> >>> index from Neo4j-graph-collections for two reasons: it's nice to know 
> >>> when a resource became an instance of a class (bonus), and it makes sure 
> >>> that no single node becomes heavily populated.
> >>> So far so good, but delving deeper into the Timeline index, I notice that 
> >>> the relationship between an entry node and the root of the tree is 
> >>> partially established by the use of a property on "entry node" which 
> >>> names the timeline index.
> >>> The simplest way to establish the relationship between an "entry node" 
> >>> and the tree root is by means of a Lucene index lookup.
> >>> This is of course not the fastest solution and actually would mean the 
> >>> same as adding a property to the "resource node", listing the classes a 
> >>> resource is an instance of.
> >>> Adding a relationship from "entry node" to "tree root" in the Timeline 
> >>> component would create yet another densely populated node in the database 
> >>> (in this case the tree root). 
> >>> Is there a way out of this situation? 
> >>> Would it be possible to partition the relationships in the database per 
> >>> relationship type per direction, so densely populated nodes can get 
> >>> traversed fast for those relationship types that are sparsely populated?
> >>> Niels
> >>> 
> >>> _______________________________________________
> >>> Neo4j mailing list
> >>> User@lists.neo4j.org
> >>> https://lists.neo4j.org/mailman/listinfo/user
                                          