I think this is the same problem that Angelos is facing. We are currently evaluating options to improve performance on those highly connected supernodes.
A traditional option is to split them into groups, or even to shard their relationships onto a second layer. We're looking into storage improvements as well as changes to how that many relationships are retrieved at once.

Cheers
Michael

On 29.06.2011 at 17:13, Niels Hoogeveen wrote:

>
> I achieve more or less the same result by placing the relationships in the
> Timeline index, which distributes the relationships over many nodes.
> There are workarounds for this issue, but I would really like to see a more
> transparent solution which doesn't require special interventions for special
> cases.
> I don't know the inner details of the relationship store and wonder if it is
> possible to partition relationships per node per relationship type per
> direction. It makes intuitive sense that traversal takes a lot of time when
> there are many relationships of the same type and same direction. It
> doesn't make intuitive sense that relationships with another type and/or
> direction take a lot of time too.
> Niels
>
>> Date: Wed, 29 Jun 2011 16:36:57 +0200
>> From: ntausc...@gmail.com
>> To: user@lists.neo4j.org
>> Subject: Re: [Neo4j] traversing densely populated nodes
>>
>> I faced the same problem. Nodes with a lot of relationships are very
>> difficult to traverse (it takes a lot of time). I solved the problem by
>> grouping the relationships using additional nodes. The dense node then
>> has only a few relationships to different 'group' nodes. Each 'group'
>> node in turn has many relationships to other nodes.
>>
>> Although this helps, it is a very ugly solution.
>>
>> Best regards
>>
>> Norbert Tausch
>>
>>
>> On 29.06.2011 16:07, Niels Hoogeveen wrote:
>>> Recently I have worked on loading the content of DbPedia into my database
>>> and ran into a performance issue.
>>> My application has a meta-layer, inspired by the meta model component, but
>>> rewritten in Scala.
>>> All DbPedia resources are said to be an instance of "topic",
>>> creating a relationship from that resource node to the node that describes
>>> the topic class.
>>> This of course makes the "topic class" node densely populated.
>>> The "topic class" node has relationships other than "HAS_INSTANCE",
>>> for example "SUB_CLASS_OF", which states that the "topic class" node is a
>>> subclass of "typable".
>>> When trying to retrieve the "SUB_CLASS_OF" relationships of the "topic
>>> class" node, performance degrades enormously.
>>>
>>> It looks (please correct me if I am wrong in my assumption) as if all
>>> relationships are being scanned
>>> to filter out the "SUB_CLASS_OF" relationships (of which there are very
>>> few, especially compared to the "HAS_INSTANCE" relationships).
>>> I ended up placing all "HAS_INSTANCE" relationships into the Timeline index
>>> from Neo4j-graph-collections for two reasons: it's nice to know when a
>>> resource became an instance of a class (bonus), and it makes sure that no
>>> single node becomes heavily populated.
>>> So far so good, but delving deeper into the Timeline index, I notice that
>>> the relationship between an entry node and the root of the tree is partially
>>> established by the use of a property on "entry node" which names the
>>> timeline index.
>>> The simplest way to establish the relationship between an "entry node" and
>>> the tree root is by means of a Lucene index lookup.
>>> This is of course not a very fast solution and actually would mean the
>>> same as adding a property to the "resource node", listing the classes a
>>> resource is an instance of.
>>> Adding a relationship from "entry node" to "tree root" in the Timeline
>>> component would create yet another densely populated node in the database
>>> (in this case the tree root).
>>> Is there a way out of this situation?
>>> Would it be possible to partition the relationships in the database per
>>> relationship type per direction, so densely populated nodes can get
>>> traversed fast for those relationship types that are sparsely populated?
>>> Niels
>>>
>>> _______________________________________________
>>> Neo4j mailing list
>>> User@lists.neo4j.org
>>> https://lists.neo4j.org/mailman/listinfo/user
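
For illustration, the two ideas discussed in this thread (partitioning a node's relationships per type and direction, and the 'group' node workaround) can be sketched as a toy in-memory model in plain Python. This is not Neo4j's relationship store or API. The class names ScanNode and PartitionedNode, the "GROUP" relationship type, and the counts are all assumptions made up for the example, and the "visited" counter only stands in for the cost of touching a relationship record.

```python
from collections import defaultdict

OUT = "OUT"  # a single direction is enough for this illustration


class ScanNode:
    """Hypothetical node that keeps all relationships in one list.

    A lookup by type has to walk every relationship, which is the
    behaviour the thread suspects for dense nodes.
    """

    def __init__(self):
        self.rels = []      # (rel_type, direction, other) tuples
        self.visited = 0    # relationships touched by lookups so far

    def add(self, rel_type, direction, other):
        self.rels.append((rel_type, direction, other))

    def get(self, rel_type, direction):
        found = []
        for t, d, other in self.rels:
            self.visited += 1
            if t == rel_type and d == direction:
                found.append(other)
        return found


class PartitionedNode:
    """Hypothetical node with one bucket per (type, direction)."""

    def __init__(self):
        self.buckets = defaultdict(list)
        self.visited = 0

    def add(self, rel_type, direction, other):
        self.buckets[(rel_type, direction)].append(other)

    def get(self, rel_type, direction):
        bucket = self.buckets.get((rel_type, direction), [])
        self.visited += len(bucket)  # only the matching bucket is touched
        return list(bucket)


def populate_flat(node, instances):
    """Attach every instance directly to the class node."""
    for i in range(instances):
        node.add("HAS_INSTANCE", OUT, i)
    node.add("SUB_CLASS_OF", OUT, "typable")
    return node


def populate_grouped(hub, instances, groups):
    """Workaround: spread instances over 'group' nodes below the hub."""
    shards = [ScanNode() for _ in range(groups)]
    for g in range(groups):
        hub.add("GROUP", OUT, g)  # "GROUP" is an invented type name
    for i in range(instances):
        shards[i % groups].add("HAS_INSTANCE", OUT, i)
    hub.add("SUB_CLASS_OF", OUT, "typable")
    return hub, shards


N = 100_000
flat = populate_flat(ScanNode(), N)
part = populate_flat(PartitionedNode(), N)
hub, shards = populate_grouped(ScanNode(), N, groups=100)

# All three layouts return the same answer for the sparse type.
for node in (flat, part, hub):
    assert node.get("SUB_CLASS_OF", OUT) == ["typable"]

# Relationships touched per lookup: 100001 (flat), 1 (partitioned), 101 (grouped)
print(flat.visited, part.visited, hub.visited)
```

In this toy model the grouped layout gets close to the partitioned one for the sparse "SUB_CLASS_OF" lookup, but only because the application reshaped its own data, which is the "special intervention" the thread would like to avoid. Whether a real store could partition per type and direction at acceptable cost is not something this sketch can answer.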