Can't you rebalance the tree by just marking nodes as deleted? (setting 
properties or collecting their id's somewhere?)

Michael

Am 09.07.2011 um 00:29 schrieb Craig Taverner:

> Not sure if this will be the same for you, but we were not able to use the
> RTree in batch inserter mode, since the batch inserter does not support node
> deletion and that is required during tree rebalancing. So our OSM importer
> in batch mode actually imports everything first, must like you describe, and
> the indexes in the normal non-batch mode afterwards.
> 
> As Peter says, the normal API has been catching up with the batch API for
> many cases, and we have also moved our OSM importer to the normal API. I'm
> guessing the same might happen everywhere.
> 
> On Fri, Jul 8, 2011 at 11:56 AM, Niels Hoogeveen
> <pd_aficion...@hotmail.com>wrote:
> 
>> 
>> I think the best solution to use the batch inserter in conjunction with
>> indexed relationships is to first run the batch inserter and temporarily
>> store the relationships to be indexed on specifically designated nodes. Make
>> sure that each of those nodes does not have more than a set maximum number
>> of relationships. After batch insert, the temporary relationships can be
>> traversed and inserted in the appropriate index trees.
>> The nicest solution would be to have a class BatchInsertIndexedRelationship
>> which has more or less the same interface as the regular IndexedRelationship
>> class, which temporarily stores relationships without indexing them and when
>> batch insert is finished, a call to optimize() will do the actual index
>> inserts and remove the temporarily stored relationships.
>> 
>> Niels
>>> From: michael.hun...@neotechnology.com
>>> Date: Fri, 8 Jul 2011 09:56:30 +0200
>>> To: user@lists.neo4j.org
>>> Subject: Re: [Neo4j] Indexed relationships
>>> 
>>> But I'm not sure when in 1.5 this is going to be addressed.
>>> 
>>> So if Niels could look at the effort needed and perhaps pair with Andrew
>> on getting this done (or at least give him some pointers to implement it)
>> this would be great!
>>> 
>>> Cheers
>>> 
>>> Michael
>>> 
>>> Am 08.07.2011 um 07:07 schrieb Peter Neubauer:
>>> 
>>>> Andrew,
>>>> I think in the long run we are moving away from the BatchInserter, and
>>>> instead going to find better ways to avoid disk IO, e.g. ordered
>>>> writes per transaction, so that normal mode can be sped up to match
>>>> the BatchInserter, that hopefully will make it into 1.5, so I would
>>>> not worry about BatchInserter right now.
>>>> 
>>>> Cheers,
>>>> 
>>>> /peter neubauer
>>>> 
>>>> GTalk:      neubauer.peter
>>>> Skype       peter.neubauer
>>>> Phone       +46 704 106975
>>>> LinkedIn   http://www.linkedin.com/in/neubauer
>>>> Twitter      http://twitter.com/peterneubauer
>>>> 
>>>> http://www.neo4j.org               - Your high performance graph
>> database.
>>>> http://startupbootcamp.org/    - Ă–resund - Innovation happens HERE.
>>>> http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing
>> party.
>>>> 
>>>> 
>>>> 
>>>> On Fri, Jul 8, 2011 at 4:27 AM, Andrew White <li...@andrewewhite.net>
>> wrote:
>>>>> I don't know if this is the right place to ask this; but does it
>> support
>>>>> a batch insert mode? When I am bulk loading data I don't have Node
>>>>> objects to pass around, only node ids.
>>>>> 
>>>>> Thanks,
>>>>> Andrew
>>>>> 
>>>>> On 07/07/2011 06:19 PM, Niels Hoogeveen wrote:
>>>>>> I created a wiki page for indexed relationships in the Git repo, see:
>> 
>> https://github.com/peterneubauer/graph-collections/wiki/Indexed-relationships
>>>>>> 
>>>>>>> From: michael.hun...@neotechnology.com
>>>>>>> Date: Thu, 7 Jul 2011 22:53:05 +0200
>>>>>>> To: user@lists.neo4j.org
>>>>>>> Subject: Re: [Neo4j] Indexed relationships
>>>>>>> 
>>>>>>> Could you put these code examples into the Readme for the project or
>> on a wiki page?
>>>>>>> 
>>>>>>> Am 07.07.2011 um 22:11 schrieb Niels Hoogeveen:
>>>>>>> 
>>>>>>>> IndexedRelationship and IndexedRelationshipExpander are now in Git.
>> See:
>> https://github.com/peterneubauer/graph-collections/tree/master/src/main/java/org/neo4j/collections/indexedrelationship
>>>>>>>> An example:
>>>>>>>> class IdComparator implements java.util.Comparator<Node>{
>>>>>>>>  public int compare(Node n1, Node n2){
>>>>>>>>    long l1 = Long.reverse(n1.getId());
>>>>>>>>    long l2 = Long.reverse(n2.getId());
>>>>>>>>    if(l1 == l2) return 0;
>>>>>>>>    else if(l1<  l2) return -1;
>>>>>>>>    else return 1;
>>>>>>>>  }
>>>>>>>> }static enum RelTypes implements RelationshipType{
>>>>>>>>  DIRECT_RELATIONSHIP,
>>>>>>>>  INDEXED_RELATIONSHIP,
>>>>>>>> };
>>>>>>>> Node indexedNode = graphDb().createNode();
>>>>>>>> IndexedRelationship ir = new
>> IndexedRelationship(RelTypes.INDEXED_RELATIONSHIP, Direction.OUTGOING, new
>> IdComparator(), true, indexedNode, graphDb());
>>>>>>>> 
>>>>>>>> Node n1 = graphDb().createNode();
>>>>>>>> n1.setProperty("name", "n1");
>>>>>>>> Node n2 = graphDb().createNode();
>>>>>>>> n2.setProperty("name", "n2");
>>>>>>>> Node n3 = graphDb().createNode();
>>>>>>>> n3.setProperty("name", "n3");
>>>>>>>> Node n4 = graphDb().createNode();
>>>>>>>> n4.setProperty("name", "n4");
>>>>>>>> 
>>>>>>>> indexedNode.createRelationshipTo(n1, RelTypes.DIRECT_RELATIONSHIP);
>>>>>>>> indexedNode.createRelationshipTo(n3, RelTypes.DIRECT_RELATIONSHIP);
>>>>>>>> ir.createRelationshipTo(n2);
>>>>>>>> ir.createRelationshipTo(n4);
>>>>>>>> 
>>>>>>>> IndexedRelationshipExpander re1 = new
>> IndexedRelationshipExpander(graphDb(), Direction.OUTGOING,
>> RelTypes.DIRECT_RELATIONSHIP);
>>>>>>>> IndexedRelationshipExpander re2 = new
>> IndexedRelationshipExpander(graphDb(), Direction.OUTGOING,
>> RelTypes.INDEXED_RELATIONSHIP);
>>>>>>>> 
>>>>>>>> for(Relationship rel: re1.expand(indexedNode)){
>>>>>>>>  System.out.println(rel.getEndNode().getProperty("name"));
>>>>>>>> }
>>>>>>>> for(Relationship rel: re2.expand(indexedNode)){
>>>>>>>>  System.out.println(re2.getEndNode().getProperty("name"));
>>>>>>>> }
>>>>>>>>> From: pd_aficion...@hotmail.com
>>>>>>>>> To: user@lists.neo4j.org
>>>>>>>>> Date: Thu, 7 Jul 2011 16:55:36 +0200
>>>>>>>>> Subject: Re: [Neo4j] Indexed relationships
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Hi Michael,I realize that the implementation of
>> IndexedRelationship can in fact support returning relationships, and I have
>> a preliminary version running locally now.The returned relationships can
>> support all methods of the Relationship interface, returning the node
>> pointing to the treeRoot as the startNode, and returning the node set as the
>> key_value as the endNode.All relationship properties will be stored on the
>> KEY_VALUE relationship pointing to the endNode.There is one caveat to this
>> solution, the returned relationships cannot support the getId() method,and
>> will throw an UnsupportedOperationException when being
>> called.IndexedRelationship will implement Iterable<Relationship>.With these
>> changes, it is possible to create an Expander and I am working right now to
>> implement that.Niels
>>>>>>>>> 
>>>>>>>>>> From: pd_aficion...@hotmail.com
>>>>>>>>>> To: user@lists.neo4j.org
>>>>>>>>>> Date: Thu, 7 Jul 2011 14:46:35 +0200
>>>>>>>>>> Subject: Re: [Neo4j] Indexed relationships
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> Hi Michael,
>>>>>>>>>> 
>>>>>>>>>> I haven't yet worked on an example.
>>>>>>>>>> 
>>>>>>>>>> There are tests for the SortedTree implementation,
>>>>>>>>>> but didn't add those to the IndexedRelationship class,
>>>>>>>>>> which is simply a wrapper around SortedTree.
>>>>>>>>>> Having a test would have caught the error
>>>>>>>>>> that no relationship to the treeNode was created
>>>>>>>>>> (fixed that bug and pushed it to Git)
>>>>>>>>>> (note to self: always create a unit test,
>>>>>>>>>> especially when code seems trivial).
>>>>>>>>>> 
>>>>>>>>>> There is no relationship expander that uses this.
>>>>>>>>>> The RelationshipExpander has a method Iterable<Relationship>
>> expand(Node node)
>>>>>>>>>> which cannot be supported, since there is no direct relationship
>> from startnode to endnode.
>>>>>>>>>> Instead there is a path through the index tree.
>>>>>>>>>> 
>>>>>>>>>> It's not possible to support the original relationship-traversal
>> API
>>>>>>>>>> since the IndexedRelationship class is not a wrapper around a
>> node,
>>>>>>>>>> but a wrapper around the relationships of a certain
>> RelationshipType in the OUTGOING direction.
>>>>>>>>>> 
>>>>>>>>>> As to the name of the class.
>>>>>>>>>> It is essentially an indexed relationship,
>>>>>>>>>> and not just a solution to the densely-connected-node problem.
>>>>>>>>>> An indexed relationship can also be used to maintain
>>>>>>>>>> a sorted set of relationships of any size,
>>>>>>>>>> and can be used to guarantee unicity constraints. Niels
>>>>>>>>>>> From: michael.hun...@neotechnology.com
>>>>>>>>>>> Date: Thu, 7 Jul 2011 13:27:00 +0200
>>>>>>>>>>> To: user@lists.neo4j.org
>>>>>>>>>>> Subject: Re: [Neo4j] Indexed relationships
>>>>>>>>>>> 
>>>>>>>>>>> Good work,
>>>>>>>>>>> 
>>>>>>>>>>> do you have an example ready (and/or some tests that show how it
>> works/is used) ?
>>>>>>>>>>> 
>>>>>>>>>>> In creation, manual traversal and automatic traversal (i.e. is
>> there a RelationshipExpander that uses it).
>>>>>>>>>>> 
>>>>>>>>>>> And in the constructor if there is no relationship to the
>> treeNode, you create a new one, but that new treeNode is not connected to
>> the actual node?
>>>>>>>>>>> 
>>>>>>>>>>> I'm not sure if it should support the original
>> relationship-traversal API / methods (getRelationships(Dir,type), etc).
>>>>>>>>>>> 
>>>>>>>>>>> Perhaps that IndexedRelationship should rather be just a wrapper
>> around a SuperNode ? So probably rename it to "SuperNode(Wrapper) or
>> HeavilyConnectedNode(Wrapper) ?)
>>>>>>>>>>> 
>>>>>>>>>>> Cheers
>>>>>>>>>>> 
>>>>>>>>>>> Michael
>>>>>>>>>>> 
>>>>>>>>>>> Am 07.07.2011 um 12:51 schrieb Niels Hoogeveen:
>>>>>>>>>>> 
>>>>>>>>>>>> Finished the implementation of indexed relationships. The graph
>> collections component now contains the package
>> https://github.com/peterneubauer/graph-collections/tree/master/src/main/java/org/neo4j/collections/indexedrelationship,
>> containing the IndexedRelationship class.
>>>>>>>>>>>> This class can be used instead of regular relationships
>> when:relationships need to be stored in a particular sort ordera unicity
>> constraint needs to be guaranteed nodes become densely populated with
>> relationships.
>>>>>>>>>>>> The implementation is traverser friendly. Given a start nodes
>> all end nodes can be found by following four relationships types in outgoing
>> direction. Given an end node the start node can be found by following these
>> four relationship types in incoming direction. Of course this functionality
>> is also covered in the API.
>>>>>>>>>>>> Niels
>>>>>>>>>>>> 
>>>>>>>>>>>>> From: pd_aficion...@hotmail.com
>>>>>>>>>>>>> To: user@lists.neo4j.org
>>>>>>>>>>>>> Date: Thu, 7 Jul 2011 02:36:29 +0200
>>>>>>>>>>>>> Subject: Re: [Neo4j] Indexed relationships
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Pushed SortedTree to Git after adding a unit test and doing
>> some debugging.
>>>>>>>>>>>>> TODO:Add API for indexed relationships using SortedTree as the
>> implementation.Make SortedTree thread safe.
>>>>>>>>>>>>> With regard to the latter issue. I am considering the
>> following solution. Acquire a lock (delete a non existent property) on the
>> node that points to the root of the tree at the start of AddNode, RemoveNode
>> and Delete. No other node in the SortedTree is really stable, even the
>> rootnode may be moved down, turning another node into the new rootnode,
>> while after a couple of remove actions the original rootnode may even be
>> deleted.
>>>>>>>>>>>>> Locking the node pointing to the rootnode, prevents all other
>> threads/transactions from updating the tree. This may seem restrictive, but
>> a single new entry or a single removal may in fact have impact on much of
>> the tree, due to balancing. More selective locking would require a
>> prebalancing tree walk, determining the affected subtrees, lock them and
>> once every affected subtree is locked, perform the actual balancing.
>>>>>>>>>>>>> Please let me hear if there are any objections to locking the
>> node pointing to the tree as the a solution to make SortedTree thread safe.
>>>>>>>>>>>>> Niels
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Date: Tue, 5 Jul 2011 08:27:57 +0200
>>>>>>>>>>>>>> From: neubauer.pe...@gmail.com
>>>>>>>>>>>>>> To: user@lists.neo4j.org
>>>>>>>>>>>>>> Subject: Re: [Neo4j] Indexed relationships
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Great work Nils!
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> /peter
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Sent from my phone.
>>>>>>>>>>>>>> On Jul 4, 2011 11:39 PM, "Niels Hoogeveen"<
>> pd_aficion...@hotmail.com>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>> Made some more changes to the SortedTree implementation.
>> Previously
>>>>>>>>>>>>>> SortedTree would throw an exception if a duplicate entry was
>> being added.
>>>>>>>>>>>>>>> I changed SortedTree to allow a key to point to more than
>> one node, unless
>>>>>>>>>>>>>> the SortedTree is created as a unique index, in which case an
>> exception is
>>>>>>>>>>>>>> raised when an attempt is made to add a node to an existing
>> key entry.
>>>>>>>>>>>>>>> A SortedTree once defined as unique can not be changed to a
>> non-unique
>>>>>>>>>>>>>> index or vice-versa.
>>>>>>>>>>>>>>> SortedTrees now have a name, which is stored in the a
>> property of the
>>>>>>>>>>>>>> TREE_ROOT relationship and in the KEY_VALUE relationship (a
>> new relationship
>>>>>>>>>>>>>> that points from the SortedTree to the Node inserted in the
>> SortedTree). The
>>>>>>>>>>>>>> name of a SortedTree can not be changed.
>>>>>>>>>>>>>>> SortedTrees now store the class of the Comparator, so a
>> SortedTree, once
>>>>>>>>>>>>>> created, can not be used with a different Comparator.
>>>>>>>>>>>>>>> SortedTree is now an Iterable, making it possible to use it
>> in a
>>>>>>>>>>>>>> foreach-loop.
>>>>>>>>>>>>>>> Since there are as of yet, no unit tests for SortedTree, I
>> will create
>>>>>>>>>>>>>> those first before pushing my changes to Git. Preliminary
>> results so far are
>>>>>>>>>>>>>> good. I integrated the changes in my own application and it
>> seems to work
>>>>>>>>>>>>>> fine.
>>>>>>>>>>>>>>> Todo:
>>>>>>>>>>>>>>> Decide on an API for indexed relationships. (Community input
>> still
>>>>>>>>>>>>>> welcome).Write unit tests.Make SortedTree thread safe
>> (Community help still
>>>>>>>>>>>>>> welcome).
>>>>>>>>>>>>>>> Niels
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> From: pd_aficion...@hotmail.com
>>>>>>>>>>>>>>>> To: user@lists.neo4j.org
>>>>>>>>>>>>>>>> Date: Mon, 4 Jul 2011 15:49:45 +0200
>>>>>>>>>>>>>>>> Subject: Re: [Neo4j] Indexed relationships
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> I forgot to add another recurrent issue that can be solved
>> with indexed
>>>>>>>>>>>>>> relationships: guaranteed unicity constraints.
>>>>>>>>>>>>>>>>> From: pd_aficion...@hotmail.com
>>>>>>>>>>>>>>>>> To: user@lists.neo4j.org
>>>>>>>>>>>>>>>>> Date: Mon, 4 Jul 2011 01:55:08 +0200
>>>>>>>>>>>>>>>>> Subject: [Neo4j] Indexed relationships
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> In the thread [Neo4j] traversing densely populated nodes
>> we discussed
>>>>>>>>>>>>>> the problems arising when large numbers of relationships are
>> added to the
>>>>>>>>>>>>>> same node.
>>>>>>>>>>>>>>>>> Over the weekend, I have worked on a solution for the
>>>>>>>>>>>>>> dense-relationship-nodes using SortedTree in the
>> neo-graph-collections
>>>>>>>>>>>>>> component. After some minor tweaks to the implementation of
>> SortedTree, I
>>>>>>>>>>>>>> have managed to get a workable solution, where two nodes are
>> not directly
>>>>>>>>>>>>>> linked by a relationship, but by means of a BTree (entirely
>> stored in the
>>>>>>>>>>>>>> graph).
>>>>>>>>>>>>>>>>> Before continuing this work, I'd like to have a discussion
>> about
>>>>>>>>>>>>>> features, since what we have now is not just a solution for
>> the dense
>>>>>>>>>>>>>> populated node issue, but is actually a full fledges indexed
>> relationship,
>>>>>>>>>>>>>> which makes it suitable for other purposes too.
>>>>>>>>>>>>>>>>> An indexed relationship can for example be used to
>> maintain a sorted
>>>>>>>>>>>>>> set of relationships in the graph, that is not necessarily
>> huge, but large
>>>>>>>>>>>>>> enough to make sorting on internal memory too expensive an
>> operation, or
>>>>>>>>>>>>>> situations where only one out of a large number of
>> relationships is actually
>>>>>>>>>>>>>> traversed in most cases.
>>>>>>>>>>>>>>>>> There are probably more use cases for in-graph indexed
>> relationships,
>>>>>>>>>>>>>> so I'd like to know what features are desirable and what API
>> would Neo4J
>>>>>>>>>>>>>> users appreciate.
>>>>>>>>>>>>>>>>> P.S. I still think it would be good to consider, if
>> technically
>>>>>>>>>>>>>> possible, partitioning the relationship store per
>> relationship type and per
>>>>>>>>>>>>>> direction. The indexed relationship solution works, but is of
>> course slower
>>>>>>>>>>>>>> than a direct relationship, both with respect to insert time
>> and traversal
>>>>>>>>>>>>>> time. If dense relationships are never traversed going out of
>> the dense
>>>>>>>>>>>>>> node, the extra structure maintained by the BTree is only
>> extra burden.
>>>>>>>>>>>>>>>>> P.P.S. If there are people with experience to make an
>> implementation
>>>>>>>>>>>>>> thread safe, please volunteer to help make the implementation
>> production
>>>>>>>>>>>>>> proof.
>>>>>>>>>>>>>>>>> Niels
>>>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>>>> Neo4j mailing list
>>>>>>>>>>>>>>>>> User@lists.neo4j.org
>>>>>>>>>>>>>>>>> https://lists.neo4j.org/mailman/listinfo/user
>>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>>> Neo4j mailing list
>>>>>>>>>>>>>>>> User@lists.neo4j.org
>>>>>>>>>>>>>>>> https://lists.neo4j.org/mailman/listinfo/user
>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>> Neo4j mailing list
>>>>>>>>>>>>>>> User@lists.neo4j.org
>>>>>>>>>>>>>>> https://lists.neo4j.org/mailman/listinfo/user
>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>> Neo4j mailing list
>>>>>>>>>>>>>> User@lists.neo4j.org
>>>>>>>>>>>>>> https://lists.neo4j.org/mailman/listinfo/user
>>>>>>>>>>>>> 
>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>> Neo4j mailing list
>>>>>>>>>>>>> User@lists.neo4j.org
>>>>>>>>>>>>> https://lists.neo4j.org/mailman/listinfo/user
>>>>>>>>>>>> 
>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>> Neo4j mailing list
>>>>>>>>>>>> User@lists.neo4j.org
>>>>>>>>>>>> https://lists.neo4j.org/mailman/listinfo/user
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> Neo4j mailing list
>>>>>>>>>>> User@lists.neo4j.org
>>>>>>>>>>> https://lists.neo4j.org/mailman/listinfo/user
>>>>>>>>>> 
>>>>>>>>>> _______________________________________________
>>>>>>>>>> Neo4j mailing list
>>>>>>>>>> User@lists.neo4j.org
>>>>>>>>>> https://lists.neo4j.org/mailman/listinfo/user
>>>>>>>>> 
>>>>>>>>> _______________________________________________
>>>>>>>>> Neo4j mailing list
>>>>>>>>> User@lists.neo4j.org
>>>>>>>>> https://lists.neo4j.org/mailman/listinfo/user
>>>>>>>> 
>>>>>>>> _______________________________________________
>>>>>>>> Neo4j mailing list
>>>>>>>> User@lists.neo4j.org
>>>>>>>> https://lists.neo4j.org/mailman/listinfo/user
>>>>>>> _______________________________________________
>>>>>>> Neo4j mailing list
>>>>>>> User@lists.neo4j.org
>>>>>>> https://lists.neo4j.org/mailman/listinfo/user
>>>>>> 
>>>>>> _______________________________________________
>>>>>> Neo4j mailing list
>>>>>> User@lists.neo4j.org
>>>>>> https://lists.neo4j.org/mailman/listinfo/user
>>>>>> 
>>>>> 
>>>>> _______________________________________________
>>>>> Neo4j mailing list
>>>>> User@lists.neo4j.org
>>>>> https://lists.neo4j.org/mailman/listinfo/user
>>>>> 
>>>> _______________________________________________
>>>> Neo4j mailing list
>>>> User@lists.neo4j.org
>>>> https://lists.neo4j.org/mailman/listinfo/user
>>> 
>>> _______________________________________________
>>> Neo4j mailing list
>>> User@lists.neo4j.org
>>> https://lists.neo4j.org/mailman/listinfo/user
>> 
>> _______________________________________________
>> Neo4j mailing list
>> User@lists.neo4j.org
>> https://lists.neo4j.org/mailman/listinfo/user
>> 
> _______________________________________________
> Neo4j mailing list
> User@lists.neo4j.org
> https://lists.neo4j.org/mailman/listinfo/user

_______________________________________________
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user

Reply via email to