Niels, Bryce, great! Gave you access to the repo, please merge :) /peter
On Thu, Sep 8, 2011 at 2:32 PM, Niels Hoogeveen <pd_aficion...@hotmail.com> wrote: > > Excellent... I did a code review and think this is a huge improvement over > what we had. > Peter, can you pull these changes, I no longer have the privs to do so. > Niels > >> Date: Thu, 8 Sep 2011 17:24:44 +1200 >> From: bryc...@gmail.com >> To: user@lists.neo4j.org >> Subject: Re: [Neo4j] Issues with IndexedRelationship >> >> I have made the changes in regards to SortedTree in regards to relationships >> vs nodes, and have got all the tests passing. The changes are pushed up to >> my github account (and pull request has been raised). >> >> The changes can be seen here: >> https://github.com/brycenz/graph-collections >> >> On Thu, Sep 8, 2011 at 3:41 PM, Bryce <bryc...@gmail.com> wrote: >> >> > Another thought if there is going to be a larger refactor of the code is >> > whether the indexing mechanism should be broken out as a strategy for the >> > IndexedRelationship. At present it is tied to SortedTree, but if an >> > interface was extracted out that had addNode, removeNode, iterator, and >> > isUniqueIndex then other indexing implementations could be used in certain >> > cases. >> > >> > The particular other implementation I am currently thinking of that could >> > be of use to me would be a paged linked list. So that would have a linked >> > list of pages, each with min < x < max KEY_VALUE (or equivalent) >> > relationships. I think that could work quite well for the situation where >> > the index is descending date ordered, and generally just appended at the >> > most recent end, and results are retrieved in a paged manner generally from >> > near the most recent. >> > >> > But more to the point there could be any number of implementations that >> > would be good for given different situations. >> > >> > That does bring up a question though, there was some discussion a while ago >> > about some functionality along the lines of IndexedRelationship being >> > pulled >> > into the core, so is that overkill for now if there is going to be another >> > core offering later? >> > >> > >> > On Thu, Sep 8, 2011 at 2:38 PM, Niels Hoogeveen <pd_aficion...@hotmail.com >> > > wrote: >> > >> >> >> >> I think we don't have to worry about backwards compatibility much yet. >> >> There has not been a formal release of the component, so if there are >> >> people >> >> using the software, they will accept that they are bleeding edgers. >> >> Indeed addNode should return the KEY_VALUE relationship and I think we >> >> should change the signature of SortedTree to turn it into >> >> Iterable<Relationship>. No need to maintain a Node iterator, the node is >> >> always one getEndNode away. >> >> Niels >> >> >> >> > Date: Thu, 8 Sep 2011 14:17:59 +1200 >> >> > From: bryc...@gmail.com >> >> > To: user@lists.neo4j.org >> >> > Subject: Re: [Neo4j] Issues with IndexedRelationship >> >> > >> >> > Will have to experiment with changing my id's to be stored as longs, it >> >> does >> >> > make perfect sense really that it would be better. Thanks for the hint. >> >> > >> >> > In regards to SortedTree returning the KEY_VALUE relationship instead of >> >> the >> >> > end Node, I had thought of that too, and it would definitely help. >> >> Could >> >> > end up being a significant change to SortedTree though, e.g.: >> >> > sortedTree.addNode( node ); >> >> > Would need to return the KEY_VALUE relationship instead of a boolean. >> >> Which >> >> > not knowing where else SortedTree is used could be a large change? >> >> > >> >> > Maybe SortedTree would have two iterator's available a key_value >> >> > relationship iterator, and a node iterator. Having a quick look at it >> >> now >> >> > it seems that it could work ok that way without introducing much code >> >> > duplication. >> >> > >> >> > On Thu, Sep 8, 2011 at 12:46 PM, Niels Hoogeveen >> >> > <pd_aficion...@hotmail.com>wrote: >> >> > >> >> > > >> >> > > Two longs is certainly cheaper than a string. Two longs take 128 bit >> >> and >> >> > > are stored in the main record of the PropertyContainer, while a String >> >> would >> >> > > require a 64 bit "pointer" in the main record of the >> >> PropertyContainer, and >> >> > > an additional read in the String store where the string representation >> >> will >> >> > > take up 256 bits. So both memory-wise, as perfomance wise, it is >> >> better to >> >> > > store a UUID as two long values. >> >> > > >> >> > > >> >> > > The main issue is something that needs a deeper fix than adding ID's. >> >> > > SortedTree now returns Nodes when traversing the tree. We should >> >> however >> >> > > return the KEY_VALUE Relationship to the indexed Node. Then >> >> > > IndexedRelationship.DirectRelationship can be created with that >> >> relationship >> >> > > as an argument. We get the Direction and the RelationshipType for >> >> free. >> >> > > Niels >> >> > > >> >> > > > Date: Thu, 8 Sep 2011 11:36:11 +1200 >> >> > > > From: bryc...@gmail.com >> >> > > > To: user@lists.neo4j.org >> >> > > > Subject: Re: [Neo4j] Issues with IndexedRelationship >> >> > > > >> >> > > > Hi Niels, >> >> > > > >> >> > > > Sorry I didn't quite write the bit about (1) clearly enough. The >> >> problem >> >> > > is >> >> > > > that it presently throws an Exception where it shouldn't. >> >> > > > >> >> > > > This stems from IndexedRelationship.DirectRelationship: >> >> > > > this.endRelationship = endNode.getSingleRelationship( >> >> > > > SortedTree.RelTypes.KEY_VALUE, Direction.INCOMING ); >> >> > > > >> >> > > > So if the end node has more than one incoming KEY_VALUE relationship >> >> a >> >> > > more >> >> > > > than one relationship exception is thrown. >> >> > > > >> >> > > > Instead of the getSingleRelationship I was planning on iterating >> >> over the >> >> > > > relationships and matching the UUID stored at the root end of the IR >> >> with >> >> > > > one of the KEY_VALUE relationships (which is why using a unique id >> >> is >> >> > > > necessary rather than the relationship type). Note: there will >> >> actually >> >> > > > still be an issue if the same IR has multiple relationships to the >> >> same >> >> > > leaf >> >> > > > node - still thinking about that might need . >> >> > > > >> >> > > > Is storing the UUID as two longs much quicker than storing it as a >> >> > > string? >> >> > > > Curious about this since in my current model I have all the domain >> >> > > objects >> >> > > > with UUID's, and these are all stored as strings. If it was going >> >> to >> >> > > help >> >> > > > with either memory or performance then I would be keen to migrate >> >> this to >> >> > > > two longs. >> >> > > > >> >> > > > Cheers >> >> > > > Bryce >> >> > > > >> >> > > > On Thu, Sep 8, 2011 at 11:07 AM, Niels Hoogeveen >> >> > > > <pd_aficion...@hotmail.com>wrote: >> >> > > > >> >> > > > > >> >> > > > > Great work Bryce, >> >> > > > > I do have a question though. >> >> > > > > What is the rationale for the restriction mentioned under "1)". Do >> >> you >> >> > > need >> >> > > > > this for the general case (to make IndexedRelationshipExpander >> >> work >> >> > > > > correctly), or do you need it for your own application to throw >> >> that >> >> > > > > exception? If the latter is the case, I think it would be >> >> important to >> >> > > tease >> >> > > > > out the general case and offer this new behaviour as an option. >> >> > > > > A unique key for the index is a good idea anyway and can be added >> >> to >> >> > > > > SortedTree. Generate a UUID and store it in two long properties. >> >> That >> >> > > way >> >> > > > > the two values will always be read in the first fetch of the >> >> underlying >> >> > > > > PropertyContainer. A getId method on the TreeNodes can then return >> >> a >> >> > > String >> >> > > > > representation of of the two long values. >> >> > > > > IndexRelationships are a relatively new development, so I think >> >> you are >> >> > > one >> >> > > > > of the first to actually try it out. Personally I have chosen to >> >> > > directly >> >> > > > > work with SortedTree, because I am working within the framework of >> >> a >> >> > > wrapper >> >> > > > > API, so I can integrate the functionality behind the regular >> >> > > > > createRelationshipTo and getRelationships methods. >> >> > > > > I don't think API changes will be an issue at the moment. >> >> > > > > Niels >> >> > > > > > Date: Thu, 8 Sep 2011 10:22:11 +1200 >> >> > > > > > From: bryc...@gmail.com >> >> > > > > > To: user@lists.neo4j.org >> >> > > > > > Subject: [Neo4j] Issues with IndexedRelationship >> >> > > > > > >> >> > > > > > Hi, >> >> > > > > > >> >> > > > > > As I mentioned a while ago I am looking at using >> >> > > IndexedRelationship's >> >> > > > > > within my application. The major thing that was missing for me >> >> to be >> >> > > > > able >> >> > > > > > to do this was IndexedRelationshipExpander being able to provide >> >> all >> >> > > the >> >> > > > > > relationships from the leaf end of indexed relationships through >> >> the >> >> > > the >> >> > > > > > root end. So I have been working on getting that support in >> >> there. >> >> > > > > > >> >> > > > > > However in writing this I have discovered a number of other >> >> issues >> >> > > that I >> >> > > > > > have also fixed, and at least one I am still working on. Since >> >> I was >> >> > > > > right >> >> > > > > > into the extra support for expanding the relationships it is >> >> hard to >> >> > > > > break >> >> > > > > > out these fixes as a separate commit (which I think would be >> >> ideal), >> >> > > so >> >> > > > > it >> >> > > > > > will most likely all come in together hopefully later today (NZ >> >> > > time). >> >> > > > > > >> >> > > > > > Just letting everyone know in case someone else is doing >> >> development >> >> > > > > against >> >> > > > > > indexed relationships. >> >> > > > > > >> >> > > > > > Quick run down of the issues, note: N -- IR(X) --> {A,B} below >> >> means >> >> > > > > there >> >> > > > > > is a indexed relationship from N to A & B, of type X. >> >> > > > > > >> >> > > > > > 1) Exception thrown when more than one IR terminates at a given >> >> node, >> >> > > > > e.g.: >> >> > > > > > N1 -- IR(X) --> {A,B,C,D} >> >> > > > > > N2 -- IR(X) --> {A,X,Y,Z} >> >> > > > > > Will throw an exception when using the >> >> IndexedRelationshipExpander on >> >> > > > > either >> >> > > > > > N1, or N2. >> >> > > > > > >> >> > > > > > 2) Start / End nodes are transposed when the IR has an direction >> >> of >> >> > > > > > incoming, i.e. the IR is created against N but across a set of >> >> > > incoming >> >> > > > > > relationships: >> >> > > > > > N <-- IR(Y) -- {A,B,C} >> >> > > > > > Will return 3 relationships N --> A, N --> B, N --> C. >> >> > > > > > >> >> > > > > > I have written tests for each of these, as well as a couple of >> >> other >> >> > > > > tests. >> >> > > > > > >> >> > > > > > Still completing (1) and have a little question about this. In >> >> order >> >> > > to >> >> > > > > fix >> >> > > > > > this I may need to introduce a unique ID stored against the IR >> >> both >> >> > > at >> >> > > > > the >> >> > > > > > root and at the leaves. Currently the relationship type is used >> >> to >> >> > > name >> >> > > > > the >> >> > > > > > IR at both root and leaves, but in the case above that means you >> >> > > can't >> >> > > > > tell >> >> > > > > > from node A which KEY_VALUE relationship belongs to which IR >> >> tree >> >> > > without >> >> > > > > > traversing the tree. >> >> > > > > > >> >> > > > > > So the question is adding this ID would mean that anyone who is >> >> > > already >> >> > > > > > using this wont have the ID, and therefore without care will be >> >> data >> >> > > > > > incompatible with the updated code. This could be managed via a >> >> > > check >> >> > > > > for >> >> > > > > > the ID when accessing the tree and if it isn't there doing a >> >> walk >> >> > > over >> >> > > > > the >> >> > > > > > tree to populate all the places where it is required. >> >> > > > > > >> >> > > > > > In general in developing against this code where do we sit on >> >> data >> >> > > > > > compatibility and API compatibility? >> >> > > > > > >> >> > > > > > Cheers >> >> > > > > > Bryce >> >> > > > > > _______________________________________________ >> >> > > > > > Neo4j mailing list >> >> > > > > > User@lists.neo4j.org >> >> > > > > > https://lists.neo4j.org/mailman/listinfo/user >> >> > > > > >> >> > > > > _______________________________________________ >> >> > > > > Neo4j mailing list >> >> > > > > User@lists.neo4j.org >> >> > > > > https://lists.neo4j.org/mailman/listinfo/user >> >> > > > > >> >> > > > _______________________________________________ >> >> > > > Neo4j mailing list >> >> > > > User@lists.neo4j.org >> >> > > > https://lists.neo4j.org/mailman/listinfo/user >> >> > > >> >> > > _______________________________________________ >> >> > > Neo4j mailing list >> >> > > User@lists.neo4j.org >> >> > > https://lists.neo4j.org/mailman/listinfo/user >> >> > > >> >> > _______________________________________________ >> >> > Neo4j mailing list >> >> > User@lists.neo4j.org >> >> > https://lists.neo4j.org/mailman/listinfo/user >> >> >> >> _______________________________________________ >> >> Neo4j mailing list >> >> User@lists.neo4j.org >> >> https://lists.neo4j.org/mailman/listinfo/user >> >> >> > >> > >> _______________________________________________ >> Neo4j mailing list >> User@lists.neo4j.org >> https://lists.neo4j.org/mailman/listinfo/user > > _______________________________________________ > Neo4j mailing list > User@lists.neo4j.org > https://lists.neo4j.org/mailman/listinfo/user > _______________________________________________ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user