Re: [Neo4j] Issues with IndexedRelationship

Bryce Wed, 07 Sep 2011 20:42:05 -0700

Another thought if there is going to be a larger refactor of the code is
whether the indexing mechanism should be broken out as a strategy for the
IndexedRelationship.  At present it is tied to SortedTree, but if an
interface was extracted out that had addNode, removeNode, iterator, and
isUniqueIndex then other indexing implementations could be used in certain
cases.


The particular other implementation I am currently thinking of that could be
of use to me would be a paged linked list.  So that would have a linked list
of pages, each with min < x < max KEY_VALUE (or equivalent) relationships.
 I think that could work quite well for the situation where the index is
descending date ordered, and generally just appended at the most recent end,
and results are retrieved in a paged manner generally from near the most
recent.

But more to the point there could be any number of implementations that
would be good for given different situations.

That does bring up a question though, there was some discussion a while ago
about some functionality along the lines of IndexedRelationship being pulled
into the core, so is that overkill for now if there is going to be another
core offering later?

On Thu, Sep 8, 2011 at 2:38 PM, Niels Hoogeveen
<pd_aficion...@hotmail.com>wrote:

>
> I think we don't have to worry about backwards compatibility much yet.
> There has not been a formal release of the component, so if there are people
> using the software, they will accept that they are bleeding edgers.
> Indeed addNode should return the KEY_VALUE relationship and I think we
> should change the signature of SortedTree to turn it into
> Iterable<Relationship>. No need to maintain a Node iterator, the node is
> always one getEndNode away.
> Niels
>
> > Date: Thu, 8 Sep 2011 14:17:59 +1200
> > From: bryc...@gmail.com
> > To: user@lists.neo4j.org
> > Subject: Re: [Neo4j] Issues with IndexedRelationship
> >
> > Will have to experiment with changing my id's to be stored as longs, it
> does
> > make perfect sense really that it would be better.  Thanks for the hint.
> >
> > In regards to SortedTree returning the KEY_VALUE relationship instead of
> the
> > end Node, I had thought of that too, and it would definitely help.  Could
> > end up being a significant change to SortedTree though, e.g.:
> >   sortedTree.addNode( node );
> > Would need to return the KEY_VALUE relationship instead of a boolean.
>  Which
> > not knowing where else SortedTree is used could be a large change?
> >
> > Maybe SortedTree would have two iterator's available a key_value
> > relationship iterator, and a node iterator.  Having a quick look at it
> now
> > it seems that it could work ok that way without introducing much code
> > duplication.
> >
> > On Thu, Sep 8, 2011 at 12:46 PM, Niels Hoogeveen
> > <pd_aficion...@hotmail.com>wrote:
> >
> > >
> > > Two longs is certainly cheaper than a string. Two longs take 128 bit
> and
> > > are stored in the main record of the PropertyContainer, while a String
> would
> > > require a 64 bit "pointer" in the main record of the PropertyContainer,
> and
> > > an additional read in the String store where the string representation
> will
> > > take up 256 bits. So both memory-wise, as perfomance wise, it is better
> to
> > > store a UUID as two long values.
> > >
> > >
> > > The main issue is something that needs a deeper fix than adding ID's.
> > > SortedTree now returns Nodes when traversing the tree. We should
> however
> > > return the KEY_VALUE Relationship to the indexed Node. Then
> > > IndexedRelationship.DirectRelationship can be created with that
> relationship
> > > as an argument. We get the Direction and the RelationshipType for free.
> > > Niels
> > >
> > > > Date: Thu, 8 Sep 2011 11:36:11 +1200
> > > > From: bryc...@gmail.com
> > > > To: user@lists.neo4j.org
> > > > Subject: Re: [Neo4j] Issues with IndexedRelationship
> > > >
> > > > Hi Niels,
> > > >
> > > > Sorry I didn't quite write the bit about (1) clearly enough.  The
> problem
> > > is
> > > > that it presently throws an Exception where it shouldn't.
> > > >
> > > > This stems from IndexedRelationship.DirectRelationship:
> > > > this.endRelationship = endNode.getSingleRelationship(
> > > > SortedTree.RelTypes.KEY_VALUE, Direction.INCOMING );
> > > >
> > > > So if the end node has more than one incoming KEY_VALUE relationship
> a
> > > more
> > > > than one relationship exception is thrown.
> > > >
> > > > Instead of the getSingleRelationship I was planning on iterating over
> the
> > > > relationships and matching the UUID stored at the root end of the IR
> with
> > > > one of the KEY_VALUE relationships (which is why using a unique id is
> > > > necessary rather than the relationship type).  Note: there will
> actually
> > > > still be an issue if the same IR has multiple relationships to the
> same
> > > leaf
> > > > node - still thinking about that might need .
> > > >
> > > > Is storing the UUID as two longs much quicker than storing it as a
> > > string?
> > > >  Curious about this since in my current model I have all the domain
> > > objects
> > > > with UUID's, and these are all stored as strings.  If it was going to
> > > help
> > > > with either memory or performance then I would be keen to migrate
> this to
> > > > two longs.
> > > >
> > > > Cheers
> > > > Bryce
> > > >
> > > > On Thu, Sep 8, 2011 at 11:07 AM, Niels Hoogeveen
> > > > <pd_aficion...@hotmail.com>wrote:
> > > >
> > > > >
> > > > > Great work Bryce,
> > > > > I do have a question though.
> > > > > What is the rationale for the restriction mentioned under "1)". Do
> you
> > > need
> > > > > this for the general case (to make IndexedRelationshipExpander work
> > > > > correctly), or do you need it for your own application to throw
> that
> > > > > exception? If the latter is the case, I think it would be important
> to
> > > tease
> > > > > out the general case and offer this new behaviour as an option.
> > > > > A unique key for the index is a good idea anyway and can be added
> to
> > > > > SortedTree. Generate a UUID and store it in two long properties.
> That
> > > way
> > > > > the two values will always be read in the first fetch of the
> underlying
> > > > > PropertyContainer. A getId method on the TreeNodes can then return
> a
> > > String
> > > > > representation of of the two long values.
> > > > > IndexRelationships are a relatively new development, so I think you
> are
> > > one
> > > > > of the first to actually try it out. Personally I have chosen to
> > > directly
> > > > > work with SortedTree, because I am working within the framework of
> a
> > > wrapper
> > > > > API, so I can integrate the functionality behind the regular
> > > > > createRelationshipTo and getRelationships methods.
> > > > > I don't think API changes will be an issue at the moment.
> > > > > Niels
> > > > > > Date: Thu, 8 Sep 2011 10:22:11 +1200
> > > > > > From: bryc...@gmail.com
> > > > > > To: user@lists.neo4j.org
> > > > > > Subject: [Neo4j] Issues with IndexedRelationship
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > As I mentioned a while ago I am looking at using
> > > IndexedRelationship's
> > > > > > within my application.  The major thing that was missing for me
> to be
> > > > > able
> > > > > > to do this was IndexedRelationshipExpander being able to provide
> all
> > > the
> > > > > > relationships from the leaf end of indexed relationships through
> the
> > > the
> > > > > > root end.  So I have been working on getting that support in
> there.
> > > > > >
> > > > > > However in writing this I have discovered a number of other
> issues
> > > that I
> > > > > > have also fixed, and at least one I am still working on.  Since I
> was
> > > > > right
> > > > > > into the extra support for expanding the relationships it is hard
> to
> > > > > break
> > > > > > out these fixes as a separate commit (which I think would be
> ideal),
> > > so
> > > > > it
> > > > > > will most likely all come in together hopefully later today (NZ
> > > time).
> > > > > >
> > > > > > Just letting everyone know in case someone else is doing
> development
> > > > > against
> > > > > > indexed relationships.
> > > > > >
> > > > > > Quick run down of the issues, note: N -- IR(X) --> {A,B} below
> means
> > > > > there
> > > > > > is a indexed relationship from N to A & B, of type X.
> > > > > >
> > > > > > 1) Exception thrown when more than one IR terminates at a given
> node,
> > > > > e.g.:
> > > > > > N1 -- IR(X) --> {A,B,C,D}
> > > > > > N2 -- IR(X) --> {A,X,Y,Z}
> > > > > > Will throw an exception when using the
> IndexedRelationshipExpander on
> > > > > either
> > > > > > N1, or N2.
> > > > > >
> > > > > > 2) Start / End nodes are transposed when the IR has an direction
> of
> > > > > > incoming, i.e. the IR is created against N but across a set of
> > > incoming
> > > > > > relationships:
> > > > > > N <-- IR(Y) -- {A,B,C}
> > > > > > Will return 3 relationships N --> A, N --> B, N --> C.
> > > > > >
> > > > > > I have written tests for each of these, as well as a couple of
> other
> > > > > tests.
> > > > > >
> > > > > > Still completing (1) and have a little question about this.  In
> order
> > > to
> > > > > fix
> > > > > > this I may need to introduce a unique ID stored against the IR
> both
> > > at
> > > > > the
> > > > > > root and at the leaves.  Currently the relationship type is used
> to
> > > name
> > > > > the
> > > > > > IR at both root and leaves, but in the case above that means you
> > > can't
> > > > > tell
> > > > > > from node A which KEY_VALUE relationship belongs to which IR tree
> > > without
> > > > > > traversing the tree.
> > > > > >
> > > > > > So the question is adding this ID would mean that anyone who is
> > > already
> > > > > > using this wont have the ID, and therefore without care will be
> data
> > > > > > incompatible with the updated code.  This could be managed via a
> > > check
> > > > > for
> > > > > > the ID when accessing the tree and if it isn't there doing a walk
> > > over
> > > > > the
> > > > > > tree to populate all the places where it is required.
> > > > > >
> > > > > > In general in developing against this code where do we sit on
> data
> > > > > > compatibility and API compatibility?
> > > > > >
> > > > > > Cheers
> > > > > > Bryce
> > > > > > _______________________________________________
> > > > > > Neo4j mailing list
> > > > > > User@lists.neo4j.org
> > > > > > https://lists.neo4j.org/mailman/listinfo/user
> > > > >
> > > > > _______________________________________________
> > > > > Neo4j mailing list
> > > > > User@lists.neo4j.org
> > > > > https://lists.neo4j.org/mailman/listinfo/user
> > > > >
> > > > _______________________________________________
> > > > Neo4j mailing list
> > > > User@lists.neo4j.org
> > > > https://lists.neo4j.org/mailman/listinfo/user
> > >
> > > _______________________________________________
> > > Neo4j mailing list
> > > User@lists.neo4j.org
> > > https://lists.neo4j.org/mailman/listinfo/user
> > >
> > _______________________________________________
> > Neo4j mailing list
> > User@lists.neo4j.org
> > https://lists.neo4j.org/mailman/listinfo/user
>
> _______________________________________________
> Neo4j mailing list
> User@lists.neo4j.org
> https://lists.neo4j.org/mailman/listinfo/user
>
_______________________________________________
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user

Re: [Neo4j] Issues with IndexedRelationship

Reply via email to