Yup, it's a pretty sane approach and somewhat along the lines of how I feel
it would be done. It's been said a long time that "this functionality will
be implemented some day" and it's just that a significant amount of time
have to be invested... maybe not for implementing it, but for discovering
all bugs and inconveniences to have it on par with production quality. And
that kind of time haven't been allocated yet.

I appreciate your thoughts and time on all this!

Best,
Mattias

2011/8/3 Niels Hoogeveen <pd_aficion...@hotmail.com>

>
> I would like to make a suggestion that would both address my feature
> request and increase performance of the database.
>
> Right now the NodeRecord (org.neo4j.kernel.impl.nioneo.store.NodeRecord)
> contains the ID of the first Relationship, while the RelationshipRecord
> contain the ID's of the previous and next relationship for both sides of the
> relationship.
>
> My suggestion is as follows:
>
> Create a new store:
>
> noderelationshiptypestore.db
>
> The layout of this store is given by the NodeRelationshipTypeRecord:
>
> id
> previousrelationshiptype
> nextrelationshiptype
> firstrelationship
>
> The NodeRecord would now need to point to the first outgoing
> NodeRelationshipType and to the first incoming NodeRelationshipType instead
> of to the first Relationship.
>
> On insert of a Relationship, one side of the relationship will update the
> store from the outgoing side, the other side will update the store for the
> incoming side.
>
> I will list the steps to take here for the outgoing side (the incoming side
> is almost identical).
>
> From the NodeRecord getFirstNodeRelationType (outgoing).
>
> Keep following NextRelationshipType until the desired record is found. If
> no record exists, create one, make the current FirstNodeRelationshipType in
> the NodeRecord (if it exists) the NextRelationshipType of the created
> NodeRelationshipType (and make the created one the previous of the current
> one) and make the created NodeRelationshipType the FirstNodeRelationshipType
> in the NodeRecord.
>
> In other words: find the NodeRelationshipTypeRecord in the linked list. If
> none exists, create a NodeRelationshipTypeRecord, prepend it to the existing
> list and change the entry point in the NodeRecord.
>
> We now have found the requested NodeRelationshipTypeRecord.
>
> From NodeRelationshipTypeRecord getFirstRelationship.
>
> Create a new RelationshipRecord and make it the FirstRelationship in the
> NodeRelationshipTypeRecord.
>
> Make the old first RelationshipRecord (if it exists) the nextRelationship
> of the new first RelationshipRecord and make the new first
> RelationshipRecord the previous of the old first RelationshipRecord.
>
> In other words: prepend a new RelationshipRecord to the existing list of
> Relationships and change the entry point in the NodeRelationshipTypeRecord.
>
> Do the same for the incoming side (except for the creation of the
> RelationshipRecord, we only need one of those).
>
> Instead of a linked list of Relationships per Node we now have two linked
> lists of RelationshipTypes per Node (one incoming, one outgoing), with a
> linked list of Relationships per NodeRelationshipType.
>
> With this approach only those Relationships need to be read that match the
> RelationshipType and Direction given.
>
> Worst case this approach leads to an extra read operation per
> RelationshipType:
>
> Worst case example 1: Retrieve all Relationships, regardless of
> Relationship or Direction. Here we have extra reads for all
> NodeRelationshipType records. If the number of Relationships per
> NodeRelationshipType is equal to 1, we have twice as many reads.
>
> Worst case example 2: If we retrieve Relationships by RelationshipType by
> Direction, we have at most one extra read to fetch the Relationship, and
> only in the case where the number of Relationships per NodeRelationshipType
> is equal to 1 and the requested RelationshipType happens to be the last in
> the list.
>
> On average partitioning Relationships the way suggested here improves
> performance, while the worst case situations only take a constant time hit
> of no more than a factor 2. On top of that the absolute worst case
> situation, where the performance is reduced by a factor 2 is likely to be
> uncommon in practical Neo4j applications.
>
> Certainly there will be situations where users want to iterate over all
> relationships of one particular node, which will still be very fast. It is
> much less likely that users would frequently want to iterate over all
> relationships of a huge number of nodes (where all those relationships
> happen to be functional as well). Most traversals will be much more
> discriminate and be based on certain particular RelationshipType where the
> performance penalty is at most one read operation per RelationshipType, but
> on average the number of reads will be lower, certainly if there is more
> than one Relationship associated to the NodeRelationshipType.The potential
> performance increase can be enormous, especially if we want to fetch a
> relationship of a RelationshipType that has few associated Relationships,
> while some other NodeRelationshipType has a huge number of Reltionships.Best
> case scenario: A node has one Relationship of type A and 1 million
> Relationships of type B and we want to fetch t
>  hat one Relationship of type A. In the current situation we have to do
> 1,000,001 reads to fetch that one Relationship. With this proposal this
> would be reduced to at most 3 reads for that particular request.
> Altogether this proposal has advantages in most practical situations. In
> some corner cases the performance may decrease by at most a factor 2, while
> the performance may increase by orders of magnitude in some quite common use
> cases.
> On top of that, we can also present the meta information I requested,
> because we can simply iterate over the NodeRelationshipType list and return
> the entries to the user.
>
> Finally, this proposal makes it possible to guarantee functional,
> surjective and one-to-one Relationships. Due to the partitioning we will
> know if there already is a relationship of a certain type. If a relationship
> is stated to be functional, surjective, or one-to-one, we can raise an
> exception when a second relationship is about to be created for that
> particular NodeRelationshipType.
>
> Niels
>
>
>
> > From: pd_aficion...@hotmail.com
> > To: user@lists.neo4j.org
> > Date: Tue, 2 Aug 2011 23:03:41 +0200
> > Subject: Re: [Neo4j] Node#getRelationshipTypes
> >
> >
> > Building an API on top of Neo4j of course pushes the standard API to its
> limits. So for that matter it is already a good exercise.
> > Any chance this feature request will find its way into 1.5?
> > Niels
> >
> > > Date: Tue, 2 Aug 2011 22:33:03 +0200
> > > From: matt...@neotechnology.com
> > > To: user@lists.neo4j.org
> > > Subject: Re: [Neo4j] Node#getRelationshipTypes
> > >
> > > Those methods will of course be more efficient if implemented in the
> kernel
> > > compared to iterating through all relationships if the whole
> relationship
> > > chain have already been loaded for that node, otherwise it will require
> a
> > > full iteration (or at least making sure the whole chain have been
> loaded).
> > > I've never found a use case for it myself and this is the first I've
> heard.
> > >
> > > 2011/8/1 Niels Hoogeveen <pd_aficion...@hotmail.com>
> > >
> > > >
> > > > I have two specific use cases for these methods:
> > > > I'd like to present a node with the property types (names) it has
> content
> > > > for and with the relationship types it has relationships for, while
> loading
> > > > those properties/relationships on demand (ie. click here to see
> details).
> > > > This can be done for properties: there is a getPropertyKeys() method,
> but
> > > > there is no getRelationshipTypes() method.
> > > > The other use case has to do with the Enhanced API. There I want to
> have
> > > > pluggable relationships and properties. With respect to relationships
> there
> > > > are already three implementations: the regular Relationship,
> SortedRelations
> > > > (which use an in-graph Btree for storage) and HyperRelationships
> which allow
> > > > n-ary relationships.
> > > > Every Element in Enhanced API has a getRelationships() method, much
> like
> > > > the getRelationships() method in Node, which should return every
> > > > relationship attached to an Element, irrespective of its
> implementation.
> > > > Right now the Element implementation has to perform the logic to
> distinguish
> > > > which relationship is used for what implementation (under the hood it
> all
> > > > works using normal Relationships). It would be much more elegant to
> iterate
> > > > over the RelationshipTypes and dispatch the getRelationships() method
> to the
> > > > appropriate RelationshipType implementations. That way the logic for
> > > > SortedRelationships, HyperRelationships remains in their associated
> classes
> > > > and is not spread around the implementation.
> > > >
> > > > Niels
> > > > > From: michael.hun...@neotechnology.com
> > > > > Date: Sun, 31 Jul 2011 23:20:50 +0200
> > > > > To: user@lists.neo4j.org
> > > > > Subject: Re: [Neo4j] Node#getRelationshipTypes
> > > > >
> > > > > Imho it would have to iterate as well.
> > > > >
> > > > > As the type is stored with the relationship record and so can only
> be
> > > > accessed after having read it.
> > > > >
> > > > > It might be to have some minimal performance improvements that
> > > > relationships would not have to be fully loaded, nor put into the
> cache for
> > > > that. But this is always a question of the use-case. What will be
> done next
> > > > with those rel-types.
> > > > >
> > > > > What was the use-case for this operation again?
> > > > >
> > > > > Cheers
> > > > >
> > > > > Michael
> > > > >
> > > > > Am 31.07.2011 um 18:59 schrieb Niels Hoogeveen:
> > > > >
> > > > > >
> > > > > > Good point.
> > > > > > It could for all practical purposes even be
> Iterable<RelationshipType>
> > > > so they can be lazily fetched, as long as the underlying
> implementation
> > > > makes certain that any iteration of the RelationshipTypes forms a set
> (no
> > > > duplicates).
> > > > > > There is no need to have RelationshipTypes in any particular
> order, and
> > > > if that is needed in the application, they can usually be sorted
> locally
> > > > since Nodes will generally have associated Relationships of only a
> handful
> > > > of RelationshipTypes.
> > > > > >
> > > > > > That said, the more important question is, if the Neo4j store can
> > > > produce this meta-information. For sparsely connected nodes, it is
> possible
> > > > to iterate over the relationships and return the set of
> RelationshipTypes,
> > > > but this is not a proper solution when nodes are densely connected.
> So there
> > > > is no general solution for this question yet.
> > > > > > Niels
> > > > > >
> > > > > >> From: j...@neotechnology.com
> > > > > >> Date: Sun, 31 Jul 2011 17:29:29 +0100
> > > > > >> To: user@lists.neo4j.org
> > > > > >> Subject: Re: [Neo4j] Node#getRelationshipTypes
> > > > > >>
> > > > > >> Hi Niels,
> > > > > >>
> > > > > >> Ignoring the operational use for getting relationship types, I
> do
> > > > think these should be generalised from:
> > > > > >>
> > > > > >>> RelationshipType[] getRelationshipTypes();
> > > > > >>> RelationshipType[] getRelationshipTypes(Direction);
> > > > > >>
> > > > > >> to:
> > > > > >>
> > > > > >> Set<RelationshipType> getRelationshipTypes();
> > > > > >> Set<RelationshipType> getgetRelationshipTypes(Direction);
> > > > > >>
> > > > > >> Unless you need the ordering and you think the overhead of
> creating a
> > > > some kind of Set is too onerous from a performance point of view.
> > > > > >>
> > > > > >> Jim
> > > > > >> _______________________________________________
> > > > > >> Neo4j mailing list
> > > > > >> User@lists.neo4j.org
> > > > > >> https://lists.neo4j.org/mailman/listinfo/user
> > > > > >
> > > > > > _______________________________________________
> > > > > > Neo4j mailing list
> > > > > > User@lists.neo4j.org
> > > > > > https://lists.neo4j.org/mailman/listinfo/user
> > > > >
> > > > > _______________________________________________
> > > > > Neo4j mailing list
> > > > > User@lists.neo4j.org
> > > > > https://lists.neo4j.org/mailman/listinfo/user
> > > >
> > > > _______________________________________________
> > > > Neo4j mailing list
> > > > User@lists.neo4j.org
> > > > https://lists.neo4j.org/mailman/listinfo/user
> > > >
> > >
> > >
> > >
> > > --
> > > Mattias Persson, [matt...@neotechnology.com]
> > > Hacker, Neo Technology
> > > www.neotechnology.com
> > > _______________________________________________
> > > Neo4j mailing list
> > > User@lists.neo4j.org
> > > https://lists.neo4j.org/mailman/listinfo/user
> >
> > _______________________________________________
> > Neo4j mailing list
> > User@lists.neo4j.org
> > https://lists.neo4j.org/mailman/listinfo/user
>
> _______________________________________________
> Neo4j mailing list
> User@lists.neo4j.org
> https://lists.neo4j.org/mailman/listinfo/user
>



-- 
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com
_______________________________________________
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user

Reply via email to