I can probably find the time for that. It would be fun working on these ideas in collaboration. I don't mind producing my usual brain-dumps and writing some of the code, but quality will certainly improve when it is more than just me paying attention to this.

Niels
> From: peter.neuba...@neotechnology.com
> Date: Mon, 8 Aug 2011 11:50:35 +0200
> To: user@lists.neo4j.org
> Subject: Re: [Neo4j] Enhanced API rewrite
>
> Very interesting thoughts!
>
> I would love to have a bootcamp and explore a spike on how this would
> work out in practice. Got anything to do this autumn? ;)
>
> Cheers,
>
> /peter neubauer
>
> GTalk: neubauer.peter
> Skype peter.neubauer
> Phone +46 704 106975
> LinkedIn http://www.linkedin.com/in/neubauer
> Twitter http://twitter.com/peterneubauer
>
> http://www.neo4j.org - Your high performance graph database.
> http://startupbootcamp.org/ - Öresund - Innovation happens HERE.
> http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party.
>
>
>
> On Sun, Aug 7, 2011 at 4:30 PM, Niels Hoogeveen
> <pd_aficion...@hotmail.com> wrote:
> >
> > Hi Peter,
> >
> > Thanks for showing an interest.
> >
> > A Property is indeed a unary edge in the Enhanced API and therefore
> > (potentially) backed by a Node, but that Node doesn't contain the value.
> >
> > All property values are still stored the way they are stored in the
> > standard API. If, however, someone decides to add a Property to a Property
> > or create an Edge containing that Property, a Node will be created to store
> > those properties and connect those Edges to.
> >
> > When the associated Node of a Property is created, the ID of that Node will
> > be stored in the PropertyContainer of that property.
> >
> > Example:
> >
> > Suppose we have a property on a "Person" Vertex that denotes a personal
> > identity number, and the user of the application wants to annually check
> > that identity number against some other database and state when it was last
> > verified and who verified it.
> >
> > A Vertex (backed by a Node) for a particular Person is created and the
> > property is set (in that Node's PropertyContainer), just like it would be
> > the case in the standard API.
> >
> > When the verification is done, an additional property is created on the
> > PropertyContainer of that Person with the name
> > org.neo4j.collections.graphdb.[propertyname].node_id
> >
> > This property contains the ID of the node associated with the property. On
> > that node the verification date will be set and a BinaryEdge (in principle
> > nothing but a classic Relationship) will be created to the "Person" Vertex
> > of the one who verified the personal identity code.
> >
> > It is certainly true that everything being a Vertex makes the Node
> > implementation more important than ever before, but it goes even further:
> > apart from a standard Vertex and the various VertexTypes, almost everything
> > is an Edge. So I would say the Relationship implementation is becoming
> > eminently important.
> >
> > There are certainly several tweaks to the storage layer I would love to see
> > incorporated, mostly to hide the implementation from the user and to make
> > sure that the maintenance of IDs takes place in core and not in a layer on
> > top of core.
> >
> > In fact all of Enhanced API could much better be maintained in core,
> > something that can actually quite easily be implemented. One of my
> > "ulterior motives" with the development of Enhanced API is to tease out the
> > technical requirements to push this functionality into core (whether Neo
> > Tech decides to do so is another question, of course).
> >
> > Since the Neo4j database consists mostly of records and linked lists, the
> > technical requirements to push things into core are mostly a question of
> > adding entry-points to linked lists in some records and partitioning some
> > existing linked lists.
> >
> > I will write down those requirements in a separate post. This will include
> > support for N-ary edges, since that is actually not all that difficult to
> > implement and adds very little complexity to the database.
> >
> > Yes, traversals will become much more generalized in the Enhanced API,
> > especially when we make them composable. In fact composable traversal
> > descriptions can easily be seen as a query language giving access to all
> > parts of the database.
> >
> > Niels
> >
> >> From: peter.neuba...@neotechnology.com
> >> Date: Sun, 7 Aug 2011 09:10:02 +0200
> >> To: user@lists.neo4j.org
> >> Subject: Re: [Neo4j] Enhanced API rewrite
> >>
> >> Niels,
> >> this sounds very interesting. Given the role of properties being unary
> >> edges, that would mean that any classic Neo4j property would now be a
> >> Node with one Property in the new Vertex sense?
> >>
> >> Having Vertices for EVERYTHING will of course make the
> >> node-implementation much more important than anything else, since
> >> every element is backed by a node, possibly with some property. I
> >> wonder how this would reflect in the storage layer that might need to
> >> be tweaked.
> >>
> >> Also, as you point out, traversals will become quite different with
> >> this API, but let's see what the weekend brings ;)
> >>
> >> Cheers,
> >>
> >> /peter neubauer
> >>
> >>
> >>
> >> On Sat, Aug 6, 2011 at 2:51 AM, Niels Hoogeveen
> >> <pd_aficion...@hotmail.com> wrote:
> >> >
> >> > Today I pushed a major rewrite of the Enhanced API. See:
> >> > https://github.com/peterneubauer/graph-collections/tree/master/src/main/java/org/neo4j/collections/graphdb
> >> >
> >> > Originally the Enhanced API was a drop-in replacement of the standard
> >> > Neo4j API.
> >> > This resulted in lots of wrapper classes that needed to be maintained.
> >> >
> >> > The rewrite of Enhanced API is no longer a drop-in replacement and
> >> > contains no interface/class names that can be found in the standard API.
> >> >
> >> > Enhanced API no longer speaks of Nodes but of Vertices and doesn't speak
> >> > of Relationships but of Edges. This helps to prevent name clashes at the
> >> > expense of somewhat less recognizable names (Relationship is after all a
> >> > more common word than Edge).
> >> >
> >> > This rewrite is not merely a renaming of classes and interfaces, but is
> >> > for the most part a complete rewrite and also a rethinking of the API on
> >> > my part.
> >> >
> >> > Enhanced API consists of two basic elements: Vertex and EdgeRole. Most
> >> > elements are a subclass of Vertex, though there are some specialized
> >> > versions of EdgeRole.
> >> >
> >> > Let me start with an example:
> >> >
> >> > Suppose we have two vertices denoting the persons Tom and Paula, and we
> >> > want to state that Tom is the father of Paula.
> >> >
> >> > For standard Neo4j we tend to write such a fact as:
> >> >
> >> > Tom --Father--> Paula
> >> >
> >> > For Enhanced API we can conceptually write this fact as follows:
> >> >
> >> >         --StartRole--Tom
> >> > Father
> >> >         --EndRole--Paula
> >> >
> >> > This should be read as follows: We have two Vertices: Tom and Paula and
> >> > we have a BinaryEdge (similar to a Relationship in the standard API) of
> >> > type "Father", where Tom has the StartRole for that edge and Paula has
> >> > the EndRole for that edge.
> >> >
> >> > So instead of a directed graph, we conceptually have an undirected
> >> > bipartite graph.
> >> >
> >> > For binary edges (edges between two vertices), this is mostly
> >> > conceptually the case, because the API will simply allow you to write:
> >> > tom.createEdgeTo(paula, FATHER) (similar to
> >> > tom.createRelationshipTo(paula, FATHER) as we would have in the standard
> >> > API).
> >> >
> >> > It is also possible to fetch the start vertex of the binary relationship
> >> > with the method: edge.getStartVertex() (similar to
> >> > relationship.getStartNode()), although it is also possible to treat the
> >> > binary edge as a generic edge and fetch that Vertex as:
> >> > edge.getElement(db.getStartRole()).
> >> >
> >> > BinaryEdges are a special case and have special methods which cover the
> >> > same functionality as can be found in the standard Neo4j API.
> >> >
> >> > In general, we can say that Vertices are connected to Edges by means of
> >> > EdgeRoles. In the binary case there are two predefined EdgeRoles:
> >> > StartRole and EndRole.
> >> >
> >> > Before we get deeper into the general case of n-ary edges, let's first
> >> > look at another special case: Properties.
> >> >
> >> > Properties can be thought of as unary edges, an edge that connects to
> >> > only one Vertex (as opposed to two in the binary case).
> >> >
> >> > Suppose we want to state that Tom is 49 years old, we can write that as:
> >> >
> >> > age(49)--PropertyRole--Tom
> >> >
> >> > We have an edge of type "age" that is connected to the vertex Tom in the
> >> > role of a property.
> >> >
> >> > Again this is mostly conceptually true, because there are lots of
> >> > methods in Enhanced API that are very similar to the ones found in the
> >> > standard API: getProperty, hasProperty, setProperty. Alternatively, we
> >> > can call methods on the property itself; after all, the age property
> >> > connected to the Vertex "Tom" is an object in its own right. More
> >> > precisely it is a Property and with that it is a UnaryEdge, which is an
> >> > Edge, which is a Vertex.
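A minimal sketch of the two special cases quoted above, written as a tiny in-memory model. Every class and method body here is an assumption made for illustration; only the names (Vertex, Edge, BinaryEdge, Property, createEdgeTo, getStartVertex, getElement, getVertex) follow the description, and none of this is the actual org.neo4j.collections.graphdb code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical in-memory model; it only mirrors the naming used in the thread.
final class UnifiedEdgeSketch {

    // Stand-ins for the predefined StartRole, EndRole and PropertyRole.
    enum Role { START, END, PROPERTY }

    static class Vertex {
        final String name;
        Vertex(String name) { this.name = name; }

        // Binary case: shorthand that fills the START and END roles.
        BinaryEdge createEdgeTo(Vertex end, String type) {
            return new BinaryEdge(type, this, end);
        }

        // Unary case: a property is an edge with a single PROPERTY role.
        Property setProperty(String key, Object value) {
            return new Property(key, value, this);
        }
    }

    // An Edge is itself a Vertex that connects other vertices through roles.
    static class Edge extends Vertex {
        private final Map<Role, Vertex> elements = new LinkedHashMap<>();
        Edge(String type) { super(type); }
        void connect(Role role, Vertex vertex) { elements.put(role, vertex); }
        Vertex getElement(Role role) { return elements.get(role); }
    }

    static class BinaryEdge extends Edge {
        BinaryEdge(String type, Vertex start, Vertex end) {
            super(type);
            connect(Role.START, start);
            connect(Role.END, end);
        }
        Vertex getStartVertex() { return getElement(Role.START); }
        Vertex getEndVertex() { return getElement(Role.END); }
    }

    static class Property extends Edge {
        final Object value;
        Property(String key, Object value, Vertex owner) {
            super(key);
            this.value = value;
            connect(Role.PROPERTY, owner);
        }
        Vertex getVertex() { return getElement(Role.PROPERTY); }
    }

    public static void main(String[] args) {
        Vertex tom = new Vertex("Tom");
        Vertex paula = new Vertex("Paula");
        BinaryEdge father = tom.createEdgeTo(paula, "FATHER");
        Property age = tom.setProperty("age", 49);
        // The special-case accessor and the generic one reach the same vertex.
        System.out.println(father.getStartVertex() == father.getElement(Role.START));
        System.out.println(age.getVertex().name + " has " + age.name + " " + age.value);
    }
}
```

Because Property and BinaryEdge inherit from Vertex in this sketch, a property can itself receive properties or edges, which is the point of the unification.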
> >> >
> >> > From the age property we can fetch the PropertyType, but we can also ask
> >> > for the Vertex it is connected to: getVertex(). Since a Property is an
> >> > Edge we can also fetch the connected vertex (Tom) as follows:
> >> > age.getElement(db.getPropertyRole()).
> >> >
> >> > So we have seen the two special cases: unary edges and binary edges,
> >> > which work very much the same as properties and Relationships in the
> >> > standard Neo4j API, though we have given it a conceptually different
> >> > perspective that unifies the two and fits it neatly into the general
> >> > case of N-ary edges.
> >> >
> >> > As said before, an Edge is a Vertex that connects other Vertices by
> >> > means of EdgeRoles. Since Edges are Vertices, they can have other Edges
> >> > connected to them. Or in standard API talk: relationships can be
> >> > connected to other relationships and they can have properties.
> >> >
> >> > The concept of EdgeRoles separates Edges from Vertices, so we will
> >> > effectively have a bipartite graph where Vertices can only connect to
> >> > Edges and Edges can only connect to Vertices. Given the fact that Edges
> >> > are also Vertices, Edges can be connected to Edges, but in such a case
> >> > it is unambiguous which plays the role of Edge and which plays the role
> >> > of Vertex in that connection.
> >> >
> >> > Let's look at an example of an N-ary edge:
> >> >
> >> > Suppose we want to state the fact that Tom gives Paula a Bicycle (no
> >> > golden helicopters in stock today). We can write that as follows:
> >> >
> >> >        --Giver--Tom
> >> > GIVES  --Recipient--Paula
> >> >        --Gift--Bicycle
> >> >
> >> > There is an EdgeType GIVES which defines three EdgeRoles: Giver,
> >> > Recipient and Gift, which connect Tom, Paula and Bicycle to the Edge.
> >> >
> >> > The edge is created by first creating three EdgeElement objects that
> >> > each contain a Role and the connected Vertex.
> >> > We can then make the call
> >> > db.createEdge(GIVES, edgeElements).
> >> >
> >> > An EdgeElement is what is connected to an Edge for a particular
> >> > EdgeRole (including that EdgeRole itself).
> >> >
> >> > An EdgeElement can contain more than one connected Vertex. We can for
> >> > example state: Tom and Dick give Paula a Bicycle.
> >> >
> >> > In Enhanced API notation:
> >> >
> >> >        --Giver--Tom, Dick
> >> > GIVES  --Recipient--Paula
> >> >        --Gift--Bicycle
> >> >
> >> > Or we may want to state: Tom, Dick and Harry give Paula and Josephine a
> >> > Bicycle and an Icecream.
> >> >
> >> > In Enhanced API notation:
> >> >
> >> >        --Giver--Tom, Dick, Harry
> >> > GIVES  --Recipient--Paula, Josephine
> >> >        --Gift--Bicycle, Icecream
> >> >
> >> > The API allows the user to fetch an EdgeElement by means of an EdgeRole
> >> > and iterate over the connected Vertices:
> >> >
> >> > for(EdgeElement givers: gives.getElements(Giver)){
> >> >   for(Vertex giver: givers.getVertices()){
> >> >     //do something with the giver Vertex
> >> >   }
> >> > }
> >> >
> >> > For those cases where an EdgeElement can contain only one Vertex, there
> >> > is a FunctionalEdgeElement, which can only be used in conjunction with
> >> > FunctionalEdgeRoles.
> >> >
> >> > StartRole, EndRole and PropertyRole are all FunctionalEdgeRoles, since
> >> > we can have only one start Vertex and one end Vertex per BinaryEdge
> >> > (just like there can only be one StartNode and one EndNode for a
> >> > Relationship in the standard API) and we can only have one Vertex
> >> > associated with a Property (just like a property cannot belong to two
> >> > different Nodes in the standard Neo4j API).
> >> >
> >> > The Enhanced API can be used in conjunction with standard Neo4j API. The
> >> > only replacement needed is that of the database instance.
> >> > The Enhanced API defines a DatabaseService interface, which extends the
> >> > standard GraphDatabaseService interface and adds several enhanced methods
> >> > for the creation and lookup of Vertices, Edges and several kinds of
> >> > VertexTypes.
> >> >
> >> > Now the big question is of course, what do we gain with this entire
> >> > apparatus?
> >> >
> >> > First of all, we have unification of the storage elements of Neo4j.
> >> > Everything that can be stored in Neo4j is a Vertex:
> >> >
> >> > Node is very much like a Vertex (with a slightly different interface
> >> > that has similar features to the standard Neo4j API, and more...)
> >> > Relationship is very much like BinaryEdge, which is an Edge, which is a
> >> > Vertex
> >> > RelationshipType is covered by BinaryEdgeType which is an EdgeType,
> >> > which is a VertexType, which is a Vertex
> >> > property name is wrapped as a PropertyType which is an EdgeType,
> >> > which is a VertexType, which is a Vertex.
> >> > property value is wrapped as a Property which is a UnaryEdge, which is an
> >> > Edge, which is a Vertex
> >> >
> >> > Having this unification, it is possible to write traversals to every
> >> > part of the Neo4j database. And that is the big boon of this unification.
> >> >
> >> > Every part of the database can be accessed with a traversal description.
> >> >
> >> > The standard Neo4j API only allows traversals to return Nodes given a
> >> > start Node. The Enhanced API allows traversals from any part of the
> >> > graph, whether it is a regular Vertex, an Edge or a Property (or a type
> >> > thereof), to any other part of the graph, no matter if it is a regular
> >> > Vertex, an Edge or a Property (or a type thereof).
> >> >
> >> > All that needs to be supplied are the EdgeTypes that need to be followed
> >> > in a traversal (and the regular evaluators that go with it).
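To make the GIVES example and the EdgeElement loop from the quoted message concrete, here is a rough sketch of an n-ary edge. It is a toy model under stated assumptions: the classes, the static createEdge helper and namesIn are hypothetical stand-ins, not the library's DatabaseService or its real signatures.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of an n-ary edge; names follow the thread, bodies are assumed.
final class NaryEdgeSketch {

    static final class Vertex {
        final String name;
        Vertex(String name) { this.name = name; }
    }

    static final class EdgeRole {
        final String name;
        EdgeRole(String name) { this.name = name; }
    }

    // What is connected to an edge for one role: the role plus its vertices.
    static final class EdgeElement {
        final EdgeRole role;
        private final List<Vertex> vertices;
        EdgeElement(EdgeRole role, Vertex... vertices) {
            this.role = role;
            this.vertices = Arrays.asList(vertices);
        }
        List<Vertex> getVertices() { return vertices; }
    }

    static final class Edge {
        final String type;
        private final List<EdgeElement> elements;
        Edge(String type, List<EdgeElement> elements) {
            this.type = type;
            this.elements = elements;
        }
        // Returns the elements registered under the given role.
        List<EdgeElement> getElements(EdgeRole role) {
            List<EdgeElement> matching = new ArrayList<>();
            for (EdgeElement element : elements) {
                if (element.role == role) {
                    matching.add(element);
                }
            }
            return matching;
        }
    }

    // Stand-in for the db.createEdge(GIVES, edgeElements) call in the thread.
    static Edge createEdge(String type, EdgeElement... elements) {
        return new Edge(type, Arrays.asList(elements));
    }

    // Same nested loop as in the quoted snippet, collecting vertex names.
    static List<String> namesIn(Edge edge, EdgeRole role) {
        List<String> names = new ArrayList<>();
        for (EdgeElement element : edge.getElements(role)) {
            for (Vertex vertex : element.getVertices()) {
                names.add(vertex.name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        EdgeRole giver = new EdgeRole("Giver");
        EdgeRole recipient = new EdgeRole("Recipient");
        EdgeRole gift = new EdgeRole("Gift");
        Edge gives = createEdge("GIVES",
                new EdgeElement(giver, new Vertex("Tom"), new Vertex("Dick")),
                new EdgeElement(recipient, new Vertex("Paula")),
                new EdgeElement(gift, new Vertex("Bicycle")));
        System.out.println(namesIn(gives, giver));
    }
}
```

A FunctionalEdgeElement would be the same idea restricted to exactly one vertex per role; the sketch leaves that constraint out for brevity.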
> >> >
> >> > Now the big downer to this all:
> >> >
> >> > I still have to write the traversal framework, which will actually
> >> > follow the standard Neo4j framework, but will certainly make traversals
> >> > composable.
> >> >
> >> > Every Vertex is not just a Vertex, but it is also a bunch of paths. Well
> >> > not really a bunch, it is a bunch of size one, and not much of a path
> >> > either, since it only contains one path element, the Vertex itself.
> >> >
> >> > A traversal returns a bunch of paths (Iterable<Path>) and starts from a
> >> > bunch of paths (still Iterable<Path>).
> >> >
> >> > Since the output of a traversal is the same as the input of a traversal
> >> > we can now compose them. This makes it possible to write a traversal
> >> > description which states that we want to retrieve the parents of our
> >> > friends, or the neighbours of the parents of our friends, and even: the
> >> > names of the dogs of the neighbours of the parents of our friends (after
> >> > all, we can now traverse to a property).
> >> >
> >> > This can be achieved when we make traversal descriptions composable.
> >> > Most users probably don't want to manually compose traversals, they
> >> > would much rather compose traversal descriptions and let those
> >> > descriptions do the composition of the traversals.
> >> >
> >> > These are some things to work on over the weekend, plus
> >> > documentation (especially Javadoc) and more test cases (especially the
> >> > integration of IndexedRelationships as SortableBinaryEdges needs
> >> > thorough testing).
> >> >
> >> > For the rest, I'd like to hear opinions and suggestions for improvement.
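One way to picture the composition idea from the message above: if a traversal takes a bunch of paths and returns a bunch of paths, two traversals chain like functions. The sketch below assumes a path is just a list of vertex names and an edge type is a plain adjacency map; the Traversal interface, follow, startAt and the sample names are all hypothetical and are not the traversal framework that was still to be written.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: paths in, paths out, so traversals compose by chaining.
final class ComposableTraversalSketch {

    interface Traversal {
        List<List<String>> traverse(List<List<String>> startPaths);

        // Output and input have the same shape, so composition is plain chaining.
        default Traversal then(Traversal next) {
            return paths -> next.traverse(this.traverse(paths));
        }
    }

    // A single vertex seen as one path with one element.
    static List<List<String>> startAt(String vertex) {
        return Collections.singletonList(Collections.singletonList(vertex));
    }

    // One step along an assumed edge type, given as an adjacency map.
    static Traversal follow(Map<String, List<String>> edges) {
        return paths -> {
            List<List<String>> result = new ArrayList<>();
            for (List<String> path : paths) {
                String last = path.get(path.size() - 1);
                for (String next : edges.getOrDefault(last, Collections.<String>emptyList())) {
                    List<String> extended = new ArrayList<>(path);
                    extended.add(next);
                    result.add(extended);
                }
            }
            return result;
        };
    }

    // Illustrative data only: "parents of our friends".
    static List<List<String>> demo() {
        Map<String, List<String>> friend = new HashMap<>();
        friend.put("me", Arrays.asList("Ann", "Bob"));
        Map<String, List<String>> parent = new HashMap<>();
        parent.put("Ann", Arrays.asList("Carl"));
        parent.put("Bob", Arrays.asList("Dora"));
        Traversal parentsOfFriends = follow(friend).then(follow(parent));
        return parentsOfFriends.traverse(startAt("me"));
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [[me, Ann, Carl], [me, Bob, Dora]]
    }
}
```

Chaining a third step (for example a hypothetical NEIGHBOUR map) works the same way, which is what would let composed descriptions read like a small query language.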
> >> >
> >> > Niels
> >> > _______________________________________________
> >> > Neo4j mailing list
> >> > User@lists.neo4j.org
> >> > https://lists.neo4j.org/mailman/listinfo/user