Re: [Neo4j] Enhanced API rewrite

Niels Hoogeveen Sat, 06 Aug 2011 15:53:25 -0700

Today I added fluency to the API design.

It is now possible to write:


Db().createVertex()
.setProperty(Name, "John")
.setProperty(Age, 29)
.addEdgeTo(june, WIFE)
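
For readers unfamiliar with the pattern: chaining like this works when each mutating method returns the object it was called on. Below is a minimal sketch of that idea, not the Enhanced API's actual source. FluentVertex, FluentDemo and the String keys are hypothetical stand-ins for illustration only.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a Vertex; not the real Enhanced API class.
class FluentVertex {
    final String label;
    final Map<String, Object> properties = new HashMap<>();
    final List<String> edges = new ArrayList<>();

    FluentVertex(String label) {
        this.label = label;
    }

    // Returning `this` is what allows the next call to be chained.
    FluentVertex setProperty(String key, Object value) {
        properties.put(key, value);
        return this;
    }

    // Records the edge as "TYPE->target" purely for demonstration.
    FluentVertex addEdgeTo(FluentVertex other, String type) {
        edges.add(type + "->" + other.label);
        return this;
    }
}

class FluentDemo {
    static FluentVertex build() {
        FluentVertex june = new FluentVertex("june");
        return new FluentVertex("john")
            .setProperty("Name", "John")
            .setProperty("Age", 29)
            .addEdgeTo(june, "WIFE");
    }
}
```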

I also added support for VertexTypes. A VertexType is nothing more and nothing
less than a Vertex with a unique name and a class name used to initialize the
VertexType. Application programmers can decide for themselves how to implement
VertexTypes.

VertexTypes can be retrieved from a Vertex with the method Vertex#getTypes(). 

There are no facilities to retrieve the Vertices defined with a certain
VertexType. The connection between Vertex and VertexType is not stored as a
Relationship, but as a Long[] property on the Vertex containing the ids of the
VertexTypes. This prevents the densely-connected-node problem: each Vertex will
likely have few types, but each VertexType will likely have lots of associated
Vertices. If users want to know the Vertices of a VertexType, they can create
an index for that (something that is outside the scope of Enhanced API).
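
To illustrate the storage choice, here is a rough sketch under stated assumptions. TypedVertex and TypeIndex are hypothetical names, a primitive long[] stands in for the Long[] property, and the index is an in-memory map rather than a real Neo4j index. The vertex carries its type ids itself, and the reverse lookup exists only if the application maintains it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical vertex that keeps the ids of its types as an array property.
class TypedVertex {
    final long id;
    long[] typeIds = new long[0];

    TypedVertex(long id) {
        this.id = id;
    }

    // Appends a type id; no relationship to the type vertex is created,
    // so the type vertex never becomes densely connected.
    void addType(long typeId) {
        long[] grown = Arrays.copyOf(typeIds, typeIds.length + 1);
        grown[typeIds.length] = typeId;
        typeIds = grown;
    }
}

// Optional, application-maintained reverse lookup (type id -> vertices).
class TypeIndex {
    private final Map<Long, List<TypedVertex>> byType = new HashMap<>();

    void add(TypedVertex vertex, long typeId) {
        vertex.addType(typeId);
        byType.computeIfAbsent(typeId, k -> new ArrayList<>()).add(vertex);
    }

    List<TypedVertex> verticesOf(long typeId) {
        List<TypedVertex> found = byType.get(typeId);
        return found == null ? new ArrayList<TypedVertex>() : found;
    }
}
```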

Edges all have at least one associated VertexType which is used for traversals. 
An Edge can have more than one VertexType, but only the one added as EdgeType 
(which extends VertexType) will be used for traversals.

Niels


> From: pd_aficion...@hotmail.com
> To: user@lists.neo4j.org
> Date: Sat, 6 Aug 2011 02:51:23 +0200
> Subject: [Neo4j] Enhanced API rewrite
> 
> 
> Today I pushed a major rewrite of the Enhanced API. See: 
> https://github.com/peterneubauer/graph-collections/tree/master/src/main/java/org/neo4j/collections/graphdb
> 
> Originally the Enhanced API was a drop-in replacement of the standard Neo4j 
> API. This resulted in lots of wrapper classes that needed to be maintained.
> 
> The rewrite of Enhanced API is no longer a drop-in replacement and contains 
> no interface/class names that can be found in the standard API.
> 
> Enhanced API no longer speaks of Nodes but of Vertices and doesn't speak of 
> Relationships but of Edges. This helps to prevent name clashes at the expense 
> of somewhat less recognizable names (Relationship is after all a more common 
> word than Edge). 
> 
> This rewrite is not merely a renaming of classes and interfaces, but is in 
> most part a complete rewrite and also a rethinking of the API on my part.
> 
> Enhanced API consists of two basic elements: Vertex and EdgeRole. Most 
> elements are a subclass of Vertex, though there are some specialized versions 
> of EdgeRole.
> 
> Let me start with an example:
> 
> Suppose we have two vertices denoting the persons Tom and Paula, and we want 
> to state that Tom is the father of Paula.
> 
> For standard Neo4j we tend to write such a fact as:
> 
> Tom --Father--> Paula
> 
> For Enhanced API we can conceptually write this fact as follows:
> 
>        --StartRole--Tom
> Father 
>        --EndRole--Paula
> 
> This should be read as follows: We have two Vertices: Tom and Paula and we 
> have a BinaryEdge (similar to a Relationship in the standard API) of type 
> "Father", where Tom has the StartRole for that edge and Paula has the EndRole 
> for that edge.
> 
> So instead of a directed graph, we conceptually have an undirected bipartite 
> graph.
> 
> For binary edges (edges between two vertices), this is mostly conceptually 
> the case, because the API will simply allow you to write: 
> tom.createEdgeTo(paula, FATHER) (similar to tom.createRelationshipTo(paula, 
> FATHER) as we would have in the standard API). 
> 
> It is also possible to fetch the start vertex of the binary relationship with 
> the method: edge.getStartVertex() (similar to relationship.getStartNode()), 
> although it is also possible to treat the binary edge as a generic edge and 
> fetch that Vertex as: edge.getElement(db.getStartRole()). 
> 
> BinaryEdges are a special case and have special methods which cover the same 
> functionality as can be found in the standard Neo4j API.
> 
> In general, we can say that Vertices are connected to Edges by means of 
> EdgeRoles. In the binary case there are two predefined EdgeRoles: StartRole 
> and EndRole.
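
The role-based view of a binary edge can be sketched roughly as follows. This is a toy model with hypothetical names (RoleVertex, BinaryRole, RoleBinaryEdge), not the Enhanced API's classes. It only shows that getStartVertex() can be a convenience over a generic lookup by role.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical names throughout; this is not the Enhanced API's source.
class RoleVertex {
    final String name;

    RoleVertex(String name) {
        this.name = name;
    }
}

// The two predefined roles of a binary edge.
enum BinaryRole { START, END }

// An edge is itself a vertex and holds its connected vertices by role.
class RoleBinaryEdge extends RoleVertex {
    private final Map<BinaryRole, RoleVertex> elements = new EnumMap<>(BinaryRole.class);

    RoleBinaryEdge(String type, RoleVertex start, RoleVertex end) {
        super(type);
        elements.put(BinaryRole.START, start);
        elements.put(BinaryRole.END, end);
    }

    // Generic access: look a vertex up by the role it plays.
    RoleVertex getElement(BinaryRole role) {
        return elements.get(role);
    }

    // Convenience accessors expressed in terms of the generic lookup.
    RoleVertex getStartVertex() {
        return getElement(BinaryRole.START);
    }

    RoleVertex getEndVertex() {
        return getElement(BinaryRole.END);
    }
}
```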
> 
> Before we get deeper into the general case of n-ary edges, let's first look 
> at another special case: Properties.
> 
> Properties can be thought of as unary edges, an edge that connects to only 
> one Vertex (as opposed to two in the binary case). 
> 
> Suppose we want to state that Tom is 49 years old. We can write that as:
> 
> age(49)--PropertyRole--Tom
> 
> We have an edge of type "age" that is connected to the vertex Tom in the role 
> of a property.
> 
> Again this is mostly conceptually true, because there are lots of methods in 
> Enhanced API that are very similar to the ones found in the standard API: 
> getProperty, hasProperty, setProperty. Alternatively, we can call methods on 
> the property itself; after all, the age property connected to the Vertex 
> "Tom" is an object in its own right. More precisely, it is a Property and 
> with that it is a UnaryEdge, which is an Edge, which is a Vertex.
> 
> From the age property we can fetch the PropertyType, but we can also ask for 
> the Vertex it is connected to: getVertex(). Since a Property is an Edge we 
> can also fetch the connected vertex (Tom) as follows: 
> age.getElement(db.getPropertyRole()).
> 
> So we have seen the two special cases: unary edges and binary edges, which 
> work very much the same as properties and Relationships in the standard Neo4j 
> API, though we have given it a conceptually different perspective that 
> unifies the two and fits it neatly into the general case of N-ary edges.
> 
> As said before, an Edge is a Vertex that connects other Vertices by means of 
> EdgeRoles. Since Edges are Vertices, they can have other Edges connected to 
> them. Or in standard API talk: relationships can be connected to other 
> relationships and they can have properties.
> 
> The concept of EdgeRoles separates Edges from Vertices, so we will 
> effectively have a bipartite graph where Vertices can only connect to Edges 
> and Edges can only connect to Vertices. Given the fact that Edges are also 
> Vertices, Edges can be connected to Edges, but in such a case it is 
> unambiguous which plays the role of Edge and which plays the role of Vertex 
> in that connection. 
> 
> Let's look at an example of an N-ary edge:
> 
> Suppose we want to state the fact that Tom gives Paula a Bicycle (no golden 
> helicopters in stock today). We can write that as follows:
> 
>       --Giver--Tom
> GIVES --Recipient -- Paula
>       --Gift -- Bicycle
> 
> There is an EdgeType GIVES which defines three EdgeRoles: Giver, Recipient 
> and Gift, which connect Tom, Paula and Bicycle to the Edge.
> 
> The edge is created by first creating three EdgeElement objects that each 
> contain a Role and the connected Vertex. We can then make the call 
> db.createEdge(GIVES, edgeElements).
> 
> An EdgeElement is whatever is connected to an Edge for a particular EdgeRole 
> (including that EdgeRole itself). 
> 
> An EdgeElement can contain more than one connected Vertex. We can for example 
> state: Tom and Dick give Paula a Bicycle. 
> 
> In Enhanced API notation:
> 
>       --Giver--Tom, Dick
> GIVES --Recipient -- Paula
>       --Gift -- Bicycle
> 
> Or we may want to state: Tom, Dick and Harry give Paula and Josephine a 
> Bicycle and an Icecream. 
> 
> In Enhanced API notation:
> 
>       --Giver--Tom, Dick, Harry
> GIVES --Recipient -- Paula, Josephine
>       --Gift -- Bicycle, Icecream
> 
> The API allows the user to fetch an EdgeElement by means of an EdgeRole and 
> iterate over the connected Vertices:
> 
> for (EdgeElement givers : gives.getElements(Giver)) {
>   for (Vertex giver : givers.getVertices()) {
>      // do something with the giver Vertex
>   }
> }
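
A self-contained toy version of the GIVES example may help make the loop above concrete. All names here (NaryEdge, NaryEdgeElement, verticesIn) are hypothetical, and vertices and roles are modelled as plain strings for brevity. It is a sketch of the idea, not the Enhanced API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical: a role plus the vertices connected in that role.
class NaryEdgeElement {
    final String role;
    final List<String> vertices;

    NaryEdgeElement(String role, String... vertices) {
        this.role = role;
        this.vertices = Arrays.asList(vertices);
    }
}

// Hypothetical n-ary edge built from a set of edge elements.
class NaryEdge {
    final String type;
    private final Map<String, List<NaryEdgeElement>> byRole = new LinkedHashMap<>();

    NaryEdge(String type, NaryEdgeElement... elements) {
        this.type = type;
        for (NaryEdgeElement element : elements) {
            byRole.computeIfAbsent(element.role, r -> new ArrayList<>()).add(element);
        }
    }

    // Mirrors gives.getElements(Giver) from the message above.
    List<NaryEdgeElement> getElements(String role) {
        List<NaryEdgeElement> found = byRole.get(role);
        return found == null ? new ArrayList<NaryEdgeElement>() : found;
    }

    // Flattens the nested loop: every vertex connected in the given role.
    List<String> verticesIn(String role) {
        List<String> result = new ArrayList<>();
        for (NaryEdgeElement element : getElements(role)) {
            for (String vertex : element.vertices) {
                result.add(vertex);
            }
        }
        return result;
    }
}
```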
> 
> For those cases where an EdgeElement can contain only one Vertex, there is a 
> FunctionalEdgeElement, which can only be used in conjunction with 
> FunctionalEdgeRoles. 
> 
> StartRole, EndRole and PropertyRole are all FunctionalEdgeRoles, since we can 
> have only one start Vertex and one end Vertex per BinaryEdge (just like there 
> can only be one StartNode and one EndNode for a Relationship in the standard 
> API) and we can only have one Vertex associated with a Property (just like a 
> property cannot belong to two different Nodes in the standard Neo4j API).
> 
> The Enhanced API can be used in conjunction with standard Neo4j API. The only 
> replacement needed is that of the database instance. The Enhanced API defines 
> a DatabaseService interface, which extends the standard GraphDatabaseService 
> interface and adds several enhanced methods for the creation and lookup of 
> Vertices, Edges and several kinds of VertexTypes.
> 
> Now the big question is of course, what do we gain with this entire apparatus?
> 
> First of all, we have unification of the storage elements of Neo4j. 
> Everything that can be stored in Neo4j is a Vertex:
> 
> Node is very much like a Vertex (with a slightly different interface that has 
> similar features to the standard Neo4j API, and more...)
> Relationship is very much like BinaryEdge, which is an Edge, which is a Vertex
> RelationshipType is covered by BinaryEdgeType which is an EdgeType, which is 
> a VertexType, which is a Vertex
> property name is wrapped as a PropertyType which is an EdgeType, which is 
> a VertexType, which is a Vertex.
> property value is wrapped as a Property which is a UnaryEdge, which is an 
> Edge, which is a Vertex
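
The "is a" chains listed above can be written down as a bare interface hierarchy. This is only a sketch of the subtype relations as described in this message. The real interfaces carry methods and may be organised differently, and HierarchyCheck is a hypothetical helper added for illustration.

```java
// Subtype relations only, as described in the message; no methods shown.
interface Vertex {}
interface VertexType extends Vertex {}
interface Edge extends Vertex {}
interface EdgeType extends VertexType {}
interface BinaryEdge extends Edge {}
interface BinaryEdgeType extends EdgeType {}
interface UnaryEdge extends Edge {}
interface Property extends UnaryEdge {}
interface PropertyType extends EdgeType {}

// Hypothetical helper: true when the given interface is a kind of Vertex.
class HierarchyCheck {
    static boolean isVertex(Class<?> type) {
        return Vertex.class.isAssignableFrom(type);
    }
}
```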
> 
> Having this unification, it is possible to write traversals to every part of 
> the Neo4j database. And that is the big boon of this unification.
> 
> Every part of the database can be accessed with a traversal description. 
> 
> The standard Neo4j API only allows traversals to return Nodes given a start 
> Node. The Enhanced API allows traversals from any part of the graph, whether 
> it is a regular Vertex, an Edge or a Property (or a type thereof), to any 
> other part of the graph, no matter if it is a regular Vertex, an Edge or a 
> Property (or a type thereof).
> 
> All that needs to be supplied are the EdgeTypes that need to be followed in a 
> traversal (and the regular evaluators that go with it).
> 
> Now the big downer to this all: 
> 
> I still have to write the traversal framework, which will actually follow the 
> Standard Neo4j framework, but will certainly make traversals composable.
> 
> Every Vertex is not just a Vertex, but it is also a bunch of paths. Well not 
> really a bunch, it is a bunch of size one, and not much of a path either, 
> since it only contains one path element, the Vertex itself.
> 
> A traversal returns a bunch of paths (Iterable<Path>) and starts from a bunch 
> of paths (still Iterable<Path>).
> 
> Since the output of a traversal is the same as the input of a traversal we 
> can now compose them. This makes it possible to write a traversal description 
> which states that we want to retrieve the parents of our friends, or the 
> neighbours of the parents of our friends, and even: the names of the dogs of 
> the neighbours of the parents of our friends (after all, we can now traverse 
> to a property). 
> 
> This can be achieved when we make traversal descriptions composable. Most 
> users probably don't want to manually compose traversals; they would much 
> rather compose traversal descriptions and let those descriptions do the 
> composition of the traversals. 
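
Since the traversal framework is not written yet, the following is only a sketch of why "paths in, paths out" makes steps composable. Everything here is an assumption for illustration: TraversalSketch, its tiny in-memory graph, and modelling a path as a list of vertex names. Each step is a java.util.function.Function from a list of paths to a list of paths, so steps chain with andThen.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class TraversalSketch {
    // Toy graph: edge type -> (vertex name -> neighbour names).
    static final Map<String, Map<String, List<String>>> GRAPH = new HashMap<>();

    static {
        edge("FRIEND", "me", "ann");
        edge("FRIEND", "me", "bob");
        edge("PARENT", "ann", "carl");
        edge("PARENT", "bob", "dora");
    }

    static void edge(String type, String from, String to) {
        GRAPH.computeIfAbsent(type, k -> new HashMap<>())
             .computeIfAbsent(from, k -> new ArrayList<>())
             .add(to);
    }

    // A single vertex seen as a bunch of paths of size one.
    static List<List<String>> start(String vertex) {
        List<String> path = new ArrayList<>();
        path.add(vertex);
        List<List<String>> paths = new ArrayList<>();
        paths.add(path);
        return paths;
    }

    // One step: extend every incoming path by one hop along the edge type.
    static Function<List<List<String>>, List<List<String>>> follow(String type) {
        return paths -> {
            List<List<String>> out = new ArrayList<>();
            Map<String, List<String>> byVertex = GRAPH.get(type);
            for (List<String> path : paths) {
                String last = path.get(path.size() - 1);
                List<String> next = (byVertex == null) ? null : byVertex.get(last);
                if (next == null) {
                    continue; // dead end: this path is dropped
                }
                for (String neighbour : next) {
                    List<String> extended = new ArrayList<>(path);
                    extended.add(neighbour);
                    out.add(extended);
                }
            }
            return out;
        };
    }
}
```

With these definitions, "the parents of our friends" becomes follow("FRIEND").andThen(follow("PARENT")) applied to start("me").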
> 
> These are some things to work on over the weekend + plus + plus + 
> documentation (especially Javadoc) and more test cases (especially the 
> integration of IndexedRelationships as SortableBinaryEdges needs thorough 
> testing).
> 
> For the rest, I'd like to hear opinions and suggestions for improvement.
> 
> Niels                                           
> _______________________________________________
> Neo4j mailing list
> User@lists.neo4j.org
> https://lists.neo4j.org/mailman/listinfo/user
                                          
