Hi All,

I'm new to neo4j and the mailing list. I have a simple traversal use
case that I couldn't find a way to perform with the existing API. What
would be your suggestion to the following scenario?

I want to do a depth first traversal of the graph and want to add a
property called "post_order_value" to each node during the traversal.
For example, if we have a graph of
A -> B , B -> C , A->D , D->C

An example traverser labels the nodes with post-order values as C=1,
B=2,D=3 and A=4. Because the depth first traversal first finishes C,
then B, then D and finally A.

What I am concerned is a slight variation of this problem. In the
above example, the
visitation order of the children is important. For example, if the
traverser had chosen D before B while processing the children of A,
the labels would end up being C=1, D=2,B=3 and A=4. I'd like to
generate random traversals, so that in the DFS the traverser chooses
the children in random order. By this way I will get many valid
post-order labelings.

Is this possible in the current traversal API or in the Improved framework?

Thanks,


p.s. : This email is indeed a reply to Tobias' "A new and improved
Traversal framework" message.


[Neo] A new and improved Traversal framework
Tobias Ivarsson
Tue, 30 Mar 2010 06:48:46 -0700

It seems a lot of projects have outgrown the current Traverser framework.
People want to be able to do things such as visiting all relationships once
instead of all nodes once (the current framework only visit each node once
and leave out other relationships that go there). There has also been
requests for a way to select the next relationship to traverse using a
callback interface, instead of only having DEPTH_FIRST and BREADTH_FIRST to
choose from.

To address these requests we have started a project to implement a new
traversal framework. The early prototype for this is available in my
laboratory directory in the repository:
https://svn.neo4j.org/laboratory/users/tobias/traversal/
The state of the prototype is very early, it supports the same features as
the current Traverser framework plus being able to limit the traversal by
relationships instead of nodes or based on the path to each position. This
is already quite interesting. The part about having a callback for selecting
the next relationship is a later feature, one that will probably change the
API of the prototype a lot. For comparison the tests contain an
implementation of the old Traverser framework on top of the new Traversal
framework. The limited number of tests we've written indicate the
performance of the old framework is still slightly better than the new
framework, but analysis of the operations the different implementations make
on the graph gives me high hopes that the new framework will probably beat
the old one after we've worked on it some more. Even with the slightly lower
performance the versatility of the new framework is substantially greater.

The prototype implements a TraversalDescription class that is used to
construct the traversal, these objects are immutable and implement a builder
pattern making it possible to create base TraversalDescription instances
that can then be reused multiple times or adapted slightly before they are
used. This leads to shorter, more readable code than the current
traverse(...)-methods on the Node interface. This is also the part of the
prototype that is most likely to change, both to accomodate for the features
that are to be added, and being revised to make it easier to work with.

Most of the new traversal framework will live in a new
package: org.neo4j.graphdb.traversal, but there will be a few new things
introduced in org.neo4j.graphdb as well. The new traversal framework will
live side by side with the current traversal framework for a while, to ease
the transition. The old traversal framework will be deprecated and then
removed in some future version.


The additions we have made in the prototype are:

* A path representation. Defined as the interface org.neo4j.graphdb.Path. A
Path object represents a path of interlinking relationships. The shortest
possible path is the path with no relationships, a single node.

* A concept of Uniqueness. This defines how often a node or relationship can
be (re)visited. The current traversal framework implements what we have
called "node global uniqueness" in the new framework. This enforces the
restriction that a node cannot be visited more than once. Other uniqueness
kinds we introduce in the new framework are:
  * Relationship global uniqueness - similar to node global uniqueness
  * Node/Relationship path uniqueness - each node (or relationship) may
occur only once in the path (yes, using the Path interface from above)
traversed to get there.
  * Node/Relationship recent uniqueness - a node (or relationship) may not
be revisited if it is among the set of recently visited ones (the size of
this set is configurable). This is similar to global uniqueness, but with a
constraint on the amount of memory the traversal is allowed to use.
  * No uniqueness. The name says it all, all nodes and relationships are
visited, without restriction.

* PruneEvaluator - has the same role as the previous StopEvaluator, but with
a more descriptive name.

* RelationshipExpander - this is "what traversers are made of", given a
node, a RelationshipExpander is responsible for getting the relationships
from it that are relevant to the traversal. This is a utility that is used a
lot in the new traversal framework, and that might also be useful on its
own.


The concepts we are planning to add, but have not completed the design for
yet are:

* A way to select/order relationships. This is an extension of DEPTH_FIRST
and BREADTH_FIRST, where the user can define which relationships to
traverse, and in what order to traverse them.

* Relationship patterns, a way to tell the traversal to expand one
relationship type first, and then another type, and so on.


Try it out and give us feedback, we will keep working on the implementation,
and hope to push this into the trunk of kernel as soon as possible.

-- 
Tobias Ivarsson <tobias.ivars...@neotechnology.com>
Hacker, Neo Technology
www.neotechnology.com
Cellphone: +46 706 534857
_______________________________________________
Neo mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user
_______________________________________________
Neo mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user

Reply via email to