It seems a lot of projects have outgrown the current Traverser framework. People want to be able to do things such as visiting all relationships once instead of all nodes once (the current framework only visit each node once and leave out other relationships that go there). There has also been requests for a way to select the next relationship to traverse using a callback interface, instead of only having DEPTH_FIRST and BREADTH_FIRST to choose from.
To address these requests we have started a project to implement a new traversal framework. The early prototype for this is available in my laboratory directory in the repository: https://svn.neo4j.org/laboratory/users/tobias/traversal/ The state of the prototype is very early, it supports the same features as the current Traverser framework plus being able to limit the traversal by relationships instead of nodes or based on the path to each position. This is already quite interesting. The part about having a callback for selecting the next relationship is a later feature, one that will probably change the API of the prototype a lot. For comparison the tests contain an implementation of the old Traverser framework on top of the new Traversal framework. The limited number of tests we've written indicate the performance of the old framework is still slightly better than the new framework, but analysis of the operations the different implementations make on the graph gives me high hopes that the new framework will probably beat the old one after we've worked on it some more. Even with the slightly lower performance the versatility of the new framework is substantially greater. The prototype implements a TraversalDescription class that is used to construct the traversal, these objects are immutable and implement a builder pattern making it possible to create base TraversalDescription instances that can then be reused multiple times or adapted slightly before they are used. This leads to shorter, more readable code than the current traverse(...)-methods on the Node interface. This is also the part of the prototype that is most likely to change, both to accomodate for the features that are to be added, and being revised to make it easier to work with. Most of the new traversal framework will live in a new package: org.neo4j.graphdb.traversal, but there will be a few new things introduced in org.neo4j.graphdb as well. The new traversal framework will live side by side with the current traversal framework for a while, to ease the transition. The old traversal framework will be deprecated and then removed in some future version. The additions we have made in the prototype are: * A path representation. Defined as the interface org.neo4j.graphdb.Path. A Path object represents a path of interlinking relationships. The shortest possible path is the path with no relationships, a single node. * A concept of Uniqueness. This defines how often a node or relationship can be (re)visited. The current traversal framework implements what we have called "node global uniqueness" in the new framework. This enforces the restriction that a node cannot be visited more than once. Other uniqueness kinds we introduce in the new framework are: * Relationship global uniqueness - similar to node global uniqueness * Node/Relationship path uniqueness - each node (or relationship) may occur only once in the path (yes, using the Path interface from above) traversed to get there. * Node/Relationship recent uniqueness - a node (or relationship) may not be revisited if it is among the set of recently visited ones (the size of this set is configurable). This is similar to global uniqueness, but with a constraint on the amount of memory the traversal is allowed to use. * No uniqueness. The name says it all, all nodes and relationships are visited, without restriction. * PruneEvaluator - has the same role as the previous StopEvaluator, but with a more descriptive name. * RelationshipExpander - this is "what traversers are made of", given a node, a RelationshipExpander is responsible for getting the relationships from it that are relevant to the traversal. This is a utility that is used a lot in the new traversal framework, and that might also be useful on its own. The concepts we are planning to add, but have not completed the design for yet are: * A way to select/order relationships. This is an extension of DEPTH_FIRST and BREADTH_FIRST, where the user can define which relationships to traverse, and in what order to traverse them. * Relationship patterns, a way to tell the traversal to expand one relationship type first, and then another type, and so on. Try it out and give us feedback, we will keep working on the implementation, and hope to push this into the trunk of kernel as soon as possible. -- Tobias Ivarsson <tobias.ivars...@neotechnology.com> Hacker, Neo Technology www.neotechnology.com Cellphone: +46 706 534857 _______________________________________________ Neo mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user