Hi,

> 4) Some important features are absent from the API. For social networks,
>> it's often very important to be able to keep edges ordered by a particular
>> property key (say "creationDate") and read edges in a particular
>> time-window (say between now and 7 days ago) or with a particular limit
>> (say most recent edges up to 10k edges). TinkerPop currently doesn't seem
>> to support those features. Instead the application has to first read *all*
>> the edges from a vertex and filter them. If the number of edges is very
>> large and the storage of those edges is remote to the client, this can be a
>> very costly operation.
>> 
> 
> I don't see what you describe as being "absent". The use cases you describe
> are present in Gremlin right?
> 
> g.V().outE().has('createdDate", gt(sevenDaysAgo)).inV()
> 
> If you have a way to improve the speed of that query (like Titan does for
> instance with vertex centric indices), then you just need to write a
> TraversalStrategy for your graph implementation that inspects that Gremlin
> on execution and recompiles it to a more efficient form for execution
> against your graph. TraversalStrategies are the method by which you can
> really showcase the power of your graph implementation as compared to
> others.




I would like to extend on this. Note that the raw 
Graph/Vertex/Edge/Property/etc. APIs are the bare minimum to get a Graph 
implementation being TInkerPop-enabled. These APIs are very Iterator-based and 
as such, don’t lend themselves to intelligent “push down predicate”-type of 
data access patterns. This is like this for a reason. These APIs are meant to 
be what any reasonable implementation can provide.

However, graph providers will most definitely want to move beyond these simple 
methods to ensure that the features of their graph system are taken full 
advantage of. As Stephen notes, this is all done via TraversalStrategies. Users 
should NOT be interacting with the Java API, but via Traversals and 
TraversalStrategies are the compilations rules that can allow a provider to do 
stuff like:

        1. Index lookups.
                - g.V().has(“name”,”marko”)  // index call to get the 
name=marko vertex
        2. Push down predicates for intelligent selection of data subsets.
                - g.V().hasLabel(“person”).out(“knows”).hasLabel(“person”) // 
only access the person-partition of the graph
        3. Grouping steps into “higher order” steps that do more with less 
calls.
                - g.V().outE().has(“timestamp”,gt(2010)).inV() // do 
outE().has().inV() as one mega-step that grabs only the needed data
        4. …

Each provider will have their own set of Strategies and custom steps that 
compile Gremlin (beyond TinkerPop’s optimizers) to the most efficient 
representation given their system’s unique capabilities.

Finally, you can see how traversal strategies/compilation works with .explain().

gremlin> 
g.V().has("name","marko").outE("knows").identity().inV().where(count().is(gt(10))).groupCount().by(both().count()).explain()
==>Traversal Explanation
===============================================================================================================================================================================================================================================================================================
Original Traversal                 [GraphStep(vertex,[]), 
HasStep([name.eq(marko)]), VertexStep(OUT,[knows],edge), IdentityStep, 
EdgeVertexStep(IN), TraversalFilterStep([CountGlobalStep, IsStep(gt(10))]), 
GroupCountStep([VertexStep(BOTH,vertex), CountGlobalStep])]

ConnectiveStrategy           [D]   [GraphStep(vertex,[]), 
HasStep([name.eq(marko)]), VertexStep(OUT,[knows],edge), IdentityStep, 
EdgeVertexStep(IN), TraversalFilterStep([CountGlobalStep, IsStep(gt(10))]), 
GroupCountStep([VertexStep(BOTH,vertex), CountGlobalStep])]
RangeByIsCountStrategy       [O]   [GraphStep(vertex,[]), 
HasStep([name.eq(marko)]), VertexStep(OUT,[knows],edge), IdentityStep, 
EdgeVertexStep(IN), TraversalFilterStep([RangeGlobalStep(0,11), 
CountGlobalStep, IsStep(gt(10))]), GroupCountStep([VertexStep(BOTH,vertex), 
CountGlobalStep])]
IdentityRemovalStrategy      [O]   [GraphStep(vertex,[]), 
HasStep([name.eq(marko)]), VertexStep(OUT,[knows],edge), EdgeVertexStep(IN), 
TraversalFilterStep([RangeGlobalStep(0,11), CountGlobalStep, IsStep(gt(10))]), 
GroupCountStep([VertexStep(BOTH,vertex), CountGlobalStep])]
IncidentToAdjacentStrategy   [O]   [GraphStep(vertex,[]), 
HasStep([name.eq(marko)]), VertexStep(OUT,[knows],vertex), 
TraversalFilterStep([RangeGlobalStep(0,11), CountGlobalStep, IsStep(gt(10))]), 
GroupCountStep([VertexStep(BOTH,vertex), CountGlobalStep])]
AdjacentToIncidentStrategy   [O]   [GraphStep(vertex,[]), 
HasStep([name.eq(marko)]), VertexStep(OUT,[knows],vertex), 
TraversalFilterStep([RangeGlobalStep(0,11), CountGlobalStep, IsStep(gt(10))]), 
GroupCountStep([VertexStep(BOTH,edge), CountGlobalStep])]
FilterRankingStrategy        [O]   [GraphStep(vertex,[]), 
HasStep([name.eq(marko)]), VertexStep(OUT,[knows],vertex), 
TraversalFilterStep([RangeGlobalStep(0,11), CountGlobalStep, IsStep(gt(10))]), 
GroupCountStep([VertexStep(BOTH,edge), CountGlobalStep])]
MatchPredicateStrategy       [O]   [GraphStep(vertex,[]), 
HasStep([name.eq(marko)]), VertexStep(OUT,[knows],vertex), 
TraversalFilterStep([RangeGlobalStep(0,11), CountGlobalStep, IsStep(gt(10))]), 
GroupCountStep([VertexStep(BOTH,edge), CountGlobalStep])]
TinkerGraphStepStrategy      [P]   [TinkerGraphStep(vertex,[name.eq(marko)]), 
VertexStep(OUT,[knows],vertex), TraversalFilterStep([RangeGlobalStep(0,11), 
CountGlobalStep, IsStep(gt(10))]), GroupCountStep([VertexStep(BOTH,edge), 
CountGlobalStep])]
ProfileStrategy              [F]   [TinkerGraphStep(vertex,[name.eq(marko)]), 
VertexStep(OUT,[knows],vertex), TraversalFilterStep([RangeGlobalStep(0,11), 
CountGlobalStep, IsStep(gt(10))]), GroupCountStep([VertexStep(BOTH,edge), 
CountGlobalStep])]
StandardVerificationStrategy [V]   [TinkerGraphStep(vertex,[name.eq(marko)]), 
VertexStep(OUT,[knows],vertex), TraversalFilterStep([RangeGlobalStep(0,11), 
CountGlobalStep, IsStep(gt(10))]), GroupCountStep([VertexStep(BOTH,edge), 
CountGlobalStep])]

Final Traversal                    [TinkerGraphStep(vertex,[name.eq(marko)]), 
VertexStep(OUT,[knows],vertex), TraversalFilterStep([RangeGlobalStep(0,11), 
CountGlobalStep, IsStep(gt(10))]), GroupCountStep([VertexStep(BOTH,edge), 
CountGlobalStep])]
gremlin>

Marko.

Reply via email to