[jira] [Closed] (TINKERPOP-2753) Create noop() step to avoid eager optimization

Stephen Mallette (Jira) Thu, 16 Jun 2022 03:40:05 -0700


     [ 
https://issues.apache.org/jira/browse/TINKERPOP-2753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Stephen Mallette closed TINKERPOP-2753.
---------------------------------------
    Resolution: Won't Do

>  except I would assume the `noop()` step cannot be used as a terminal step.

{{identity()}} is not a terminal step either, so the two seem identical.

> I would still consider it as a workaround, though. 

I think that if you're smart enough about your graph and Gremlin to know when 
to short-circuit an optimization, you should probably know how strategies 
affect the behavior of your traversal. Removing them to achieve some gain that 
they won't bring is just part of writing Gremlin in that case. In that sense, 
it feels like less of a workaround to me but perhaps it's because I've used and 
recommended this approach before to solve this issue. Glad this approach works 
for you.

> Create noop() step to avoid eager optimization
> ----------------------------------------------
>
>                 Key: TINKERPOP-2753
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-2753
>             Project: TinkerPop
>          Issue Type: Improvement
>          Components: process
>    Affects Versions: 3.6.0
>            Reporter: Boxuan Li
>            Priority: Major
>
> I only have experience in JanusGraph, so my opinion might be biased and this 
> proposal might not be generalizable to other graph providers:
> I propose we create a `noop()` step that does nothing. It is a special step 
> that simply provides a hint for the graph provider. How to interpret it 
> depends on the graph provider, but the usage in my mind is to avoid eager 
> optimization. Sometimes a graph provider can combine different filter steps 
> into a joint condition for better index selection or predicate pushdown. For 
> example, in the query below:
>  
> {code:java}
> g.V().has("name", "bob").has("age", 20){code}
>  
> JanusGraph will fold the two `has` conditions into a joint condition for 
> better index selection. Sometimes, however, users don't want this "eager 
> optimization", likely because they know the distribution of data and prefer 
> doing in-memory filtering for the second `has` condition. They could do this:
>  
> {code:java}
> g.V().has("name", "bob").map(x -> x.get()).has("age", 20){code}
>  
> So that JanusGraph will defer the evaluation of the second condition until 
> the first `has` condition is evaluated. Here, the `map(x -> x.get())` is 
> essentially a noop step. What I am proposing is to use an official `noop()` 
> step to replace this workaround. This `noop` step sounds like a `barrier` 
> step but they do not have the same semantics. The `noop` step is a barrier 
> against constraint look-ahead optimization.
>  
> Another example usage of `noop` is as follows:
>  
> {code:java}
> g.V(ids).bothE("follows").noop().where(__.otherV().is(v2)).next(){code}
>  
> In the above case, we can use `noop` to force the graph provider to compute 
> `bothE` first and then evaluate the `where` step. Otherwise, the graph 
> provider (for example, JanusGraph) might try folding the `where` condition 
> into the `bothE` step for predicate pushdown. Predicate pushdown usually 
> works, but in some scenarios, it is less preferred.
>  
> I am happy to provide a patch if the community likes this idea.
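
The folding behavior described in the quoted issue can be illustrated with a small toy model. This is only a sketch under stated assumptions: `FoldingSketch`, `Step`, `has`, `passThrough` and `foldedConditions` are hypothetical names invented for illustration, and none of this is TinkerPop or JanusGraph code. It assumes a provider optimizer that folds the leading run of adjacent `has` conditions into the initial lookup and stops at the first step it cannot fold, which is how the `map(x -> x.get())` workaround from the issue is described as behaving.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: these types are hypothetical and are not TinkerPop or JanusGraph classes.
final class FoldingSketch {

    /** A step is described by its name and whether the toy optimizer may fold it. */
    static final class Step {
        final String name;
        final boolean foldable;

        Step(String name, boolean foldable) {
            this.name = name;
            this.foldable = foldable;
        }
    }

    /** A filter condition that the toy optimizer is allowed to fold. */
    static Step has(String key, Object value) {
        return new Step("has(" + key + "," + value + ")", true);
    }

    /** Stands in for map(x -> x.get()): passes traversers through and cannot be folded. */
    static Step passThrough() {
        return new Step("map(x -> x.get())", false);
    }

    /**
     * Collects the leading run of foldable steps, i.e. the conditions the toy
     * optimizer would push into the initial lookup. Folding stops at the first
     * non-foldable step, so later conditions are left for in-memory filtering.
     */
    static List<String> foldedConditions(List<Step> steps) {
        List<String> folded = new ArrayList<>();
        for (Step step : steps) {
            if (!step.foldable) {
                break;
            }
            folded.add(step.name);
        }
        return folded;
    }

    public static void main(String[] args) {
        List<Step> adjacent = Arrays.asList(has("name", "bob"), has("age", 20));
        List<Step> separated = Arrays.asList(has("name", "bob"), passThrough(), has("age", 20));

        System.out.println(foldedConditions(adjacent));   // [has(name,bob), has(age,20)]
        System.out.println(foldedConditions(separated));  // [has(name,bob)]
    }
}
```

In this model the second `has` is folded only when it directly follows the first; placing a non-foldable step between them leaves it out of the joint condition. How a real provider decides what to fold depends on its own strategies.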



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
