[ 
https://issues.apache.org/jira/browse/TINKERPOP-1617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15840347#comment-15840347
 ] 

ASF GitHub Bot commented on TINKERPOP-1617:
-------------------------------------------

GitHub user okram opened a pull request:

    https://github.com/apache/tinkerpop/pull/549

    TINKERPOP-1617: Create a SingleIterationStrategy which will do its best to 
rewrite OLAP traversals to not message pass.

    https://issues.apache.org/jira/browse/TINKERPOP-1617
    
    Various traversals can be rewritten using `local()` so that the 
`GraphComputer` avoids a message pass and can therefore complete the 
computation in a single scan of the graph. Traversals that benefit include:
    
    ```
    g.V().out().id() --> g.V().local(out().id())
    g.V().out().id().count() --> g.V().local(out().id()).count()
    g.V().out().id().dedup().count()
    g.V().inE().values("weight") // realize that in-edges are hosted by the out-vertex
    g.V().inE().values("weight").sum()
    g.V().both().count()
    g.V().inE().count()
    g.V().as("a").outE().inV().as("b").id().dedup("a", "b").by(T.id).count()
    ```
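
To make the first two rewrites concrete, here is a minimal toy model in plain Python. It is not TinkerPop code: the `OUT_EDGES` map and both function names are hypothetical, and it assumes that each vertex already knows the ids of its adjacent vertices, which is why `out().id()` can be answered without moving a traverser. It only illustrates that the two forms yield the same count while the `local()` form sends no messages.

```python
from collections import Counter

# Hypothetical toy graph: vertex id -> ids of its out-neighbours.
# Assumption: a vertex can read adjacent vertex ids from its own edges.
OUT_EDGES = {1: [2, 3], 2: [3], 3: []}


def count_with_message_pass(out_edges):
    """Model of g.V().out().id().count(): traversers move to the neighbour."""
    inbox = Counter()
    messages = 0
    # Iteration 1: every vertex sends one traverser per out-edge.
    for neighbours in out_edges.values():
        for neighbour in neighbours:
            inbox[neighbour] += 1
            messages += 1
    # Iteration 2: each receiving vertex emits its own id once per traverser.
    ids = [vertex for vertex, n in inbox.items() for _ in range(n)]
    return len(ids), messages


def count_local(out_edges):
    """Model of g.V().local(out().id()).count(): ids are read in place."""
    ids = [n for neighbours in out_edges.values() for n in neighbours]
    return len(ids), 0  # nothing is sent between vertices


print(count_with_message_pass(OUT_EDGES))  # (count, messages sent)
print(count_local(OUT_EDGES))
```

Both calls return the same count for this graph; only the number of messages differs.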
    
    Finally, the traversal that sparked this PR:
    
    ```
    g.V().in().id().select("articleNumber").dedup().count()    // requires one message pass
    
    ==translatesTo==>
    
    g.V().local(in().id().select("articleNumber")).dedup().count() // requires no message passing
    ```
    
    `SingleIterationStrategy` plays well with `SparkSingleIterationStrategy`, 
which determines whether it is necessary to `cache()` and/or `partition()` the 
graph. If the traversal can be accomplished without a message pass (i.e., in a 
single iteration), performance improves greatly because RDD partitions are 
processed sequentially and can be dropped once processed.
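
The single-scan behaviour can be sketched with another toy in plain Python. This is not Spark or TinkerPop code: `partitions` and `single_scan_dedup_count` are hypothetical names, and each partition is assumed to map a vertex id to the ids of its in-neighbours. It models something like `g.V().local(in().id()).dedup().count()`, where only the small `dedup()` state survives from one partition to the next.

```python
def partitions():
    """Lazily yield hypothetical partitions: vertex id -> in-neighbour ids."""
    yield {1: [], 2: [1]}
    yield {3: [1, 2]}


def single_scan_dedup_count(parts):
    """Fold over the partitions once, keeping only the dedup state."""
    seen = set()   # state of the dedup() barrier
    scanned = 0    # number of partitions visited
    for part in parts:
        for in_ids in part.values():
            # The local() stage reads ids in place, so no traverser leaves
            # the vertex and nothing from this partition is needed later.
            seen.update(in_ids)
        scanned += 1
        # 'part' is never referenced again, so it can be discarded here.
    return len(seen), scanned


print(single_scan_dedup_count(partitions()))  # (distinct ids, partitions)
```

Because the generator hands over one partition at a time, nothing forces the whole graph to stay in memory, which is the property the paragraph above attributes to the single-iteration case.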
    
    VOTE +1.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/apache/tinkerpop TINKERPOP-1617

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/tinkerpop/pull/549.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #549
    
----

----


> Create a SingleIterationStrategy which will do its best to rewrite OLAP 
> traversals to not message pass.
> -------------------------------------------------------------------------------------------------------
>
>                 Key: TINKERPOP-1617
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1617
>             Project: TinkerPop
>          Issue Type: Improvement
>          Components: process
>    Affects Versions: 3.2.3
>            Reporter: Marko A. Rodriguez
>            Assignee: Marko A. Rodriguez
>
> The traversal:
> {code}
> g.V().out().id().count()
> {code}
> requires a message pass from {{out()}}, which is unnecessary. If we instead 
> wrap the pre-barrier stage in a {{local()}}, we have:
> {code}
> g.V().local(out().id()).count()
> {code}
> ...which doesn't require a message pass and has the same semantics. This will 
> help open up numerous OLAP-type traversals to single-pass, non-caching scans.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
