[ 
https://issues.apache.org/jira/browse/TINKERPOP-1163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15213084#comment-15213084
 ] 

ASF GitHub Bot commented on TINKERPOP-1163:
-------------------------------------------

GitHub user okram opened a pull request:

    https://github.com/apache/incubator-tinkerpop/pull/278

    TINKERPOP-1163: GraphComputer's can have TraversalStrategies.

    https://issues.apache.org/jira/browse/TINKERPOP-1163
    
    GraphComputers can now have their own `TraversalStrategy` registrations in 
the global cache. Currently, the only registration is `GraphComputer.class`, 
which has `PathProcessStrategy`, `OrderLimitStrategy`, and 
`ComputerVerificationStrategy`. Moving forward, we will be able to have 
strategies like `SparkCountStrategy`, which will convert `g.V().count()` into 
`inputRDD.count()` and thus allow us to talk more directly to the 
`GraphComputer` engine. `TinkerCountStrategy` would do `g.V().count()` as 
`this.vertices.count()`. Blazin'. However, what we have here is the 
infrastructure to allow for the distinction between `Graph` and `GraphComputer` 
strategies. Note that this PR is backwards compatible.
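
A minimal sketch of the partitioning idea, assuming nothing about the actual
`TraversalStrategies.GlobalCache` code: the class `PartitionedStrategyCache`,
its nested `Graph` and `GraphComputer` marker interfaces, the method names, and
the `SomeGraphStrategy` entry are all hypothetical, and strategies are reduced
to plain names. It only illustrates keeping two separate maps so that a lookup
for a `Graph` never sees strategies registered for a `GraphComputer`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in, NOT the real TinkerPop TraversalStrategies.GlobalCache.
class PartitionedStrategyCache {
    // Marker types standing in for Graph and GraphComputer (assumptions).
    public interface Graph {}
    public interface GraphComputer {}

    // One cache per kind of registration; strategies are modeled as names only.
    private final Map<Class<?>, List<String>> graphCache = new HashMap<>();
    private final Map<Class<?>, List<String>> computerCache = new HashMap<>();

    // Store the strategies in whichever cache matches the registered type.
    public void register(Class<?> type, String... strategies) {
        select(type).put(type, new ArrayList<>(Arrays.asList(strategies)));
    }

    // Exact-class lookup; returns an empty list when nothing is registered.
    public List<String> strategiesFor(Class<?> type) {
        return select(type).getOrDefault(type, Collections.emptyList());
    }

    // Route a type to the GraphComputer cache or the Graph cache.
    private Map<Class<?>, List<String>> select(Class<?> type) {
        if (GraphComputer.class.isAssignableFrom(type)) return computerCache;
        if (Graph.class.isAssignableFrom(type)) return graphCache;
        throw new IllegalArgumentException("Neither a Graph nor a GraphComputer: " + type);
    }

    public static void main(String[] args) {
        PartitionedStrategyCache cache = new PartitionedStrategyCache();
        cache.register(GraphComputer.class,
                "PathProcessStrategy", "OrderLimitStrategy", "ComputerVerificationStrategy");
        cache.register(Graph.class, "SomeGraphStrategy"); // hypothetical name
        System.out.println("Graph strategies: " + cache.strategiesFor(Graph.class));
        System.out.println("GraphComputer strategies: " + cache.strategiesFor(GraphComputer.class));
    }
}
```

In this toy version the OLTP path would only ever consult the `Graph` cache,
which is the partitioning effect described above.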
    
    CHANGELOG
    
    ```
    * `TraversalStrategies.GlobalCache` supports both `Graph` and 
`GraphComputer` strategy registrations.
    ```
    
    VOTE +1.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/apache/incubator-tinkerpop TINKERPOP-1163

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-tinkerpop/pull/278.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #278
    
----
commit 718caa6be1f722923aa3c23aae9175cecc6ad11a
Author: Marko A. Rodriguez <okramma...@gmail.com>
Date:   2016-03-26T16:07:41Z

    TraversalStrategies.GlobalCache has two caches now -- one for Graphs and 
one for GraphComputers. For 3.2.0, this simply allows us to partition the 
strategies so that the 3 GraphComputer strategies we have are never called in 
OLTP (saving clock cycles). In the future, it will enable us to have something 
like SparkCountStrategy which will just do inputRDD.count() for g.V().count() 
instead of going through the rigmarole of TraversalVertexProgram.
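
The count shortcut could look roughly like the toy sketch below. Everything in
it is an assumption made for illustration: `NativeCountSketch` and its methods
are hypothetical, a traversal is reduced to a list of step names, and the
"input" is a plain list of vertex ids rather than a Spark RDD. It shows only
the pattern match on `g.V().count()` and the choice between a native count and
the generic path, not how such a strategy would be written against TinkerPop.

```java
import java.util.Arrays;
import java.util.List;

// Toy model only: none of these types are TinkerPop or Spark classes.
class NativeCountSketch {
    // A "traversal" is reduced to its ordered step names, e.g. ["V", "count"].
    static boolean isVertexCount(List<String> steps) {
        return steps.equals(Arrays.asList("V", "count"));
    }

    // Decide which execution path a traversal would take.
    static String plan(List<String> steps) {
        return isVertexCount(steps) ? "native-count" : "vertex-program";
    }

    // Generic path: build one object per vertex, then count them.
    static long countByIterating(List<String> vertexIds) {
        long n = 0;
        for (String id : vertexIds) {
            StringBuilder vertex = new StringBuilder(id); // stands in for building a vertex object
            n++;
        }
        return n;
    }

    // Native path: ask the backing collection for its size directly.
    static long countNatively(List<String> vertexIds) {
        return vertexIds.size();
    }

    public static void main(String[] args) {
        List<String> ids = Arrays.asList("v1", "v2", "v3");
        List<String> simple = Arrays.asList("V", "count");
        List<String> other = Arrays.asList("V", "outE", "count");
        System.out.println(simple + " -> " + plan(simple));
        System.out.println(other + " -> " + plan(other));
        System.out.println("native count = " + countNatively(ids)
                + ", iterated count = " + countByIterating(ids));
    }
}
```

Both paths must return the same number; the native one just avoids creating
an object per vertex.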

----


> GraphComputer's can have TraversalStrategies.
> ---------------------------------------------
>
>                 Key: TINKERPOP-1163
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1163
>             Project: TinkerPop
>          Issue Type: Improvement
>          Components: hadoop, process
>    Affects Versions: 3.1.0-incubating
>            Reporter: Marko A. Rodriguez
>
> @dkuppitz makes the joke that he can count the number of vertices in the 
> Friendster adjacency list with "awk to the sed to the bash to the.." in < 1 
> minute. SparkGraphComputer with four blades takes ~5 minutes.
> What's the dealio?
> Imagine a world where {{SparkGraphComputerStrategy}} exists. It analyzes 
> traversals and does fast executions, breaking away from the VertexProgram API 
> and going straight to the native API of Spark. Check it:
> {code}
> g.V().count() -> inputRDD.count()
> {code}
> ...add an {{EmptyVertex.instance()}} manipulation to the respective 
> InputFormats and you are then just skipping through bytes, not manifesting 
> objects at all. BAM. That would take 30 seconds on Friendster.
> {code}
> g.V().outE('knows').count() --> 
> inputRDD.flatMapToPair{edgeComponents}.filter{knows}.count()
> {code}
> Blazing fast.
> ....for all those standard patterns, we just do a "native" execution for the 
> respective GraphComputer engine. We sidestep object creation, iteration 
> phases, views, map reduce jobs.... However, we have to be smart and update the 
> {{Memory}} so it looks as if the real VertexProgram executed! --- 
> {{iteration}}, {{runtime}}, {{~reducing}}, etc.
> Genius.


