Hello,

This work has been completed and merged into master/.

        https://github.com/apache/incubator-tinkerpop/pull/243

This work was well worth it -- even though we have breaking changes to deal 
with.

GraphComputer providers (non-trivial changes)
        http://tinkerpop.apache.org/docs/3.2.0-SNAPSHOT/upgrade/#_graphcomputer_semantics_and_api
Users (trivial changes if any)
        http://tinkerpop.apache.org/docs/3.2.0-SNAPSHOT/upgrade/#_vertexprogram_and_memorycomputekey_and_vertexcomputekey

Here is what we have gained:

        1. MemoryComputeKeys and VertexComputeKeys can be transient.
                - e.g. No more EDGE_COUNT properties left on the vertices after
                  executing PageRankTraversalVertexProgram.
        2. MemoryComputeKeys can be set to NOT broadcast.
                - e.g. If the workers never need to read a memory value (only
                  add to it), then broadcasting can be turned off.
        3. Gremlin OLAP now fully supports OLTP->OLAP->OLTP->OLAP->etc.
                - When barriers are reached (e.g. groupCount(), count(), sum(),
                  etc.), processing becomes local to the master traversal.
                - When the master traversal starts to touch elements again
                  (vertices/edges/properties), it sends the traversers back to
                  the workers.
                - This process of parallel->sequential->parallel->… can go on
                  indefinitely.
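
To make gains 1 and 2 concrete, here is a toy model in plain Java of a memory whose keys carry a reducer plus "broadcast" and "transient" flags. This is only a sketch under assumptions, not the TinkerPop API: SketchKey, SketchMemory, MemorySketchDemo, and the "edgeTotal" key are hypothetical names made up for illustration. Only the "vote" key name and the single add() entry point come from the original proposal.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BinaryOperator;

// Hypothetical stand-in for a memory compute key (NOT the TinkerPop class).
final class SketchKey<A> {
    final String name;
    final BinaryOperator<A> reducer; // how two added values are combined
    final boolean broadcast;         // false: workers only add, never read
    final boolean isTransient;       // true: dropped before the result is returned

    SketchKey(String name, BinaryOperator<A> reducer, boolean broadcast, boolean isTransient) {
        this.name = name;
        this.reducer = reducer;
        this.broadcast = broadcast;
        this.isTransient = isTransient;
    }
}

// Hypothetical master-side memory with a single add() method.
final class SketchMemory {
    private final Map<String, SketchKey<?>> keys = new HashMap<>();
    private final Map<String, Object> values = new HashMap<>();

    <A> void register(SketchKey<A> key) {
        keys.put(key.name, key);
    }

    // Every update goes through add(); the key's reducer decides what "add" means.
    @SuppressWarnings("unchecked")
    <A> void add(String name, A value) {
        SketchKey<A> key = (SketchKey<A>) keys.get(name);
        if (key == null) throw new IllegalArgumentException("Unknown key: " + name);
        values.merge(name, value, (a, b) -> key.reducer.apply((A) a, (A) b));
    }

    // What a worker would be allowed to read: broadcast keys only.
    Map<String, Object> workerView() {
        Map<String, Object> view = new HashMap<>();
        for (Map.Entry<String, Object> e : values.entrySet()) {
            if (keys.get(e.getKey()).broadcast) view.put(e.getKey(), e.getValue());
        }
        return view;
    }

    // What the caller would get back: transient keys are purged.
    Map<String, Object> result() {
        Map<String, Object> out = new HashMap<>();
        for (Map.Entry<String, Object> e : values.entrySet()) {
            if (!keys.get(e.getKey()).isTransient) out.put(e.getKey(), e.getValue());
        }
        return out;
    }
}

final class MemorySketchDemo {
    static SketchMemory run() {
        SketchMemory memory = new SketchMemory();
        // "vote": AND-reduced, readable by workers, purged at the end.
        memory.register(new SketchKey<Boolean>("vote", Boolean::logicalAnd, true, true));
        // "edgeTotal" (hypothetical): summed, never read by workers, kept in the result.
        memory.register(new SketchKey<Long>("edgeTotal", Long::sum, false, false));
        memory.add("vote", true);
        memory.add("vote", false);
        memory.add("edgeTotal", 3L);
        memory.add("edgeTotal", 4L);
        return memory;
    }
}
```

In this toy model, a worker reading memory sees "vote" but not "edgeTotal", while the final result keeps "edgeTotal" and drops "vote". A real GraphComputer has to coordinate this across machines, which the sketch ignores entirely.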
        
#3 is the biggest boon. Gremlin OLTP and Gremlin OLAP can now execute all the 
same traversals -- save for the following exceptions:

        1. by()-modulators in OLAP cannot leave the local star graph. (as
           before)
        2. by()-modulators of path processors (e.g. path(), select()) can only
           touch element ids. (as before)
        3. A couple more traversals are currently not allowed because of
           semantics issues in OLTP (!), even though they are valid in OLAP :)

Now you can do complex, nested, multi-barrier, etc. OLAP traversals.

gremlin> g = TinkerFactory.createModern().traversal().withComputer() 
==>graphtraversalsource[tinkergraph[vertices:6 edges:6], tinkergraphcomputer]
gremlin> g.V().group().by(label).
                 select("person").unfold().
               groupCount().by(bothE().count()).
                 select(keys).sum(local) 
==>4
gremlin>

And, yes, this is one big TraversalVertexProgram!

gremlin> g.V().group().by(label).select("person").unfold().groupCount().by(bothE().count()).select(keys).sum(local).iterate().toString()
==>[TraversalVertexProgramStep([GraphStep([],vertex), 
GroupStep(label,[FoldStep]), SelectOneStep(person), UnfoldStep, 
GroupCountStep([VertexStep(BOTH,edge), CountGlobalStep]), LambdaMapStep(keys), 
SumLocalStep]), ComputerResultStep]
gremlin>

Even though we have multiple barriers -- group() and groupCount() -- 
TraversalVertexProgram is smart about how to converge barriers into "OLTP 
streams" and back again. It's actually all very clean and simple.
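
For a rough picture of what "converging at a barrier" means, here is a toy sketch in plain Java. It assumes nothing about TraversalVertexProgram's internals: BarrierSketch, its methods, and the two-way partitioning of labels are hypothetical and exist only for illustration. It shows just the parallel->sequential half of the handoff: each worker counts its own partition, and the master merges the partial counts into one result that it can then process locally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of a groupCount-style barrier (NOT TinkerPop code).
final class BarrierSketch {

    // Worker side: each partition counts its own labels independently.
    static Map<String, Long> partialCount(List<String> partition) {
        Map<String, Long> counts = new TreeMap<>();
        for (String label : partition) {
            counts.merge(label, 1L, Long::sum);
        }
        return counts;
    }

    // Master side: the barrier merges all partial counts into one map.
    static Map<String, Long> merge(List<Map<String, Long>> partials) {
        Map<String, Long> total = new TreeMap<>();
        for (Map<String, Long> partial : partials) {
            partial.forEach((label, count) -> total.merge(label, count, Long::sum));
        }
        return total;
    }

    static Map<String, Long> run() {
        // Made-up split of six vertex labels across two workers.
        List<List<String>> partitions = Arrays.asList(
                Arrays.asList("person", "person", "software"),
                Arrays.asList("person", "software", "person"));
        List<Map<String, Long>> partials = new ArrayList<>();
        for (List<String> partition : partitions) {
            partials.add(partialCount(partition)); // would run in parallel on workers
        }
        return merge(partials); // from here on, processing is local to the master
    }
}
```

Once the merged map exists on the master, any further steps that only touch that map can run sequentially; going back to the workers is needed only when the traversal touches graph elements again.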

Enjoy,
Marko.

http://markorodriguez.com

On Feb 18, 2016, at 3:57 PM, Marko Rodriguez <okramma...@gmail.com> wrote:

> Hi people,
> 
> Here is a ticket that I think we should strongly consider.
> 
>       https://issues.apache.org/jira/browse/TINKERPOP-1166 (in particular 
> read my last comment for a clean breakdown)
>       
> This would be an API breaking change for both users (who write 
> VertexPrograms) and providers (who have their own GraphComputer 
> implementation).
> 
> * If you are a user and don't have any VertexProgram implementations, this 
> will not affect you save for performance gains.
> * If you are a graph system provider that does not have a custom 
> GraphComputer (e.g. you rely on SparkGraphComputer), this will 
> not affect you either.
> 
> If you do write VertexPrograms, it will require you to go through your 
> VertexProgram and change all your memory.xxx() calls. Here are the stats on 
> the main VertexPrograms TinkerPop has:
>       PageRankVertexProgram -- 0 memory calls.
>       PeerPressureVertexProgram -- 3 memory calls. (could be 2 if I was smart 
> organizing my code)
>       TraversalVertexProgram -- 3 memory calls.
> Thus, it's not a big rewrite. It will simply be a matter of changing, for 
> example, memory.and("vote",true) to memory.add("vote",true) ... that's it. No 
> more incr(), sum(), etc. methods. You just .add().
> 
> If you do have a custom GraphComputer, it will require you to rewrite your 
> Memory implementation. The logic is basically the same (nothing you can't 
> already express with your system now), but the API will be different, though 
> with fewer methods required. Finally, I will add a 
> GraphComputerTest.shouldRespectTransientKeys() that will make sure transient 
> memory and compute keys are purged prior to returning the ComputerResult.
> 
> Please review the proposed changes and provide your feedback. I don't think 
> we will be able to make this a backwards-compatible change, so please think 
> hard.
> 
> Thanks,
> Marko.
> 
> http://markorodriguez.com
> 
