Hello,

This work has been completed and merged into master.
https://github.com/apache/incubator-tinkerpop/pull/243

This work was well worth it -- even though we have breaking changes to deal with.

GraphComputer providers (non-trivial changes):
    http://tinkerpop.apache.org/docs/3.2.0-SNAPSHOT/upgrade/#_graphcomputer_semantics_and_api
Users (trivial changes, if any):
    http://tinkerpop.apache.org/docs/3.2.0-SNAPSHOT/upgrade/#_vertexprogram_and_memorycomputekey_and_vertexcomputekey

Here is what we have gained:

1. MemoryComputeKeys and VertexComputeKeys can be transient.
    - e.g. no more EDGE_COUNT properties left on the vertices after executing PageRankTraversalVertexProgram.
2. MemoryComputeKeys can be set to NOT broadcast.
    - e.g. if the workers never need to read a memory value (only add to it), then broadcasting can be turned off.
    (A rough sketch of the new key declarations is included below, before the sign-off.)
3. Gremlin OLAP now fully supports OLTP->OLAP->OLTP->OLAP->etc.
    - When a barrier is reached (e.g. groupCount(), count(), sum(), etc.), processing becomes local to the master traversal.
    - When the master traversal starts to touch elements again (vertices/edges/properties), it sends the traversers back to the workers.
    - This process of parallel->sequential->parallel->... can go on indefinitely.

#3 is the biggest boon. Gremlin OLTP and Gremlin OLAP can now execute all the same traversals, save for the following exceptions:

1. by()-modulators in OLAP cannot leave the local star graph. (as before)
2. Path processors' (e.g. path(), select()) by()-modulators can only touch element ids. (as before)
3. There are a couple more that are valid in OLAP but currently not allowed because of semantics issues in OLTP(!). :)

Now you can do complex, nested, multi-barrier OLAP traversals:

gremlin> g = TinkerFactory.createModern().traversal().withComputer()
==>graphtraversalsource[tinkergraph[vertices:6 edges:6], tinkergraphcomputer]
gremlin> g.V().group().by(label).
              select("person").unfold().
              groupCount().by(bothE().count()).
              select(keys).sum(local)
==>4
gremlin>

And, yes, this is one big TraversalVertexProgram!

gremlin> g.V().group().by(label).select("person").unfold().groupCount().by(bothE().count()).select(keys).sum(local).iterate().toString()
==>[TraversalVertexProgramStep([GraphStep([],vertex), GroupStep(label,[FoldStep]), SelectOneStep(person), UnfoldStep, GroupCountStep([VertexStep(BOTH,edge), CountGlobalStep]), LambdaMapStep(keys), SumLocalStep]), ComputerResultStep]
gremlin>

Even though we have multiple barriers -- group() and groupCount() -- TraversalVertexProgram is smart about how to converge barriers into "OLTP streams" and back again. It's actually all very clean and simple.
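For VertexProgram authors, here is roughly what the new key declarations look like under the 3.2 API. This is only a sketch of a fragment of a VertexProgram (not a complete implementation), and the "edgeCount" and "vote" key names are purely illustrative:

    import java.util.HashSet;
    import java.util.Set;
    import org.apache.tinkerpop.gremlin.process.computer.MemoryComputeKey;
    import org.apache.tinkerpop.gremlin.process.computer.VertexComputeKey;
    import org.apache.tinkerpop.gremlin.process.traversal.Operator;

    // ... inside your VertexProgram implementation ...

    @Override
    public Set<VertexComputeKey> getVertexComputeKeys() {
        final Set<VertexComputeKey> keys = new HashSet<>();
        // isTransient = true: the "edgeCount" property is purged from the
        // vertices before the ComputerResult is returned (gain #1).
        keys.add(VertexComputeKey.of("edgeCount", true));
        return keys;
    }

    @Override
    public Set<MemoryComputeKey> getMemoryComputeKeys() {
        final Set<MemoryComputeKey> keys = new HashSet<>();
        // reducer = Operator.and; isBroadcast = false because the workers only
        // add to this value and never read it (gain #2); isTransient = true so
        // the key is dropped from the final Memory.
        keys.add(MemoryComputeKey.of("vote", Operator.and, false, true));
        return keys;
    }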
Enjoy,
Marko.

http://markorodriguez.com

On Feb 18, 2016, at 3:57 PM, Marko Rodriguez <okramma...@gmail.com> wrote:

> Hi people,
>
> Here is a ticket that I think we should strongly consider.
>
> https://issues.apache.org/jira/browse/TINKERPOP-1166 (in particular, read my last comment for a clean breakdown)
>
> This would be an API-breaking change for both users (who write VertexPrograms) and providers (who have their own GraphComputer implementation).
>
> * If you are a user and don't have any VertexProgram implementations, this will not affect you, save for performance gains.
> * If you are a graph system provider that does not have a custom GraphComputer (e.g. you rely on SparkGraphComputer), this will not affect you either.
>
> If you do write VertexPrograms, it will require you to go through your VertexProgram and change all your memory.xxx() calls. Here are the stats on the main VertexPrograms TinkerPop has:
>
> PageRankVertexProgram -- 0 memory calls.
> PeerPressureVertexProgram -- 3 memory calls. (could be 2 if I was smarter about organizing my code)
> TraversalVertexProgram -- 3 memory calls.
>
> Thus, it's not a big rewrite. It will simply be changing, for example, memory.and("vote",true) to memory.add("vote",true) -- that's it. No more incr(), sum(), etc. methods. You just .add().
>
> If you do have a custom GraphComputer, it will require you to rewrite your Memory implementation. The logic is basically the same (nothing you can't already express with your system now), but the API will be different, with fewer methods required. Finally, I will add a GraphComputerTest.shouldRespectTransientKeys() that will make sure transient memory and compute keys are purged prior to returning the ComputerResult.
>
> Please review the proposed changes and provide your feedback. I don't think we will be able to make this a backwards-compatible change, so please think hard.
>
> Thanks,
> Marko.
>
> http://markorodriguez.com
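For anyone doing the memory.xxx() rewrite described in the quoted note, here is a minimal sketch of what a worker-side execute() looks like under the merged API. The "vote" and "counter" keys are illustrative and assume matching MemoryComputeKey declarations with Operator.and and Operator.sum reducers; the Boolean message type is assumed only to make the fragment concrete:

    import org.apache.tinkerpop.gremlin.process.computer.Memory;
    import org.apache.tinkerpop.gremlin.process.computer.Messenger;
    import org.apache.tinkerpop.gremlin.structure.Vertex;

    @Override
    public void execute(final Vertex vertex, final Messenger<Boolean> messenger, final Memory memory) {
        final boolean voteToHalt = true;  // placeholder for whatever per-vertex condition you compute
        // 3.1 style: memory.and("vote", voteToHalt)
        memory.add("vote", voteToHalt);   // reduced across workers by the declared Operator.and
        // 3.1 style: memory.incr("counter", 1L)
        memory.add("counter", 1L);        // reduced across workers by the declared Operator.sum
    }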