Hello,

One of the big outstanding features in TinkerPop3 is OLAP-based graph 
mutations. That is:

        graph database --read--> graph processor --write--> graph database

Daniel Kuppitz has provided BulkLoaderVertexProgram with "from scratch" or 
incremental loading capabilities. It would be nice to be able to piggyback on 
this work for OLAP mutations. Here is an idea that borrows from the OLAP 
mutation model developed by Matthias Broecheler in Faunus.

1. We create a MutatingProgram interface.
2. That interface will have a collection of public static final String compute 
keys.
        - gremlin.mutatingProgram.mutation
        - gremlin.mutatingProgram.droppedProperties
        - gremlin.mutatingProgram.addedProperties
3. Any VertexProgram can implement MutatingProgram and in doing so, it will use 
the respective compute keys.
4. If that VertexProgram deletes a vertex, it does not actually delete the 
vertex; it simply adds the property "gremlin.mutatingProgram.mutation=dropped".
5. If that VertexProgram adds an edge, it adds the edge with the property 
"gremlin.mutatingProgram.mutation=added".
6. If that VertexProgram adds a vertex property, it adds the vertex property 
with the property "gremlin.mutatingProgram.mutation=added".
7. If that VertexProgram deletes a property, it records this on the element by 
adding the property "gremlin.mutatingProgram.droppedProperties=[x,y,z]".
8. ...
9. It is up to the VertexProgram to be smart about consistency on mutations:
        * If an edge is added, the next iteration should copy that edge to 
the incident edge sets of its incoming and outgoing vertices.
        * If an edge property is added, the next iteration should update that 
property on the copies of the edge held by its incoming and outgoing vertices.
        * We can provide various static helper methods in MutatingProgram to 
make this easy.
10. When the VertexProgram has completed its computation (terminated), 
BulkLoaderVertexProgram will be able to read the resultant graph and use the 
MutatingProgram compute keys as necessary to do the respective updates to the 
source graph (i.e. the graph database).
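
To make steps 1-7 concrete, here is a minimal sketch of what such an interface 
might look like. It is only an illustration, not a proposed final API: elements 
are modeled as plain java.util.Map instances so the snippet runs without 
TinkerPop, and the helper names (markDropped, markAdded, markPropertyDropped) 
and the ADDED/DROPPED constants are hypothetical. Only the three key strings 
come from the list above. Whether a dropped property's value should also be 
removed from the element is left open; this sketch leaves it in place.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: element properties are modeled as a plain Map so the
// example runs without TinkerPop on the classpath.
interface MutatingProgram {
    // Compute keys named in the proposal (implicitly public static final).
    String MUTATION = "gremlin.mutatingProgram.mutation";
    String DROPPED_PROPERTIES = "gremlin.mutatingProgram.droppedProperties";
    String ADDED_PROPERTIES = "gremlin.mutatingProgram.addedProperties";

    // Assumed tag values, taken from the examples in steps 4-6.
    String ADDED = "added";
    String DROPPED = "dropped";

    // Step 4: a "deleted" element is kept and tagged instead of removed.
    static void markDropped(Map<String, Object> element) {
        element.put(MUTATION, DROPPED);
    }

    // Steps 5-6: a newly created edge or vertex property carries the added tag.
    static void markAdded(Map<String, Object> element) {
        element.put(MUTATION, ADDED);
    }

    // Step 7: record the key of a dropped property on the owning element.
    // The property value itself is left untouched in this sketch.
    @SuppressWarnings("unchecked")
    static void markPropertyDropped(Map<String, Object> element, String key) {
        List<String> dropped = (List<String>) element.computeIfAbsent(
                DROPPED_PROPERTIES, k -> new ArrayList<String>());
        if (!dropped.contains(key)) {
            dropped.add(key);
        }
    }
}
```

A VertexProgram would call these helpers wherever it would otherwise mutate 
the graph directly, so that the resultant graph carries the full set of tags.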

---------------------

Problems:

        1. How do we deal with ID generation?
        2. How do we add vertices in OLAP?
        3. How do we deal with updates that were made to the graph database 
while the OLAP job was running?

@kuppitz -- would this notion of "mutation tags" be useful in 
BulkLoaderVertexProgram? I assume it would make your life much easier, as you 
would not have to compute a "diff" -- the diff is provided to you.
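
As a rough illustration of what "the diff is provided to you" could mean on the 
loading side, here is a small sketch. It is an assumption-laden toy, not 
BulkLoaderVertexProgram's API: the target graph is modeled as a map from id to 
property map, and the class and method names (MutationTagApplier, apply) are 
hypothetical. The semantics of addedProperties are not spelled out above, so 
the sketch only strips that key and otherwise ignores it.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: the target graph is modeled as id -> property map so
// the example runs without TinkerPop on the classpath.
final class MutationTagApplier {
    static final String MUTATION = "gremlin.mutatingProgram.mutation";
    static final String DROPPED_PROPERTIES = "gremlin.mutatingProgram.droppedProperties";
    static final String ADDED_PROPERTIES = "gremlin.mutatingProgram.addedProperties";

    private MutationTagApplier() {}

    // Applies one tagged element from the OLAP result to the target store.
    static void apply(Map<Object, Map<String, Object>> target,
                      Object id, Map<String, Object> tagged) {
        Object mutation = tagged.get(MUTATION);
        if ("dropped".equals(mutation)) {
            // The element was only tagged during OLAP; delete it for real now.
            target.remove(id);
            return;
        }
        if ("added".equals(mutation)) {
            // Write the new element without the bookkeeping keys.
            Map<String, Object> clean = new HashMap<>(tagged);
            clean.remove(MUTATION);
            clean.remove(DROPPED_PROPERTIES);
            clean.remove(ADDED_PROPERTIES);
            target.put(id, clean);
            return;
        }
        // Existing element: remove only the properties listed as dropped.
        Map<String, Object> existing = target.get(id);
        Object dropped = tagged.get(DROPPED_PROPERTIES);
        if (existing != null && dropped instanceof List) {
            for (Object key : (List<?>) dropped) {
                existing.remove(key);
            }
        }
    }
}
```

Untagged elements are skipped entirely in this sketch, which is the point of 
the tags: the loader never compares the OLAP result against the source graph.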

Thoughts on this matter would be much appreciated.

Thank you,
Marko.

http://markorodriguez.com
