[jira] [Comment Edited] (TINKERPOP3-373) GraphComputer property persistence options

Marko A. Rodriguez (JIRA) Wed, 25 Mar 2015 10:53:27 -0700

    [ 
https://issues.apache.org/jira/browse/TINKERPOP3-373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14380344#comment-14380344
 ]


Marko A. Rodriguez edited comment on TINKERPOP3-373 at 3/25/15 5:52 PM:
------------------------------------------------------------------------

I've come up with two persistence enums for {{GraphComputer}}. I have 
implemented them for Giraph and Spark, and they make sense thus far. I think 
the only thing that might need to be added to {{Persist}} is {{VIEW}}, so that 
the result is not written to the original/new graph but provided as a view 
over that graph.

{code:title=GraphComputer.java|borderStyle=solid}
    public enum ResultGraph {
        /**
         * When the computation is complete, the {@link org.apache.tinkerpop.gremlin.structure.Graph}
         * in {@link ComputerResult} is the original graph that spawned the graph computer.
         */
        ORIGINAL_GRAPH,
        /**
         * When the computation is complete, the {@link org.apache.tinkerpop.gremlin.structure.Graph}
         * in {@link ComputerResult} is a new graph cloned from the original graph.
         */
        NEW_GRAPH
    }

    public enum Persist {
        /**
         * Write nothing to the declared {@link ResultGraph}.
         */
        NOTHING,
        /**
         * Write vertex and vertex properties back to the {@link ResultGraph}.
         */
        VERTEX_PROPERTIES,
        /**
         * Write vertex, vertex properties, and edges back to the {@link ResultGraph}.
         */
        EDGES
    }
{code}
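
As a rough illustration of how the two enums could combine, here is a minimal, self-contained sketch. It is not TinkerPop code: the class {{PersistSketch}} and its helper methods are hypothetical, and it assumes each {{Persist}} level includes everything written by the level before it, as the Javadoc above suggests.

```java
// Hypothetical sketch, not part of TinkerPop. Only the enum constants
// mirror the proposal; the class and helper methods are made up here.
class PersistSketch {

    enum ResultGraph { ORIGINAL_GRAPH, NEW_GRAPH }

    enum Persist { NOTHING, VERTEX_PROPERTIES, EDGES }

    // Assumption: every level except NOTHING writes vertices and their properties.
    static boolean writesVertexProperties(Persist persist) {
        return persist != Persist.NOTHING;
    }

    // Assumption: only EDGES additionally writes edges.
    static boolean writesEdges(Persist persist) {
        return persist == Persist.EDGES;
    }

    // Describes in words what a graph computer would do for one combination.
    static String describe(ResultGraph resultGraph, Persist persist) {
        String target = resultGraph == ResultGraph.ORIGINAL_GRAPH
                ? "the original graph"
                : "a new graph cloned from the original";
        if (!writesVertexProperties(persist)) {
            return "return " + target + " and write nothing";
        }
        String what = writesEdges(persist)
                ? "vertices, vertex properties, and edges"
                : "vertices and vertex properties";
        return "return " + target + " and write " + what;
    }

    public static void main(String[] args) {
        // Print all six combinations of the two enums.
        for (ResultGraph resultGraph : ResultGraph.values()) {
            for (Persist persist : Persist.values()) {
                System.out.println(resultGraph + " + " + persist + ": "
                        + describe(resultGraph, persist));
            }
        }
    }
}
```

Keeping the two choices as separate enums means "where the result lives" and "how much is written" can vary independently, which gives six combinations without a dedicated constant for each.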




was (Author: okram):
I've come up with two persistence enums for `GraphComputer`. I have implemented 
them for Giraph and Spark, and they make sense thus far. I think the only thing 
that might need to be added to `Persist` is `VIEW`, so that the result is not 
written to the original/new graph but provided as a view over that graph.

```
    public enum ResultGraph {
        /**
         * When the computation is complete, the {@link org.apache.tinkerpop.gremlin.structure.Graph}
         * in {@link ComputerResult} is the original graph that spawned the graph computer.
         */
        ORIGINAL_GRAPH,
        /**
         * When the computation is complete, the {@link org.apache.tinkerpop.gremlin.structure.Graph}
         * in {@link ComputerResult} is a new graph cloned from the original graph.
         */
        NEW_GRAPH
    }

    public enum Persist {
        /**
         * Write nothing to the declared {@link ResultGraph}.
         */
        NOTHING,
        /**
         * Write vertex and vertex properties back to the {@link ResultGraph}.
         */
        VERTEX_PROPERTIES,
        /**
         * Write vertex, vertex properties, and edges back to the {@link ResultGraph}.
         */
        EDGES
    }
```



> GraphComputer property persistence options
> ------------------------------------------
>
>                 Key: TINKERPOP3-373
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP3-373
>             Project: TinkerPop 3
>          Issue Type: Improvement
>          Components: process
>            Reporter: Matthias Broecheler
>            Assignee: Marko A. Rodriguez
>
> I noticed that the implicit assumption is that element compute keys (and the 
> values associated with them during the vertex computer iterations) are 
> persisted back into the graph after termination.
> Two thoughts on this:
> 1) There should be an option to disable this. For instance, one might want to 
> run PageRank and then run a map-reduce job to determine the 10 highest 
> ranked vertices. If all PR values had to be written back into the graph, that 
> would become prohibitively expensive on large graphs.
> 2) It should be possible to define which subset of the elementComputeKeys one 
> wants to persist back into the graph. For instance, for PR one typically only 
> wants the PR value and not the edge-count.
> ===> Vertex.getElementComputeKeys() should return a Map<String,Boolean> where 
> the boolean value indicates whether the value should be persisted back into 
> the graph. If all are false, then nothing is written back as required in (1).
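
The {{Map<String,Boolean>}} idea in the description above could look roughly like the following sketch. Everything here is hypothetical: the class {{ComputeKeySketch}}, the key names {{pageRank}} and {{edgeCount}}, and both methods are assumptions for illustration, not existing TinkerPop API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of the proposal in the issue description.
class ComputeKeySketch {

    // Stand-in for the proposed return value: compute key -> "persist it?".
    // The key names are invented for this example.
    static Map<String, Boolean> pageRankComputeKeys() {
        Map<String, Boolean> keys = new LinkedHashMap<>();
        keys.put("pageRank", true);    // wanted in the graph after the run
        keys.put("edgeCount", false);  // only needed while iterating
        return keys;
    }

    // Selects the keys whose flag is true. An all-false map yields an empty
    // set, which corresponds to writing nothing back, as in point (1).
    static Set<String> keysToPersist(Map<String, Boolean> computeKeys) {
        Set<String> persisted = new TreeSet<>();
        for (Map.Entry<String, Boolean> entry : computeKeys.entrySet()) {
            if (Boolean.TRUE.equals(entry.getValue())) {
                persisted.add(entry.getKey());
            }
        }
        return persisted;
    }

    public static void main(String[] args) {
        System.out.println(keysToPersist(pageRankComputeKeys())); // prints [pageRank]
    }
}
```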



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
