[ https://issues.apache.org/jira/browse/GIRAPH-83?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13150094#comment-13150094 ]
Jakob Homan commented on GIRAPH-83:
-----------------------------------

Looking at the original Pregel paper, the Vertex instance has eight methods (compute, vertex_id, superstep, GetValue, MutableValue, GetOutEdgeIterator, SendMessageTo and VoteToHalt). Currently, BasicVertex has 24. There are also three different types of vertices (Vertex, MutableVertex and BasicVertex), linked via inheritance and exposed to users. I'm wondering if this interface is quite right yet.

There are two main concerns. First, this is the contract users are starting to write applications against, and we'll need to support it for a long time with as few tweaks as possible. It'd be good to be relatively sure of its limits before we make an initial release. Second, the use of inheritance to join the user's implementation with the computation's state makes it difficult to test. How does one mock out the state that's fed into compute and verify compute's result without starting up a cluster (either real or local; see GIRAPH-51)?

Would it be reasonable to strip out as many methods as possible from Vertex, particularly those dealing with state external to the Vertex itself:

* getSuperStep
* getNumVertices
* getNumEdges
* getMsgList/iterator
* getEdgeValue
* hasEdge
* sendMsg
* sendMsgToAllEdges
* (g|s)etGraphState
* getContext
* getWorkerContext
* registerAggregator
* useAggregator

The outEdges data structures are a bit odd in that they are intrinsic to the vertex itself (in the mathematical sense), but are managed by the framework. It might be a bit clunky, but it would be structurally more correct to separate these out as well.

These methods and the state they manipulate could then be passed in as a Context (a new type of Context, not one of the two others we have running around!) to the compute method. This moves compute() closer to a functional, testable model of computing across its input state (which can be mocked out for testing and mangled as we evolve its innards).
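As a rough sketch of what the Context idea above might look like, assuming nothing about how Giraph would actually implement it: all type and method names here (ComputeContext, SlimVertex, MaxValueVertex, RecordingContext) are hypothetical and are not existing Giraph API. The point is only that compute() receives its external state as arguments, so a test can hand it a recording stand-in instead of a cluster.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch only: every type below is hypothetical, not part of Giraph.
class ContextSketch {

    /** Hypothetical: state external to the vertex, handed to compute(). */
    interface ComputeContext {
        long getSuperstep();
        Iterable<Long> getOutEdgeTargets();
        void sendMessageTo(long targetVertexId, double message);
        void voteToHalt();
    }

    /** Hypothetical slimmed-down vertex that owns only its id and value. */
    abstract static class SlimVertex {
        private final long id;
        private double value;

        SlimVertex(long id, double value) {
            this.id = id;
            this.value = value;
        }

        long getId() { return id; }
        double getValue() { return value; }
        void setValue(double value) { this.value = value; }

        abstract void compute(Iterable<Double> messages, ComputeContext context);
    }

    /** Example user code: adopt and forward the largest value seen so far. */
    static class MaxValueVertex extends SlimVertex {
        MaxValueVertex(long id, double value) { super(id, value); }

        @Override
        void compute(Iterable<Double> messages, ComputeContext context) {
            // Always announce the initial value in superstep 0.
            boolean changed = context.getSuperstep() == 0;
            for (double message : messages) {
                if (message > getValue()) {
                    setValue(message);
                    changed = true;
                }
            }
            if (changed) {
                for (long target : context.getOutEdgeTargets()) {
                    context.sendMessageTo(target, getValue());
                }
            }
            context.voteToHalt();
        }
    }

    /** Hand-written test double that records what compute() asked for. */
    static class RecordingContext implements ComputeContext {
        final long superstep;
        final List<Long> targets;
        final List<Long> sentTo = new ArrayList<>();
        final List<Double> sentValues = new ArrayList<>();
        boolean halted = false;

        RecordingContext(long superstep, List<Long> targets) {
            this.superstep = superstep;
            this.targets = targets;
        }

        @Override public long getSuperstep() { return superstep; }
        @Override public Iterable<Long> getOutEdgeTargets() { return targets; }
        @Override public void sendMessageTo(long targetVertexId, double message) {
            sentTo.add(targetVertexId);
            sentValues.add(message);
        }
        @Override public void voteToHalt() { halted = true; }
    }

    public static void main(String[] args) {
        // Drive one compute() call with canned input; no cluster involved.
        RecordingContext context = new RecordingContext(1, Arrays.asList(2L, 3L));
        MaxValueVertex vertex = new MaxValueVertex(1L, 3.0);
        vertex.compute(Arrays.asList(5.0, 1.0), context);
        System.out.println("value=" + vertex.getValue()
                + " sentTo=" + context.sentTo
                + " halted=" + context.halted);
    }
}
```

With this shape, a unit test builds a RecordingContext, calls compute() once, and checks the recorded sends and the halt vote directly. A mocking library could stand in for the hand-written double just as well.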
The Vertex itself could of course still maintain any state it needs, but like a Mapper, it shouldn't need much and would be discouraged from holding onto large amounts of data between computations.

Thoughts?

> Is Vertex correct yet?
> ----------------------
>
>                 Key: GIRAPH-83
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-83
>             Project: Giraph
>          Issue Type: Improvement
>            Reporter: Jakob Homan
>
> I'm seeing a number of people run into oddities with Vertex and am thinking
> we may not have it quite correct yet...