[ https://issues.apache.org/jira/browse/GIRAPH-244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alessandro Presta updated GIRAPH-244:
-------------------------------------

    Attachment: GIRAPH-244.patch

> Vertex API redesign
> -------------------
>
>                 Key: GIRAPH-244
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-244
>             Project: Giraph
>          Issue Type: Improvement
>            Reporter: Alessandro Presta
>            Assignee: Alessandro Presta
>         Attachments: GIRAPH-244.patch, GIRAPH-244.patch, GIRAPH-244.patch,
> GIRAPH-244.patch, GIRAPH-244.patch, GIRAPH-244.patch, GIRAPH-244.patch
>
>
> This is an effort to rationalize the Giraph API. I've put together a few
> issues that we've talked about lately. I'm focusing on making Giraph
> development even more intuitive and less error-prone, and fixing a few
> potential sources of bugs.
> I'm sorry this is a big patch, but most of those issues are intertwined and I
> think this might be easier to review and integrate.
> Here's an account of the changes:
>
> Vertex API:
> - Renamed BasicVertex to Vertex (as I understand it, we used to have both and
> then Vertex was removed).
> - Switched to Iterables instead of Iterators for both edges and messages.
> This makes code more concise for both implementors (no need to call
> .iterator() on collections) and users (can use foreach syntax). See also
> GIRAPH-221.
> - Added SimpleVertex and SimpleMutableVertex classes, where there are no edge
> values and the iterable to be implemented is getNeighbors(). We don’t have
> multiple inheritance, so the only way I could think of was to have
> SimpleVertex extend Vertex, SimpleMutableVertex extend MutableVertex, and
> duplicate the code for the edges iterables.
> Also, due to type erasure, one still has to deal with Edge objects in
> SimpleMutableVertex#initialize. Overall I think this is still an improvement
> over the current situation.
> - Added id and value fields to the base Vertex class. All other classes were
> either writing the same boilerplate again and again, or using primitive
> fields and then creating Writables on the fly (inefficient; there was even a
> TODO about that). If there are any actually useful customizations here, I’ve
> yet to see them.
> Also removed the redundant “Vertex” from getters/setters (compare
> vertex.getId() with vertex.getVertexId()).
> - Made halt a private field, and added a wakeUp() method to re-activate a
> vertex. isHalted()/voteToHalt()/wakeUp() are just more semantically-charged
> getters/setters.
> - Renamed the number of vertices/edges in the graph to getTotalNum*. The
> previous naming (getNumEdges) was arguably confusing. If this one sucks too,
> please suggest a better one.
> - Default implementations of hasEdge(), getEdgeValue(), getNumEdges(),
> readFields(), write(), toString(): the implementor can still optimize when
> there is a good opportunity. Currently we are duplicating a lot of code (see
> GIRAPH-238) and potentially introducing bugs (see GIRAPH-239).
>
> HashMapVertex:
> - Switched representation from Map<I, Edge<I, E>> to Map<I, E> (GIRAPH-242).
> - Only override methods that can be optimized.
>
> EdgeListVertex:
> - Switched representation from two sorted lists to one list of Edge<I, E>
> (see GIRAPH-243). Mainly this makes iteration over edges (target id and
> value) linear instead of O(n log n). Mutations are still slow and should
> generally be discouraged.
> - Only override methods that can be optimized.
>
> Small nits:
> - Our code conventions say we should try to avoid abbreviations, so I
> eliminated a few (req -> request, msg -> message).
> - Consistently refer to the endpoint of an edge as targetVertex (before we
> had a mix of destVertex and targetVertex).
> - You will notice some rearranged imports. That’s just my IDE trying to be
> helpful (see GIRAPH-230).
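
For illustration only, the shape of the changes described in the issue (id and value held by the base class, a private halt flag behind voteToHalt()/wakeUp(), an Iterable of neighbors, and a Map<I, E> edge representation) might look roughly like the sketch below. This is not the code in GIRAPH-244.patch: SketchVertex and SketchVertexDemo are hypothetical names, plain Java types stand in for Giraph's Writable types, and addEdge() and sumEdgeValues() are assumptions made so the example is self-contained.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch only: NOT the Vertex class from GIRAPH-244.patch.
// Plain Java types stand in for Giraph's Writable id/value/edge types.
class SketchVertex<I, V, E> {
    private final I id;     // id stored once in the base class
    private V value;        // value stored once in the base class
    private boolean halt;   // private; changed only via voteToHalt()/wakeUp()
    private final Map<I, E> edges = new HashMap<>();  // target id -> edge value

    SketchVertex(I id, V value) {
        this.id = id;
        this.value = value;
    }

    I getId() { return id; }
    V getValue() { return value; }
    void setValue(V value) { this.value = value; }

    // Assumed mutation helper, added only so the sketch can be exercised.
    void addEdge(I targetVertexId, E edgeValue) {
        edges.put(targetVertexId, edgeValue);
    }

    // Returning an Iterable (not an Iterator) lets callers use for-each.
    Iterable<I> getNeighbors() {
        return Collections.unmodifiableSet(edges.keySet());
    }

    // With Map<I, E>, these are direct map operations.
    boolean hasEdge(I targetVertexId) { return edges.containsKey(targetVertexId); }
    E getEdgeValue(I targetVertexId) { return edges.get(targetVertexId); }
    int getNumEdges() { return edges.size(); }

    void voteToHalt() { halt = true; }
    void wakeUp() { halt = false; }
    boolean isHalted() { return halt; }
}

class SketchVertexDemo {
    // Sums edge values by iterating over neighbors with for-each.
    static double sumEdgeValues(SketchVertex<Long, Double, Double> vertex) {
        double sum = 0.0;
        for (Long targetVertexId : vertex.getNeighbors()) {
            sum += vertex.getEdgeValue(targetVertexId);
        }
        return sum;
    }

    public static void main(String[] args) {
        SketchVertex<Long, Double, Double> vertex = new SketchVertex<>(1L, 0.0);
        vertex.addEdge(2L, 1.5);
        vertex.addEdge(3L, 2.5);
        vertex.setValue(sumEdgeValues(vertex));
        vertex.voteToHalt();
        System.out.println("id=" + vertex.getId() + " value=" + vertex.getValue()
            + " halted=" + vertex.isHalted());
    }
}
```

The for-each loop in sumEdgeValues() is the kind of call site the switch to Iterables is meant to simplify. A real implementation would also need the serialization methods (readFields(), write()) that the issue mentions, which are omitted here.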