[ https://issues.apache.org/jira/browse/GIRAPH-873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13957243#comment-13957243 ]
Pavan Kumar commented on GIRAPH-873:
------------------------------------

In the diff you can remove the duplication by using delegation. For an example, see GIRAPH-840, where ByteCounter was split into InBoundByteCounter and OutBoundByteCounter through ByteCounterDelegate. That is, take the methods that are exactly the same except for LongWritable versus IntWritable, move them into another class with a generic type like <? extends Writable>, and have the respective classes create specific delegate instances for LongWritable or IntWritable.

But great work, thanks.

> Specialized edge stores
> -----------------------
>
>                 Key: GIRAPH-873
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-873
>             Project: Giraph
>          Issue Type: Improvement
>    Affects Versions: 1.1.0
>            Reporter: Craig Muchinsky
>             Fix For: 1.1.0
>
>         Attachments: GIRAPH-873-2.patch, GIRAPH-873.patch
>
>
> While doing some performance tuning I discovered that loading the edge store
> can be a very expensive operation. Similar to GIRAPH-704, the use of
> primitive maps can provide a significant performance benefit. Part of the
> benefit comes from the lower memory overhead of the primitive maps; however,
> the larger benefit comes from the fact that you don't have to release and
> reconstruct the vertexId object every time a new vertex is encountered.
> When processing a large graph with 4B vertices and 5B edges (3B of the edges
> loaded via EdgeInputFormat), the worker edge requests were taking ~15 seconds
> each, but after implementing the above suggestions that number dropped to
> sub-second.
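
The delegation suggested in the comment above could be sketched roughly as follows. This is only an illustration under stated assumptions, not code from either patch or from GIRAPH-840: EdgeStoreDelegate, LongIdEdgeStore and IntIdEdgeStore are hypothetical names, and boxed Long/Integer keys in a java.util.HashMap stand in for LongWritable/IntWritable and for the primitive maps, so that the sketch compiles without Hadoop or Giraph on the classpath.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical delegate: holds the logic that is identical for every id type. */
final class EdgeStoreDelegate<I, E> {
  // Assumption: a plain HashMap stands in for the real (primitive) map.
  private final Map<I, List<E>> edgesBySource = new HashMap<>();

  void addEdge(I sourceId, E edge) {
    edgesBySource.computeIfAbsent(sourceId, id -> new ArrayList<>()).add(edge);
  }

  int edgeCount(I sourceId) {
    List<E> edges = edgesBySource.get(sourceId);
    return edges == null ? 0 : edges.size();
  }

  int vertexCount() {
    return edgesBySource.size();
  }
}

/** Hypothetical long-id store: only the id-specific signatures live here. */
final class LongIdEdgeStore {
  private final EdgeStoreDelegate<Long, String> delegate = new EdgeStoreDelegate<>();

  void addEdge(long sourceId, String edge) { delegate.addEdge(sourceId, edge); }
  int edgeCount(long sourceId) { return delegate.edgeCount(sourceId); }
  int vertexCount() { return delegate.vertexCount(); }
}

/** Hypothetical int-id store: same shape, different id type, no copied logic. */
final class IntIdEdgeStore {
  private final EdgeStoreDelegate<Integer, String> delegate = new EdgeStoreDelegate<>();

  void addEdge(int sourceId, String edge) { delegate.addEdge(sourceId, edge); }
  int edgeCount(int sourceId) { return delegate.edgeCount(sourceId); }
  int vertexCount() { return delegate.vertexCount(); }
}

final class EdgeStoreDelegationDemo {
  public static void main(String[] args) {
    LongIdEdgeStore store = new LongIdEdgeStore();
    store.addEdge(1L, "1->2");
    store.addEdge(1L, "1->3");
    store.addEdge(2L, "2->3");
    System.out.println(store.edgeCount(1L) + " edges from vertex 1, "
        + store.vertexCount() + " source vertices");
  }
}
```

The two id-specific classes keep their own method signatures, while the shared bookkeeping is written once in the delegate. A real version would presumably bound the type parameter by Writable and keep the primitive maps in the specialized classes, so the performance benefit described in the issue is not lost.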