[ https://issues.apache.org/jira/browse/GIRAPH-873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pavan Kumar updated GIRAPH-873:
-------------------------------
    Attachment: GIRAPH-873_refactor.patch

Here's the refactored code. Testing done:
- ran PageRank on a cluster successfully
- mvn clean verify on giraph-core

> Specialized edge stores
> -----------------------
>
>                 Key: GIRAPH-873
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-873
>             Project: Giraph
>          Issue Type: Improvement
>    Affects Versions: 1.1.0
>            Reporter: Craig Muchinsky
>            Assignee: Pavan Kumar
>             Fix For: 1.1.0
>
>         Attachments: GIRAPH-873-2.patch, GIRAPH-873.patch,
>                      GIRAPH-873_refactor.patch
>
>
> While doing some performance tuning I discovered that loading the edge store
> can be a very expensive operation. Similar to GIRAPH-704, the use of
> primitive maps can provide a significant performance benefit. Part of the
> benefit comes from the lower memory overhead of primitive maps; however, the
> larger benefit comes from not having to release and reconstruct the vertexId
> object every time a new vertex is encountered.
> When processing a large graph with 4B vertices and 5B edges (3B of the edges
> loaded via EdgeInputFormat), the worker edge requests were taking ~15 seconds
> each, but after implementing the above suggestions that number dropped to
> sub-second.
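
To illustrate the idea in the description, here is a minimal sketch of an edge store keyed by primitive long vertex ids. It is not the code from the attached patches: the class name PrimitiveEdgeStoreSketch and all of its methods are hypothetical, it assumes vertex ids and edge targets are plain longs with no edge values, and it uses a hand-rolled open-addressing table so that it has no dependencies. The point it shows is that adding an edge never allocates a vertex id object, which is the saving the description attributes to primitive maps.

```java
import java.util.Arrays;

/** Hypothetical sketch: out-edges stored per source vertex, keyed by primitive long ids. */
final class PrimitiveEdgeStoreSketch {
    private long[] keys;       // source vertex ids, valid only where used[i] is true
    private long[][] targets;  // targets[i] holds the out-edge target ids for keys[i]
    private int[] counts;      // number of valid entries in targets[i]
    private boolean[] used;    // slot occupancy flags
    private int size;          // number of distinct source vertices

    PrimitiveEdgeStoreSketch(int expectedVertices) {
        int capacity = 16;
        // Power-of-two capacity, at least twice the expected number of vertices.
        while (capacity < expectedVertices * 2L && capacity < (1 << 30)) {
            capacity <<= 1;
        }
        allocate(capacity);
    }

    private void allocate(int capacity) {
        keys = new long[capacity];
        targets = new long[capacity][];
        counts = new int[capacity];
        used = new boolean[capacity];
        size = 0;
    }

    /** Linear probing: returns the slot holding vertexId, or the free slot where it belongs. */
    private int slotFor(long vertexId) {
        int mask = keys.length - 1;
        long h = vertexId * 0x9E3779B97F4A7C15L;  // multiplicative hash to spread the bits
        int slot = (int) (h >>> 32) & mask;
        while (used[slot] && keys[slot] != vertexId) {
            slot = (slot + 1) & mask;
        }
        return slot;
    }

    /** Adds an edge without boxing or constructing any vertex id object. */
    void addEdge(long sourceId, long targetId) {
        if (size * 2 >= keys.length) {
            grow();  // keep the table at most half full so probing always terminates
        }
        int slot = slotFor(sourceId);
        if (!used[slot]) {
            used[slot] = true;
            keys[slot] = sourceId;
            targets[slot] = new long[4];
            size++;
        } else if (counts[slot] == targets[slot].length) {
            targets[slot] = Arrays.copyOf(targets[slot], counts[slot] * 2);
        }
        targets[slot][counts[slot]++] = targetId;
    }

    private void grow() {
        long[] oldKeys = keys;
        long[][] oldTargets = targets;
        int[] oldCounts = counts;
        boolean[] oldUsed = used;
        allocate(oldKeys.length * 2);
        for (int i = 0; i < oldKeys.length; i++) {
            if (!oldUsed[i]) {
                continue;
            }
            int slot = slotFor(oldKeys[i]);
            used[slot] = true;
            keys[slot] = oldKeys[i];
            targets[slot] = oldTargets[i];  // per-vertex edge arrays are moved, not copied
            counts[slot] = oldCounts[i];
            size++;
        }
    }

    /** Returns a copy of the target ids for sourceId, or an empty array if it has no edges. */
    long[] edgesOf(long sourceId) {
        int slot = slotFor(sourceId);
        return used[slot] ? Arrays.copyOf(targets[slot], counts[slot]) : new long[0];
    }

    int vertexCount() {
        return size;
    }
}
```

Usage would look like store.addEdge(5L, 6L) followed by store.edgesOf(5L). A production version would more likely build on an existing primitive collections library than on a hand-written table; the table here only keeps the sketch self-contained.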