[
https://issues.apache.org/jira/browse/GIRAPH-873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Pavan Kumar updated GIRAPH-873:
-------------------------------
Attachment: GIRAPH-873_refactor.patch
Here's the refactored code. Testing so far:
- ran PageRank on a cluster successfully
- ran mvn clean verify on giraph-core
> Specialized edge stores
> -----------------------
>
> Key: GIRAPH-873
> URL: https://issues.apache.org/jira/browse/GIRAPH-873
> Project: Giraph
> Issue Type: Improvement
> Affects Versions: 1.1.0
> Reporter: Craig Muchinsky
> Assignee: Pavan Kumar
> Fix For: 1.1.0
>
> Attachments: GIRAPH-873-2.patch, GIRAPH-873.patch,
> GIRAPH-873_refactor.patch
>
>
> While doing some performance tuning, I discovered that loading the edge store
> can be a very expensive operation. Similar to GIRAPH-704, the use of
> primitive maps can provide a significant performance benefit. Part of the
> benefit comes from the lower memory overhead of primitive maps; however, the
> larger benefit comes from not having to release and reconstruct the vertexId
> object every time a new vertex is encountered.
> When processing a large graph with 4B vertices and 5B edges (3B of the edges
> loaded via EdgeInputFormat), the worker edge requests were taking ~15 seconds
> each, but after implementing the above suggestions that number dropped to
> under a second.
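
To illustrate the idea in the description, here is a minimal sketch of an edge store keyed by a primitive long vertex id. It is not the code in the attached patches: the class name LongKeyedEdgeStore and all of its methods are hypothetical, and it uses a small hand-written open-addressing table in place of a primitive-map library so that it is self-contained. The point it tries to show is that adding an edge never allocates or reconstructs a vertex id object; the id is read and stored as a plain long.

```java
import java.util.Arrays;

/**
 * Hypothetical sketch, not Giraph's actual edge store: maps a primitive long
 * source vertex id to a growable array of long target ids via open addressing.
 */
final class LongKeyedEdgeStore {
  private long[] keys;       // source vertex ids, valid where targets[slot] != null
  private long[][] targets;  // per-slot target ids; null marks an empty slot
  private int[] counts;      // number of valid entries in targets[slot]
  private int size;          // number of distinct source vertices stored

  LongKeyedEdgeStore(int expectedVertices) {
    int capacity = 16;
    // Keep the load factor at or below 0.5 so linear probing always terminates.
    while (capacity < expectedVertices * 2L && capacity < (1 << 30)) {
      capacity <<= 1;
    }
    keys = new long[capacity];
    targets = new long[capacity][];
    counts = new int[capacity];
  }

  /** Adds an edge; no key object is created for sourceId. */
  void addEdge(long sourceId, long targetId) {
    if ((size + 1) * 2 > keys.length) {
      resize();
    }
    int slot = findSlot(sourceId);
    if (targets[slot] == null) {
      // First edge seen for this vertex: claim the slot.
      keys[slot] = sourceId;
      targets[slot] = new long[4];
      size++;
    } else if (counts[slot] == targets[slot].length) {
      // Target array is full: double it.
      targets[slot] = Arrays.copyOf(targets[slot], counts[slot] * 2);
    }
    targets[slot][counts[slot]++] = targetId;
  }

  int edgeCount(long sourceId) {
    int slot = findSlot(sourceId);
    return targets[slot] == null ? 0 : counts[slot];
  }

  /** Returns a copy of the target ids stored for sourceId (empty if none). */
  long[] edgesOf(long sourceId) {
    int slot = findSlot(sourceId);
    return targets[slot] == null
        ? new long[0]
        : Arrays.copyOf(targets[slot], counts[slot]);
  }

  int vertexCount() {
    return size;
  }

  /** Linear probing: returns the slot holding key, or the empty slot for it. */
  private int findSlot(long key) {
    int mask = keys.length - 1;
    int slot = mix(key) & mask;
    while (targets[slot] != null && keys[slot] != key) {
      slot = (slot + 1) & mask;
    }
    return slot;
  }

  /** Spreads the bits of the key so sequential ids do not cluster. */
  private static int mix(long key) {
    long h = key * 0x9E3779B97F4A7C15L;
    return (int) (h ^ (h >>> 32));
  }

  /** Doubles the table and reinserts every occupied slot. */
  private void resize() {
    long[] oldKeys = keys;
    long[][] oldTargets = targets;
    int[] oldCounts = counts;
    int newCapacity = oldKeys.length * 2;
    keys = new long[newCapacity];
    targets = new long[newCapacity][];
    counts = new int[newCapacity];
    for (int i = 0; i < oldKeys.length; i++) {
      if (oldTargets[i] != null) {
        int slot = findSlot(oldKeys[i]);
        keys[slot] = oldKeys[i];
        targets[slot] = oldTargets[i];
        counts[slot] = oldCounts[i];
      }
    }
  }
}
```

A store keyed by boxed or Writable ids would instead have to create or copy a key object whenever a new vertex id shows up in an edge request, which is the cost the description attributes most of the slowdown to. In this sketch the caller would simply invoke addEdge(sourceId, targetId) for each incoming edge.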