[ 
https://issues.apache.org/jira/browse/GIRAPH-873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavan Kumar updated GIRAPH-873:
-------------------------------

    Attachment: GIRAPH-873_refactor.patch

Here's the refactored code. I ran PageRank on a cluster successfully
and ran mvn clean verify on giraph-core.


> Specialized edge stores
> -----------------------
>
>                 Key: GIRAPH-873
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-873
>             Project: Giraph
>          Issue Type: Improvement
>    Affects Versions: 1.1.0
>            Reporter: Craig Muchinsky
>            Assignee: Pavan Kumar
>             Fix For: 1.1.0
>
>         Attachments: GIRAPH-873-2.patch, GIRAPH-873.patch, 
> GIRAPH-873_refactor.patch
>
>
> While doing some performance tuning I discovered that loading the edge store
> can be a very expensive operation. Similar to GIRAPH-704, the use of
> primitive maps can provide a significant performance benefit. Part of the
> benefit comes from the lower memory overhead of primitive maps, but the
> larger benefit comes from not having to release and reconstruct the
> vertexId object every time a new vertex is encountered.
> When processing a large graph with 4B vertices and 5B edges (3B of the edges
> loaded via EdgeInputFormat), the worker edge requests were taking ~15 seconds
> each, but after implementing the above suggestions that number dropped to
> sub-second.
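
For readers unfamiliar with the idea in the description above, here is a
minimal sketch of an edge store keyed by primitive long vertex ids. It is
not taken from the attached patches: the class name LongKeyedEdgeStore, its
methods, and the open-addressing layout are all hypothetical, and it assumes
vertex ids and edge targets are plain longs with no edge values. The point
it illustrates is that lookups and inserts compare raw long keys, so no
vertexId object has to be created per incoming edge.

```java
import java.util.Arrays;

/** Hypothetical sketch: out-edges grouped by a primitive long source id. */
class LongKeyedEdgeStore {
    // Parallel arrays form an open-addressing hash table (capacity is a power of two).
    private long[] keys = new long[16];
    private long[][] targets = new long[16][];
    private int[] counts = new int[16];
    private boolean[] used = new boolean[16];
    private int size;

    /** Appends one edge; no key object is allocated for the source id. */
    void addEdge(long source, long target) {
        if ((size + 1) * 2 > keys.length) {
            grow(); // keep the table at most half full so probing terminates
        }
        int slot = findSlot(keys, used, source);
        if (!used[slot]) {
            used[slot] = true;
            keys[slot] = source;
            targets[slot] = new long[4];
            size++;
        } else if (counts[slot] == targets[slot].length) {
            targets[slot] = Arrays.copyOf(targets[slot], counts[slot] * 2);
        }
        targets[slot][counts[slot]++] = target;
    }

    /** Returns a copy of the targets stored for a source id (empty if unknown). */
    long[] edgesOf(long source) {
        int slot = findSlot(keys, used, source);
        return used[slot] ? Arrays.copyOf(targets[slot], counts[slot]) : new long[0];
    }

    int vertexCount() {
        return size;
    }

    // Linear probing: returns the slot holding the key, or the first free slot.
    private static int findSlot(long[] keys, boolean[] used, long key) {
        int mask = keys.length - 1;
        int slot = (int) mix(key) & mask;
        while (used[slot] && keys[slot] != key) {
            slot = (slot + 1) & mask;
        }
        return slot;
    }

    // Cheap bit mixer so that sequential ids spread across the table.
    private static long mix(long k) {
        k *= 0x9E3779B97F4A7C15L;
        return k ^ (k >>> 32);
    }

    // Doubles the capacity and reinserts every occupied slot.
    private void grow() {
        long[] oldKeys = keys;
        long[][] oldTargets = targets;
        int[] oldCounts = counts;
        boolean[] oldUsed = used;
        int capacity = oldKeys.length * 2;
        keys = new long[capacity];
        targets = new long[capacity][];
        counts = new int[capacity];
        used = new boolean[capacity];
        for (int i = 0; i < oldKeys.length; i++) {
            if (!oldUsed[i]) {
                continue;
            }
            int slot = findSlot(keys, used, oldKeys[i]);
            used[slot] = true;
            keys[slot] = oldKeys[i];
            targets[slot] = oldTargets[i];
            counts[slot] = oldCounts[i];
        }
    }
}
```

By contrast, a java.util.HashMap<Long, ...> would box each long id into a
Long on insert and lookup (outside the small cached range), which adds the
kind of per-entry overhead the description attributes to object keys. A
production implementation would more likely use an existing primitive
collections library than a hand-rolled table like this one.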



--
This message was sent by Atlassian JIRA
(v6.2#6252)
