[
https://issues.apache.org/jira/browse/GIRAPH-874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13957210#comment-13957210
]
Pavan Kumar commented on GIRAPH-874:
------------------------------------
I agree that primitive collections improve performance, but why do you say that
a new "vertex object" has to be created to store in the map?
Vertex objects are already created and assigned to partitions. In GIRAPH-873,
however, this argument is valid because of line 130 in EdgeStore.java, i.e.,
vertexIdEdgeIterator.releaseCurrentVertexId();
Can you please elaborate?
Also, in the diff you can reduce the duplication by using delegation. For
example, please look at how GIRAPH-840 split ByteCounter into
InBoundByteCounter and OutBoundByteCounter through ByteCounterDelegate.
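As a rough sketch of the delegation idea (all class and method names below are
hypothetical and are not the actual GIRAPH-840 code), the shared logic sits in
one delegate class and the two specialized classes only forward to it:

```java
/** Shared logic lives here once (hypothetical stand-in for a delegate class). */
final class CounterDelegate {
  private final String label;
  private long bytes;

  CounterDelegate(String label) { this.label = label; }

  void add(long n) { bytes += n; }

  long total() { return bytes; }

  String describe() { return label + ": " + bytes + " bytes"; }
}

/** Hypothetical inbound counter: no counting logic of its own, it only forwards. */
final class InboundCounter {
  private final CounterDelegate delegate = new CounterDelegate("inbound");

  void onRead(long n) { delegate.add(n); }

  long total() { return delegate.total(); }

  String describe() { return delegate.describe(); }
}

/** Hypothetical outbound counter: same delegate type, different entry point. */
final class OutboundCounter {
  private final CounterDelegate delegate = new CounterDelegate("outbound");

  void onWrite(long n) { delegate.add(n); }

  long total() { return delegate.total(); }

  String describe() { return delegate.describe(); }
}

final class DelegationDemo {
  public static void main(String[] args) {
    InboundCounter in = new InboundCounter();
    OutboundCounter out = new OutboundCounter();
    in.onRead(10);
    in.onRead(5);
    out.onWrite(3);
    System.out.println(in.describe());
    System.out.println(out.describe());
  }
}
```

The same shape should apply to the partition classes in the patch: keep the
common code in one delegate and let each specialized class forward to it.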
> Specialized byte array partitions
> ---------------------------------
>
> Key: GIRAPH-874
> URL: https://issues.apache.org/jira/browse/GIRAPH-874
> Project: Giraph
> Issue Type: Improvement
> Components: graph
> Affects Versions: 1.1.0
> Reporter: Craig Muchinsky
> Fix For: 1.1.0
>
> Attachments: GIRAPH-874-2.patch, GIRAPH-874.patch
>
>
> While doing some performance tuning I discovered that loading byte array
> partitions was performing slower than expected. I found that the extra time
> was being spent allocating a new vertex object for each distinct vertexId
> encountered (because vertexId object is the map key). Similar to GIRAPH-704,
> the use of primitive maps can provide a significant performance benefit in this
> situation. By using a primitive map, the vertex object on the VertexIterator
> can be reused perpetually because the vertexId object isn't used as the map
> key.
> When processing a large graph with 4B vertices, the worker vertices requests
> were taking ~15 seconds each, but after implementing the above suggestion
> that number dropped to sub-second.
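
To illustrate the point made in the description, here is a minimal sketch. It
assumes a hypothetical MutableId class and a toy LongIntMap; neither is Giraph
code or a real primitive-collections library. With an object-keyed map, reusing
one mutable key object corrupts lookups, so each distinct id needs its own key
object. A primitive-keyed map copies the long value, so a single reusable
object is enough.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical mutable id, standing in for a reusable vertex id object. */
final class MutableId {
  long value;

  MutableId(long value) { this.value = value; }

  @Override public boolean equals(Object o) {
    return o instanceof MutableId && ((MutableId) o).value == value;
  }

  @Override public int hashCode() { return Long.hashCode(value); }
}

/** Toy long-to-int map with linear probing; fixed capacity, no resizing. */
final class LongIntMap {
  private final long[] keys;
  private final int[] values;
  private final boolean[] used;
  private final int mask;
  private int size;

  /** Capacity must be a power of two and larger than the number of distinct keys. */
  LongIntMap(int capacity) {
    keys = new long[capacity];
    values = new int[capacity];
    used = new boolean[capacity];
    mask = capacity - 1;
  }

  /** Adds one to the count for key, storing the primitive value itself. */
  void increment(long key) {
    int slot = Long.hashCode(key) & mask;
    while (used[slot] && keys[slot] != key) {
      slot = (slot + 1) & mask;
    }
    if (!used[slot]) {
      used[slot] = true;
      keys[slot] = key;
      size++;
    }
    values[slot]++;
  }

  /** Returns the count for key, or 0 if the key was never seen. */
  int get(long key) {
    int slot = Long.hashCode(key) & mask;
    while (used[slot]) {
      if (keys[slot] == key) return values[slot];
      slot = (slot + 1) & mask;
    }
    return 0;
  }

  int size() { return size; }
}

final class KeyReuseDemo {
  /** Object keys: the map holds a reference, so mutating the key loses id 7. */
  static boolean objectKeyedLookupSurvivesReuse() {
    Map<MutableId, Integer> counts = new HashMap<>();
    MutableId cursor = new MutableId(0);
    for (long id : new long[] {7, 7, 9}) {
      cursor.value = id; // same object mutated in place
      counts.merge(cursor, 1, Integer::sum);
    }
    return counts.containsKey(new MutableId(7));
  }

  /** Primitive keys: the map copies the long, so one cursor can be reused. */
  static LongIntMap loadWithPrimitiveKeys(long[] ids) {
    LongIntMap counts = new LongIntMap(16);
    MutableId cursor = new MutableId(0);
    for (long id : ids) {
      cursor.value = id;
      counts.increment(cursor.value);
    }
    return counts;
  }

  public static void main(String[] args) {
    System.out.println("object-keyed lookup ok: " + objectKeyedLookupSurvivesReuse());
    LongIntMap counts = loadWithPrimitiveKeys(new long[] {7, 7, 9});
    System.out.println("primitive-keyed count for 7: " + counts.get(7));
  }
}
```

The toy map never resizes, which keeps the sketch short; a real primitive map
would grow as needed.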
--
This message was sent by Atlassian JIRA
(v6.2#6252)