[ https://issues.apache.org/jira/browse/GIRAPH-328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13465926#comment-13465926 ]
Eli Reisman commented on GIRAPH-328:
------------------------------------
Hi Avery, I'm posting a new patch (v6) that fixes the things you mentioned on
review board. This patch doesn't build yet: there is still a problem with the
GiraphConfiguration calls that create my I and M values, which I think is why
I used BspUtils in the previous patch. The error dump is below. I will look
into it later when I have time, but the calls are in this patch, so if you
think I'm calling the new methods wrong, let me know.
{code}
[INFO] -------------------------------------------------------------
[ERROR] COMPILATION ERROR :
[INFO] -------------------------------------------------------------
[ERROR] /home/computer/apache/giraph/target/munged/main/org/apache/giraph/comm/requests/SendPartitionCurrentMessagesRequest.java:[80,45] incompatible types
found : org.apache.hadoop.io.WritableComparable
required: I
[ERROR] /home/computer/apache/giraph/target/munged/main/org/apache/giraph/comm/requests/SendPartitionCurrentMessagesRequest.java:[87,52] incompatible types
found : org.apache.hadoop.io.Writable
required: M
[INFO] 2 errors
[INFO] -------------------------------------------------------------
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 9.786s
[INFO] Finished at: Fri Sep 28 14:30:36 PDT 2012
[INFO] Final Memory: 17M/155M
[INFO] ------------------------------------------------------------------------
{code}
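For readers unfamiliar with this kind of javac error, here is a minimal sketch of one common way it arises. It is not taken from the patch: Conf, Request, createVertexId, readIdTyped and readIdRaw are hypothetical stand-ins, and Comparable stands in for WritableComparable. The assumption is that the factory call is made through a raw (unparameterized) reference, so its return type erases to the bound of I and no longer matches the type variable; whether that is what actually happens in v6 is not confirmed here.

```java
import java.util.function.Supplier;

// Hypothetical stand-ins, not Giraph classes.
class RawTypeBoundSketch {

    // A configuration object parameterized on the vertex id type I.
    static class Conf<I extends Comparable<I>> {
        private final Supplier<I> idFactory;

        Conf(Supplier<I> idFactory) {
            this.idFactory = idFactory;
        }

        // Declared to return I, so a parameterized caller needs no cast.
        I createVertexId() {
            return idFactory.get();
        }
    }

    static class Request<I extends Comparable<I>> {

        // Parameterized reference: the call returns I and compiles cleanly.
        I readIdTyped(Conf<I> conf) {
            return conf.createVertexId();
        }

        // Raw reference: the return type erases to the bound (Comparable),
        // so "I id = rawConf.createVertexId();" is rejected by javac with
        // an "incompatible types" error. An unchecked cast is one workaround;
        // parameterizing the reference, as above, is usually cleaner.
        @SuppressWarnings({"unchecked", "rawtypes"})
        I readIdRaw(Conf rawConf) {
            return (I) rawConf.createVertexId();
        }
    }

    public static void main(String[] args) {
        Conf<Integer> conf = new Conf<>(() -> 7);
        Request<Integer> request = new Request<>();
        System.out.println(request.readIdTyped(conf));
        System.out.println(request.readIdRaw(conf));
    }
}
```

If the real factory methods are declared to return the bound type itself rather than the type variable, the same error appears and the same two remedies apply.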
> Outgoing messages from current superstep should be grouped at the sender by
> owning worker, not by partition
> -----------------------------------------------------------------------------------------------------------
>
> Key: GIRAPH-328
> URL: https://issues.apache.org/jira/browse/GIRAPH-328
> Project: Giraph
> Issue Type: Improvement
> Components: bsp, graph
> Affects Versions: 0.2.0
> Reporter: Eli Reisman
> Assignee: Eli Reisman
> Priority: Minor
> Fix For: 0.2.0
>
> Attachments: GIRAPH-328-1.patch, GIRAPH-328-2.patch,
> GIRAPH-328-3.patch, GIRAPH-328-4.patch, GIRAPH-328-5.patch, GIRAPH-328-6.patch
>
>
> Currently, outgoing messages created by the Vertex#compute() cycle on each
> worker are stored and grouped by the partitionId of the partition on the
> destination worker to which the messages belong. As a result, messages are
> duplicated on the wire, once per partition, whenever a receiving worker has
> destination vertices for them in more than one partition.
> By partitioning the outgoing, current-superstep messages by destination
> worker, we can split them into partitions at insertion into a MessageStore on
> the destination worker. What we trade in some compute time while inserting at
> the receiver side, we gain in fine-grained control over the real number of
> messages each worker caches outbound for any given worker before flushing,
> and over how those flushed messages are aggregated for delivery as well.
> Potentially, it allows for a great reduction in duplicate messages sent in
> situations like Vertex#sendMessageToAllEdges() -- see GIRAPH-322, GIRAPH-314.
> You get the idea.
> This might be a poor idea, and it can certainly use some additional
> refinement, but it passes mvn verify and may even run ;) It interoperates
> with the disk spill code, but not as well as it could. Consider this a
> request for comment on the idea (and the approach) rather than a finished
> product.
> Comments/ideas/help welcome! Thanks
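
The grouping idea in the description above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in the attached patches: WorkerGroupingSketch, Message, partitionOf, workerOf, groupByWorker and splitByPartition are all hypothetical names, vertex ids are plain ints, and the modulo-based partition and worker assignment is a placeholder for whatever partitioner is configured.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: sender groups by destination worker,
// receiver splits each batch into partitions on insertion.
class WorkerGroupingSketch {

    static final class Message {
        final int targetVertex;
        final String payload;

        Message(int targetVertex, String payload) {
            this.targetVertex = targetVertex;
            this.payload = payload;
        }
    }

    // Placeholder partitioner: vertex id modulo partition count.
    static int partitionOf(int vertexId, int numPartitions) {
        return Math.floorMod(vertexId, numPartitions);
    }

    // Placeholder ownership: partition id modulo worker count.
    static int workerOf(int partitionId, int numWorkers) {
        return partitionId % numWorkers;
    }

    // Sender side: one outbound cache per destination worker,
    // regardless of how many partitions that worker owns.
    static Map<Integer, List<Message>> groupByWorker(
            List<Message> outgoing, int numPartitions, int numWorkers) {
        Map<Integer, List<Message>> byWorker = new TreeMap<>();
        for (Message m : outgoing) {
            int partition = partitionOf(m.targetVertex, numPartitions);
            int worker = workerOf(partition, numWorkers);
            byWorker.computeIfAbsent(worker, k -> new ArrayList<>()).add(m);
        }
        return byWorker;
    }

    // Receiver side: split one worker-level batch into per-partition
    // lists at the moment it is inserted into the local store.
    static Map<Integer, List<Message>> splitByPartition(
            List<Message> batch, int numPartitions) {
        Map<Integer, List<Message>> byPartition = new TreeMap<>();
        for (Message m : batch) {
            int partition = partitionOf(m.targetVertex, numPartitions);
            byPartition.computeIfAbsent(partition, k -> new ArrayList<>()).add(m);
        }
        return byPartition;
    }
}
```

In this sketch the sender keeps a single list per destination worker, so flush thresholds can be applied per worker, and the extra partition lookup moves to the receiver, which is the trade-off the description mentions.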