[jira] [Updated] (GIRAPH-256) Partitioning outgoing graph data during INPUT_SUPERSTEP by # of vertices results in wide variance in RPC message sizes

Eli Reisman (JIRA) Sat, 14 Jul 2012 15:14:36 -0700

     [ 
https://issues.apache.org/jira/browse/GIRAPH-256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Eli Reisman updated GIRAPH-256:
-------------------------------

    Attachment: GIRAPH-256-3.patch

Quick fix to the patch file (generated with git; I forgot '--no-prefix') and a 
quick tune-up to make monitoring of transfer candidates a bit more efficient 
and accurate. As noted in GIRAPH-247, avoiding frequent calls to 
partition.getEdgeCount() is a big efficiency win.

Passes mvn verify, cluster use, etc.
                
> Partitioning outgoing graph data during INPUT_SUPERSTEP by # of vertices 
> results in wide variance in RPC message sizes
> ----------------------------------------------------------------------------------------------------------------------
>
>                 Key: GIRAPH-256
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-256
>             Project: Giraph
>          Issue Type: Improvement
>          Components: bsp, graph
>    Affects Versions: 0.2.0
>            Reporter: Eli Reisman
>            Assignee: Eli Reisman
>              Labels: patch
>             Fix For: 0.2.0
>
>         Attachments: GIRAPH-256-1.patch, GIRAPH-256-2.patch, 
> GIRAPH-256-3.patch
>
>
> This relates to GIRAPH-247. The unfortunately named 
> "MAX_VERTICES_PER_PARTITION" fooled me into thinking this value was 
> regulating the size of initial Partition objects as they were composed during 
> INPUT_SUPERSTEP from InputSplits each worker reads.
> In fact this configuration option only regulates the size of the outgoing RPC 
> messages, stored locally in Partition objects but decomposed into Collections 
> of BasicVertex for transfer to their eventual homes on another (or this) 
> worker. There they are combined into the actual Partitions they will exist in 
> for the job run.
> By partitioning these outgoing messages by # of vertices, metrics load tests 
> have shown the size of the average message is not well regulated and can 
> create overloads on either side of these transfers. This is important because:
> 1. Throughput and memory are at a premium during INPUT_SUPERSTEP.
> 2. A single crashed worker in a Giraph job causes cascading job failure, even 
> in an otherwise healthy workflow.
> This JIRA renames the offending variables/config options and further 
> regulates outgoing graph data in INPUT_SUPERSTEP by the # of edges and THEN 
> the # of vertices in a candidate for transfer. This much more effectively 
> regulates message size for typical social graph data and has been shown in 
> testing to greatly improve the amount of load-in data Giraph can handle 
> without failure given fixed memory and worker limits.
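
To illustrate the idea described above, here is a minimal sketch of batching 
outgoing vertices by edge count first and vertex count second. It is not the 
code in the attached patches: the class OutgoingBatcher, the VertexStub type, 
and the two limits (maxEdgesPerTransfer, maxVerticesPerTransfer) are 
hypothetical names chosen for illustration. The running edge total stands in 
for the point about avoiding repeated calls to partition.getEdgeCount().

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch only; these are not Giraph's actual classes or options. */
class OutgoingBatcher {
    /** Minimal stand-in for a vertex: an id plus its number of out-edges. */
    static final class VertexStub {
        final long id;
        final int numEdges;

        VertexStub(long id, int numEdges) {
            this.id = id;
            this.numEdges = numEdges;
        }
    }

    private final long maxEdgesPerTransfer;   // assumed config value
    private final int maxVerticesPerTransfer; // assumed config value

    private List<VertexStub> pending = new ArrayList<>();
    // Running total kept up to date on every add, so the batch is never
    // rescanned to recompute its edge count.
    private long pendingEdges = 0;
    // Stands in for the RPC layer: each inner list is one outgoing message.
    private final List<List<VertexStub>> sent = new ArrayList<>();

    OutgoingBatcher(long maxEdgesPerTransfer, int maxVerticesPerTransfer) {
        this.maxEdgesPerTransfer = maxEdgesPerTransfer;
        this.maxVerticesPerTransfer = maxVerticesPerTransfer;
    }

    /** Adds a vertex and sends the batch once either limit is reached. */
    void add(VertexStub v) {
        pending.add(v);
        pendingEdges += v.numEdges;
        // Check the edge limit first, then the vertex limit.
        if (pendingEdges >= maxEdgesPerTransfer
                || pending.size() >= maxVerticesPerTransfer) {
            flush();
        }
    }

    /** Sends whatever is pending (for example, at the end of an input split). */
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        sent.add(pending);
        pending = new ArrayList<>();
        pendingEdges = 0;
    }

    List<List<VertexStub>> getSent() {
        return sent;
    }
}
```

With limits of 10 edges and 3 vertices, two high-degree vertices (4 and 7 
edges) go out together as soon as the edge limit is crossed, while low-degree 
vertices are still capped at 3 per message. A vertex-count limit alone would 
have let the high-degree batch keep growing.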

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
