[
https://issues.apache.org/jira/browse/GIRAPH-249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413845#comment-13413845
]
Alessandro Presta commented on GIRAPH-249:
------------------------------------------
We're not currently combining on the server (although there is a patch:
GIRAPH-224). Anyway, messages are accumulated in that structure for
synchronization purposes and because we can't call putMessages() until we have
all of them. This will likely change with out-of-core messages, since we would
skip the assignment part and keep the "transient messages" (replaced by a
disk-backed store) around to pass them directly to compute().
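To make the "disk-backed store" idea a bit more concrete, here is a minimal
sketch of a spilling transient message store (purely illustrative: the class
and every name in it are made up rather than existing Giraph code, Writables
are replaced by plain longs, and there is no index, so reading one vertex's
messages scans the whole spill file):

import java.io.*;
import java.util.*;

public class SpillingMessageStore {
    private final Map<Long, List<Long>> inMemory = new HashMap<Long, List<Long>>();
    private final long spillThreshold;
    private long bufferedCount = 0;
    private File spillFile;

    public SpillingMessageStore(long spillThreshold) {
        this.spillThreshold = spillThreshold;
    }

    /** Buffer an incoming message; spill everything buffered once the threshold is hit. */
    public void addMessage(long vertexId, long message) throws IOException {
        List<Long> list = inMemory.get(vertexId);
        if (list == null) {
            list = new ArrayList<Long>();
            inMemory.put(vertexId, list);
        }
        list.add(message);
        if (++bufferedCount >= spillThreshold) {
            spill();
        }
    }

    /** Append buffered (vertexId, message) pairs to a temp file, then clear the buffer. */
    private void spill() throws IOException {
        if (spillFile == null) {
            spillFile = File.createTempFile("transient-messages", ".bin");
            spillFile.deleteOnExit();
        }
        DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(new FileOutputStream(spillFile, true)));
        try {
            for (Map.Entry<Long, List<Long>> entry : inMemory.entrySet()) {
                for (long message : entry.getValue()) {
                    out.writeLong(entry.getKey());
                    out.writeLong(message);
                }
            }
        } finally {
            out.close();
        }
        inMemory.clear();
        bufferedCount = 0;
    }

    /** What compute() would consume: all messages for one vertex, merged from disk and memory. */
    public List<Long> messagesFor(long vertexId) throws IOException {
        List<Long> result = new ArrayList<Long>();
        List<Long> buffered = inMemory.get(vertexId);
        if (buffered != null) {
            result.addAll(buffered);
        }
        if (spillFile == null) {
            return result;
        }
        DataInputStream in = new DataInputStream(
                new BufferedInputStream(new FileInputStream(spillFile)));
        try {
            while (true) {
                long id;
                try {
                    id = in.readLong();
                } catch (EOFException eof) {
                    break;
                }
                long message = in.readLong();
                if (id == vertexId) {
                    result.add(message);
                }
            }
        } finally {
            in.close();
        }
        return result;
    }
}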
I'm trying to figure out the exact path of vertices from input splits to the
owner's partition map. From exchangeVertexPartitions/movePartitionsToWorker it
looks like we fill workerPartitionMap only after all the exchanging is done.
From sendWorkerPartitions, however, it seems that the outgoing partitions are
in the sender's workerPartitionMap, which would solve our problems.
I need to take some time to understand that part, but if you're able to shed
some light, even better.
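For the sake of discussion, here is the difference between the two readings
boiled down to a toy example (again purely illustrative; apart from
workerPartitionMap none of these names correspond to actual Giraph code):

import java.util.*;

public class PartitionExchangeSketch {
    // Toy stand-in for the real per-worker partition map.
    private final Map<Integer, List<Long>> workerPartitionMap =
            new HashMap<Integer, List<Long>>();
    private final Map<Integer, List<Long>> receivedPartitions =
            new HashMap<Integer, List<Long>>();

    // Reading 1: workerPartitionMap is only filled after all the exchanging
    // is done, so incoming partitions pile up elsewhere in the meantime.
    public void fillAfterExchange() {
        // ... send outgoing partitions, wait for all incoming ones ...
        workerPartitionMap.putAll(receivedPartitions);
        receivedPartitions.clear();
    }

    // Reading 2: outgoing partitions already sit in the sender's
    // workerPartitionMap, so whatever backs that map (e.g. an out-of-core
    // store) would cover them during the exchange too.
    public void streamOutgoing(Set<Integer> outgoingPartitionIds) {
        for (int partitionId : outgoingPartitionIds) {
            List<Long> partition = workerPartitionMap.remove(partitionId);
            ship(partitionId, partition);
        }
    }

    // Placeholder for the actual network transfer.
    private void ship(int partitionId, List<Long> vertices) {
    }
}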
Also, if you have time, you could try running your experiments with the current
patch. If it doesn't improve the input split phase, then we've got the answer
right there :)
I'll start running benchmarks sometime soon.
Thanks a lot for the feedback.
> Move part of the graph out-of-core when memory is low
> -----------------------------------------------------
>
> Key: GIRAPH-249
> URL: https://issues.apache.org/jira/browse/GIRAPH-249
> Project: Giraph
> Issue Type: Improvement
> Reporter: Alessandro Presta
> Assignee: Alessandro Presta
> Attachments: GIRAPH-249.patch, GIRAPH-249.patch, GIRAPH-249.patch,
> GIRAPH-249.patch
>
>
> There has been some talk about Giraph's scaling limitations due to keeping
> the whole graph and messages in RAM.
> We need to investigate methods to fall back to disk when running out of
> memory, while gracefully degrading performance.
> This issue is for graph storage. Messages should probably be a separate
> issue, although the interplay between the two is crucial.
> We should also discuss what our primary goals are here: completing a job
> (albeit slowly) instead of failing when the graph is too big, while still
> encouraging memory optimizations and high-memory clusters; or restructuring
> Giraph to be as efficient as possible in disk mode, making it almost a
> standard way of operating.