[ 
https://issues.apache.org/jira/browse/GIRAPH-1066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15290341#comment-15290341
 ] 

Hassan Eslami commented on GIRAPH-1066:
---------------------------------------

https://reviews.facebook.net/D55479

> Functional adaptive out-of-core mechanism
> -----------------------------------------
>
>                 Key: GIRAPH-1066
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-1066
>             Project: Giraph
>          Issue Type: New Feature
>          Components: bsp, graph
>            Reporter: Hassan Eslami
>            Assignee: Hassan Eslami
>
> In this JIRA we propose the following contributions to the out-of-core 
> mechanism:
> • A simpler API is provided for trying various out-of-core policies on top 
> of the basic infrastructure proposed in GIRAPH-1048. This new API lets 
> developers of out-of-core policies focus only on the out-of-core logic, 
> rather than on the complications of multi-threading, disk interactions, etc. 
> The policy logic is abstracted out as far as possible, so that developing and 
> trying other out-of-core policies stays simple.
> • Two adaptive out-of-core policies are implemented using the proposed API. 
> One is based on a few recent GC behaviors, and the other is based on some 
> user-defined thresholds to control the memory pressure. With the adaptive 
> out-of-core policies, the job automatically uses secondary storage devices 
> if the data cannot fit into memory. Also, if at some point in the 
> computation the memory pressure goes down, the data spilled to secondary 
> storage is automatically loaded back into memory.
> • The out-of-core infrastructure is integrated with the message flow control 
> proposed in GIRAPH-1027. Using credit-based flow control, an out-of-core 
> policy can predict the amount of memory that messages will use in the near 
> future, and hence can have fine-grained control over messages and their 
> memory footprint.
> • A new feature, called data generation tethering, is also added. This 
> feature lets the out-of-core policy decide how many threads (input/compute) 
> should be active at each moment, indirectly controlling the rate of data 
> generation and, in turn, the memory footprint of graph data.
> With this JIRA landed, we will have a fully functional out-of-core 
> infrastructure that prevents any reasonable job from failing due to OOM.
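
To make the threshold-based policy and the thread-limiting idea from the description concrete, here is a minimal sketch in Java. It is not the API from the patch: the class name, the `Action` enum, both watermark parameters, and the linear thread-scaling rule are all assumptions made for illustration. It only shows the general shape of a policy that spills when memory use is high, loads data back when it is low, and reduces the number of active threads in between.

```java
// Hypothetical sketch only: none of these names come from Giraph or the patch.
final class ThresholdPolicySketch {
  /** What the policy asks the out-of-core infrastructure to do next. */
  enum Action { SPILL_TO_DISK, LOAD_TO_MEMORY, NO_OP }

  // Assumed user-defined thresholds, expressed as fractions of the max heap.
  private final double lowWatermark;
  private final double highWatermark;

  ThresholdPolicySketch(double lowWatermark, double highWatermark) {
    if (!(0.0 <= lowWatermark && lowWatermark < highWatermark
        && highWatermark <= 1.0)) {
      throw new IllegalArgumentException("need 0 <= low < high <= 1");
    }
    this.lowWatermark = lowWatermark;
    this.highWatermark = highWatermark;
  }

  /** Picks the next action from the fraction of heap currently in use. */
  Action decide(double usedFraction) {
    if (usedFraction > highWatermark) {
      return Action.SPILL_TO_DISK;   // memory pressure is high
    }
    if (usedFraction < lowWatermark) {
      return Action.LOAD_TO_MEMORY;  // pressure dropped, bring data back
    }
    return Action.NO_OP;             // in between: leave data where it is
  }

  /**
   * Limits data generation: all threads run below the low watermark, one
   * thread runs above the high watermark, and the count scales linearly
   * in between (an assumed rule, chosen for simplicity).
   */
  int activeThreads(double usedFraction, int maxThreads) {
    if (usedFraction <= lowWatermark) {
      return maxThreads;
    }
    if (usedFraction >= highWatermark) {
      return 1;
    }
    double headroom =
        (highWatermark - usedFraction) / (highWatermark - lowWatermark);
    return Math.max(1, (int) Math.round(headroom * maxThreads));
  }

  /** Reads the current heap usage fraction from the JVM. */
  static double currentUsedFraction() {
    Runtime rt = Runtime.getRuntime();
    long used = rt.totalMemory() - rt.freeMemory();
    return (double) used / rt.maxMemory();
  }
}
```

A caller would poll `currentUsedFraction()` periodically and pass the result to `decide` and `activeThreads`. Keeping the decision logic free of threading and disk code mirrors the separation the description aims for, though the real interface may differ.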



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
