[ https://issues.apache.org/jira/browse/GIRAPH-1048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15194615#comment-15194615 ]
Hassan Eslami commented on GIRAPH-1048:
---------------------------------------

https://reviews.facebook.net/D54549

> Redesign of out-of-core mechanism (first patch -- out-of-core mechanism
> keeping fixed number of partitions in memory)
> ---------------------------------------------------------------------------------------------------------------------
>
>                 Key: GIRAPH-1048
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-1048
>             Project: Giraph
>          Issue Type: New Feature
>            Reporter: Hassan Eslami
>            Assignee: Hassan Eslami
>              Labels: out-of-memory
>
> The current out-of-core mechanism implemented in Giraph suffers from a few
> issues:
> - It does not integrate well with a flow-control mechanism in which the rate
>   of incoming/outgoing messages is controlled according to available memory.
> - It does not control the rate at which compute/input threads generate and
>   process data, which is crucial in the input superstep, and also in compute
>   supersteps in some applications.
> - It does not utilize the disk bandwidth properly, due to concurrent disk
>   accesses (IO interference).
> - It suffers from high overhead due to successive manual GC calls, even when
>   the high memory pressure cannot be addressed by offloading data to disk.
> - It has a complicated design, making it difficult to debug and improve upon.
> - It is very difficult to try different out-of-core policies, making it
>   impossible to tune the mechanism.
>
> A simple to tune/program, flexible, and yet efficient out-of-core
> infrastructure is needed in Giraph. In this JIRA we propose a redesign of the
> out-of-core mechanism in which a) the logic of IO operations, b) the logic of
> out-of-core decisions, c) the data structures supporting out-of-core
> operations, and d) the actual logic for the computation are four decoupled
> entities. Some IOCommands and an IOScheduler address the logic behind IO
> operations, an OutOfCoreEngine and a MetaPartitionManager address the logic
> for out-of-core decisions, several disk-backed data structures are
> responsible for keeping the necessary data, and finally, the old in-memory
> computation mechanism interacts with the out-of-core infrastructure
> seamlessly.
>
> This JIRA is created to set the ground for the out-of-core infrastructure,
> and as an initial proof-of-concept, a simple out-of-core policy using the
> mentioned infrastructure is implemented. The out-of-core policy in this JIRA,
> also called the fixed out-of-core policy, tries to keep a certain
> (user-defined) number of partitions in memory.
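
To make the idea of a fixed out-of-core policy concrete, here is a minimal Java sketch of how the decision logic could be kept separate from the IO logic. It is not taken from the patch: the class names FixedOutOfCoreSketch and FixedPartitionPolicy, the shape of IOCommand, the require method, and the oldest-first eviction order are all assumptions made for illustration. The sketch also assumes every partition can be loaded from disk on demand.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch; none of these classes mirror the actual Giraph code. */
public class FixedOutOfCoreSketch {

  /** IO logic: a command only describes a single disk operation. */
  static final class IOCommand {
    enum Kind { LOAD, STORE }

    final Kind kind;
    final int partitionId;

    IOCommand(Kind kind, int partitionId) {
      this.kind = kind;
      this.partitionId = partitionId;
    }

    @Override
    public String toString() {
      return kind + "(" + partitionId + ")";
    }
  }

  /** Decision logic: keeps at most maxInMemory partitions resident. */
  static final class FixedPartitionPolicy {
    private final int maxInMemory;
    // Insertion order doubles as eviction order (oldest first).
    // This choice is an assumption; the issue does not specify one.
    private final Set<Integer> resident = new LinkedHashSet<>();

    FixedPartitionPolicy(int maxInMemory) {
      if (maxInMemory < 1) {
        throw new IllegalArgumentException("maxInMemory must be at least 1");
      }
      this.maxInMemory = maxInMemory;
    }

    /** Returns the IO commands needed before partitionId can be processed. */
    List<IOCommand> require(int partitionId) {
      List<IOCommand> commands = new ArrayList<>();
      if (resident.contains(partitionId)) {
        return commands; // already in memory, no IO needed
      }
      if (resident.size() >= maxInMemory) {
        // Memory budget is full: offload the oldest resident partition first.
        Iterator<Integer> oldest = resident.iterator();
        int victim = oldest.next();
        oldest.remove();
        commands.add(new IOCommand(IOCommand.Kind.STORE, victim));
      }
      commands.add(new IOCommand(IOCommand.Kind.LOAD, partitionId));
      resident.add(partitionId);
      return commands;
    }

    int residentCount() {
      return resident.size();
    }
  }

  public static void main(String[] args) {
    // Keep two partitions in memory and request partitions in this order.
    FixedPartitionPolicy policy = new FixedPartitionPolicy(2);
    for (int id : new int[] {0, 1, 2, 1, 0}) {
      System.out.println("partition " + id + " -> " + policy.require(id));
    }
  }
}
```

In this sketch the policy never touches the disk itself; it only emits commands that a separate scheduler could order and execute, which is one way to avoid the concurrent disk accesses listed among the issues above. Swapping in a different policy would mean replacing only FixedPartitionPolicy.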