[ 
https://issues.apache.org/jira/browse/GIRAPH-249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13415530#comment-13415530
 ] 

Alessandro Presta commented on GIRAPH-249:
------------------------------------------

Eli, thanks a lot for running these benchmarks. This is really, really helpful.

Two notes:
- You can try setting "giraph.minFreeMemoryRatio" to a value lower than the 
default of 0.1, which is probably too conservative.
- I would compare against trunk to better evaluate the impact of this patch.
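
To illustrate the first note, here is a minimal sketch of how a free-memory-ratio threshold could be evaluated inside a JVM. It is not the code from the attached patch: the class and method names are hypothetical, and the assumption that the ratio is measured against the maximum heap size is mine. Only the option name and the 0.1 default come from the comment above.

```java
// Hypothetical helper, not part of Giraph: shows how a setting such as
// "giraph.minFreeMemoryRatio" might be turned into a spill decision.
final class FreeMemoryCheck {
    private FreeMemoryCheck() {}

    /** Fraction of the maximum heap that can still be allocated. */
    static double freeMemoryRatio(long maxBytes, long totalBytes, long freeBytes) {
        // Heap not yet claimed from the OS (max - total) plus unused space
        // inside the heap already claimed (free).
        long available = (maxBytes - totalBytes) + freeBytes;
        return (double) available / (double) maxBytes;
    }

    /** Same ratio, computed from the running JVM. */
    static double currentFreeMemoryRatio() {
        Runtime rt = Runtime.getRuntime();
        return freeMemoryRatio(rt.maxMemory(), rt.totalMemory(), rt.freeMemory());
    }

    /** True when free memory has dropped below the configured minimum. */
    static boolean shouldSpill(double freeRatio, double minFreeMemoryRatio) {
        return freeRatio < minFreeMemoryRatio;
    }
}
```

Under this sketch, a worker with 8% of its heap free would spill with the 0.1 default but not with a threshold of 0.05, which is why lowering the setting keeps more of the graph in memory.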

So far I have only been able to run a few benchmarks on a single machine, by 
limiting the memory available to each MapReduce task.
They seem to confirm what you've been saying: when the job fails, it fails at 
the input superstep (GraphMapper#setup()), before it can even make use of the 
WorkerPartitionMap.

I agree, the next step is making this work from the input superstep.
                
> Move part of the graph out-of-core when memory is low
> -----------------------------------------------------
>
>                 Key: GIRAPH-249
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-249
>             Project: Giraph
>          Issue Type: Improvement
>            Reporter: Alessandro Presta
>            Assignee: Alessandro Presta
>         Attachments: GIRAPH-249.patch, GIRAPH-249.patch, GIRAPH-249.patch, 
> GIRAPH-249.patch, GIRAPH-249.patch
>
>
> There has been some talk about Giraph's scaling limitations due to keeping 
> the whole graph and messages in RAM.
> We need to investigate methods to fall back to disk when running out of 
> memory, while gracefully degrading performance.
> This issue is for graph storage. Messages should probably be a separate 
> issue, although the interplay between the two is crucial.
> We should also discuss what our primary goals are here: completing a job 
> (albeit slowly) instead of failing when the graph is too big, while still 
> encouraging memory optimizations and high-memory clusters; or restructuring 
> Giraph to be as efficient as possible in disk mode, making it almost a 
> standard way of operating.
