Hey guys,
We at LinkedIn are trying to run some large-graph-analysis problems on Hadoop. The fastest way to run them would be to keep a copy of the whole graph in RAM in every mapper (the graph is about 8 GB in RAM), but our cluster is made of 8-core machines with only 8 GB of RAM each. What is the best way of doing that? Is there a way for multiple mappers on the same machine to share a single in-RAM cache?

I read about the Hadoop DistributedCache; it looks like it copies the file (from HDFS or HTTP) onto each slave's local disk, but not necessarily into RAM.

Best,
Bhupesh
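One approach worth noting (a sketch, not a definitive answer): if the graph file is shipped to each node via the DistributedCache, every mapper JVM can memory-map it read-only. The pages then live once in the OS page cache and are shared by all mapper JVMs on that node, instead of being copied into each JVM heap. The snippet below assumes the local path would come from `DistributedCache.getLocalCacheFiles(conf)` in a real job; note also that a single `MappedByteBuffer` is limited to 2 GB, so an 8 GB graph would need several mapped regions.

```java
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;

public class SharedGraphCache {

    // In a real mapper, localPath would be one of the files returned by
    // DistributedCache.getLocalCacheFiles(conf) -- hypothetical wiring here.
    static MappedByteBuffer mapGraph(Path localPath) throws Exception {
        try (RandomAccessFile raf = new RandomAccessFile(localPath.toFile(), "r");
             FileChannel ch = raf.getChannel()) {
            // Read-only mmap: the file's pages sit in the OS page cache once
            // and are shared by every JVM on the node that maps the same file.
            // The mapping stays valid after the channel is closed.
            return ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
        }
    }

    public static void main(String[] args) throws Exception {
        // Tiny stand-in for the serialized graph file.
        Path p = Files.createTempFile("graph", ".bin");
        Files.write(p, new byte[] {1, 2, 3, 4});
        MappedByteBuffer buf = mapGraph(p);
        System.out.println(buf.get(0) + buf.get(3)); // prints 5
    }
}
```

The trade-off is that the graph must be in a format usable directly off the buffer (e.g. adjacency arrays with fixed-width offsets); deserializing it into per-JVM Java objects would bring back the per-mapper RAM cost.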