You can use the distributed cache for memory-mapped files (they're local to the node the tasks run on). A rough sketch of the idea follows below the quoted message.
http://developer.yahoo.com/hadoop/tutorial/module5.html#auxdata

On Tue, Apr 12, 2011 at 10:40 AM, Benson Margulies <[email protected]> wrote:
> Here's the OP again.
>
> I want to make it clear that my question here has to do with the
> problem of distributing 'the program' around the cluster, not 'the
> data'. In the case at hand, the issue is a system that has a large data
> resource that it needs to do its work. Every instance of the code
> needs the entire model, not just some blocks or pieces.
>
> Memory mapping is a very attractive tactic for this kind of data
> resource. The data is read-only. Memory-mapping it allows the
> operating system to ensure that only one copy of the thing ends up in
> physical memory.
>
> If we force the model into a conventional file (storable in HDFS) and
> read it into the JVM in a conventional way, then we get as many copies
> in memory as we have JVMs. On a big machine with a lot of cores, this
> begins to add up.
>
> For people who are running a cluster of relatively conventional
> systems, just putting copies on all the nodes in a conventional place
> is adequate.
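
For illustration, here is a minimal sketch of the combination being suggested: ship the model file through the DistributedCache so it lands on local disk on every node, then memory-map that local copy in each task so the OS page cache keeps a single physical copy shared by all the JVMs on the machine. This uses the pre-MRv2 DistributedCache API; the class name ModelMapper, the path /models/model.bin, and the field name are made up for the example, and the code is untested.

    import java.io.IOException;
    import java.io.RandomAccessFile;
    import java.net.URI;
    import java.nio.MappedByteBuffer;
    import java.nio.channels.FileChannel;

    import org.apache.hadoop.filecache.DistributedCache;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class ModelMapper extends Mapper<LongWritable, Text, Text, Text> {

        private MappedByteBuffer model;

        @Override
        protected void setup(Context context) throws IOException {
            // The cached file has already been copied to local disk on this node.
            Path[] localFiles =
                DistributedCache.getLocalCacheFiles(context.getConfiguration());
            RandomAccessFile raf = new RandomAccessFile(localFiles[0].toString(), "r");
            FileChannel channel = raf.getChannel();
            // Map the model read-only. The mapped pages live in the OS page cache,
            // so every process that maps the same file shares one physical copy.
            // Note: a single MappedByteBuffer is limited to 2 GB; a larger model
            // would need to be mapped in slices.
            model = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
        }

        @Override
        protected void map(LongWritable key, Text value, Context context) {
            // ... look things up in 'model' without copying it onto the JVM heap ...
        }
    }

In the driver, before submitting the job, the file is registered with the cache (hypothetical HDFS path):

    DistributedCache.addCacheFile(new URI("/models/model.bin"), job.getConfiguration());
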
