Re: Sharing an object across mappers

Owen O'Malley Fri, 03 Oct 2008 09:30:22 -0700


On Oct 3, 2008, at 7:49 AM, Devajyoti Sarkar wrote:

Briefly going through the DistributedCache information, it seems to be a way
to distribute files to mappers/reducers.


Sure, but it handles the distribution problem for you.

One still needs to read the
contents into each map/reduce task VM.

If the data is straight binary data, you could just mmap it from the various tasks. It would be pretty efficient.
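
For illustration, here is a minimal sketch of the mmap approach in plain Java, not code from this thread. It assumes the file has already been placed on the local disk (for example by DistributedCache) and that it holds fixed-width 8-byte records; the class name SharedLookup, the helper names, and the record layout are all hypothetical. Each task JVM that maps the same file read-only goes through the operating system's page cache, so the bytes are typically held in physical memory once rather than once per task.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SharedLookup {
    // Maps the whole file read-only. The mapping remains valid after the
    // channel is closed. A single mapping is limited to Integer.MAX_VALUE bytes.
    static MappedByteBuffer mapReadOnly(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            return ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
        }
    }

    // Assumed layout (hypothetical): consecutive 8-byte longs, big-endian,
    // which is the default byte order of a ByteBuffer.
    static long readLong(MappedByteBuffer buf, int index) {
        return buf.getLong(index * Long.BYTES);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a file that the framework has already localized.
        Path file = Files.createTempFile("lookup", ".bin");
        file.toFile().deleteOnExit();
        Files.write(file, ByteBuffer.allocate(24)
                .putLong(10L).putLong(20L).putLong(30L).array());

        MappedByteBuffer buf = mapReadOnly(file);
        System.out.println(readLong(buf, 1)); // second record
    }
}
```

In a real task the mapping would be created once, in the mapper's setup code, and the path would come from the job's localized cache files rather than a temp file.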

The other direction is to use the MultiThreadedMapRunner and run multiple maps as threads in the same VM. But unless your maps are CPU-heavy or contacting external servers, it probably won't help as much as you'd like.
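
The idea behind running maps as threads can be sketched without Hadoop at all. The example below is only an illustration under stated assumptions: it uses a plain java.util.concurrent thread pool rather than the Hadoop map runner, and the class name SharedObjectThreads and the method lookupAll are hypothetical. It shows several worker threads in one JVM reading a single shared object instead of each loading a private copy. The shared object must be immutable or otherwise thread-safe.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class SharedObjectThreads {
    // Submits one lookup per record to a fixed-size pool. Every task reads
    // the same map instance; results are returned in input order.
    static List<Integer> lookupAll(Map<String, Integer> shared,
                                   List<String> records,
                                   int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (String r : records) {
                futures.add(pool.submit(() -> shared.getOrDefault(r, -1)));
            }
            List<Integer> out = new ArrayList<>();
            for (Future<Integer> f : futures) {
                out.add(f.get());
            }
            return out;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Map.of returns an immutable map, so concurrent reads are safe.
        Map<String, Integer> shared = Map.of("a", 1, "b", 2);
        System.out.println(lookupAll(shared, List.of("a", "b", "c"), 4));
    }
}
```

The memory saving comes from having one copy of the object per JVM; the extra threads themselves only speed things up when each record involves real CPU work or waiting on an external service.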


-- Owen
