When you are reusing the JVMs, you can cache the block inside your task by
pinning it in a static variable.
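
A minimal sketch of what I mean, in plain Java so it runs without Hadoop on the
classpath. The names BlockCache, get and loader are made up for illustration
and are not part of any Hadoop API; in a real task the loader would be whatever
code currently reads your data from HDFS.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical helper: holds data in a static map so it survives for as long
// as the JVM does, across every task that runs in that JVM.
final class BlockCache {
    // Static field: one instance per JVM (per class loader), not per task.
    private static final ConcurrentHashMap<String, byte[]> CACHE =
            new ConcurrentHashMap<>();

    private BlockCache() {}

    // Returns the cached bytes for key, calling loader only on the first
    // request for that key in this JVM.
    static byte[] get(String key, Function<String, byte[]> loader) {
        return CACHE.computeIfAbsent(key, loader);
    }

    // Number of entries currently held, handy for logging.
    static int size() {
        return CACHE.size();
    }
}
```

From your map task you would call something like BlockCache.get(path, loader)
instead of reading the file directly. Keep in mind the data only lives as long
as the JVM does, and as far as I know a reused JVM only serves tasks from the
same job, so it is worth checking whether this helps across separate job runs.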

On Sun, May 10, 2009 at 2:30 PM, Matt Bowyer wrote:

> Hi,
> I am trying to do 'on demand map reduce' - something which will return in
> reasonable time (a few seconds).
> My dataset is relatively small and can fit into my datanode's memory. Is it
> possible to keep a block in the datanode's memory so on the next job the
> response will be much quicker? The majority of the time spent during the job
> run appears to be during the 'HDFS_BYTES_READ' part of the job. I have tried
> using the setNumTasksToExecutePerJvm but the block still seems to be cleared
> from memory after the job.
> thanks!
