Hi,
Those API hooks are called once per task attempt; JVM reuse does not
change that. So yes, setup and cleanup run for every map split or
reduce partition that passes through the reused JVM.
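To make the lifecycle described above concrete, here is a minimal plain-Java sketch (no Hadoop classes; all names are illustrative stand-ins for Mapper.setup()/cleanup()) of what happens when one reused JVM runs several task attempts in sequence: the hooks fire once per attempt, not once per JVM.

```java
// Plain-Java simulation of the task lifecycle under JVM reuse:
// setup() and cleanup() are invoked once per task attempt, even
// though the JVM itself is shared across attempts.
public class ReuseLifecycleSketch {
    static int setups = 0, cleanups = 0;

    // Stand-ins for the per-attempt API hooks.
    static void setup()   { setups++; }
    static void cleanup() { cleanups++; }

    static void runTaskAttempt() {
        setup();
        // ... map() would be called here for each record in the split ...
        cleanup();
    }

    public static void main(String[] args) {
        // One reused JVM processing three task attempts in sequence.
        for (int i = 0; i < 3; i++) {
            runTaskAttempt();
        }
        System.out.println(setups + " " + cleanups); // prints "3 3"
    }
}
```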
On Tue, Nov 13, 2012 at 1:47 PM, edward choi mp2...@gmail.com wrote:
Hi
It seems that when JVM reuse is enabled, map task log data is not
written to its corresponding log file; log data from certain map tasks
gets appended to the log files of some other map task.
For example, I have a case here where 8 map JVMs are running simultaneously
and all
Hi Shrinivas,
Yes, this is the behavior of the task logs when using JVM reuse. You should
notice a log index file in the log directories of the other tasks; it
specifies the byte offsets into the shared log files where each task's
output starts and stops. When viewing logs through the web UI, it will use
I've encountered an issue where I have JVM reuse turned on (via a setting in
the job configuration, not the JobTracker), and when I submit another job
immediately after the first one finishes, it takes several seconds before any
map tasks begin. In looking at an individual task node, it appears
Hello,
I have set mapred.job.reuse.jvm.num.tasks to -1 for re-using the JVM.
My intention is to run a helper program at the beginning of the job and then
feed the key/value pairs
from the tasks to the helper program.
Currently I am launching it in the setup() call below.
If JVM Task re-use is -1,
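One caveat for the approach in the question above: setup() runs once per task attempt, not once per JVM, so with reuse set to -1 the helper would be launched repeatedly. A common workaround is a static guard so the helper starts only on the first attempt that runs in a given JVM. A minimal sketch, with the helper launch replaced by a counter (the method name and counter are illustrative, not Hadoop API):

```java
// Start a helper process at most once per JVM, even when the JVM is
// reused across many task attempts.
public class HelperOncePerJvm {
    static int helperStarts = 0;           // stand-in for the real launch
    private static boolean started = false;

    // Called from each task attempt's setup(); only the first caller
    // in this JVM actually starts the helper.
    static synchronized void ensureHelperStarted() {
        if (!started) {
            started = true;
            helperStarts++; // here you would spawn the external helper
        }
    }

    public static void main(String[] args) {
        // Three reused task attempts in one JVM, but one helper start.
        for (int i = 0; i < 3; i++) {
            ensureHelperStarted();
        }
        System.out.println(helperStarts); // prints "1"
    }
}
```

Note the helper is then never shut down by cleanup(); a JVM shutdown hook is one way to stop it when the reused JVM finally exits.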
Hi All,
Regarding the JVM reuse feature, the documentation says reuse is generally
recommended for streaming and pipes jobs. I'm a little unclear on this, and
any pointers would be appreciated.
Also, in what scenarios will this feature be helpful for java mapred jobs?
Thanks,
Amogh
- The Definitive Guide
--
Thanks & Regards,
Chandra Prakash Bhagtani
On Tue, Sep 15, 2009 at 12:30 PM, Amogh Vasekar am...@yahoo-inc.com wrote:
Hi All,
Regarding the JVM reuse feature incorporated, it says reuse is generally
recommended for streaming and pipes jobs. I'm a little unclear
I think simply because it was a new feature, and it really only helps
jobs that have a large number of tasks relative to the available task
slots, coupled with the concern that subsequent tasks run in a reused JVM
may not behave identically to tasks run in a fresh JVM.
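For reference, the knob discussed throughout this thread is the per-job property mapred.job.reuse.jvm.num.tasks (old-API name): the default of 1 disables reuse, a larger value caps how many tasks a JVM may run, and -1 allows unlimited reuse. A minimal job-configuration fragment:

```xml
<property>
  <name>mapred.job.reuse.jvm.num.tasks</name>
  <!-- 1 = no reuse (default); N = up to N tasks per JVM; -1 = unlimited -->
  <value>-1</value>
</property>
```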
On Fri, Aug 21, 2009 at