You can set mapred.child.java.opts on a per-job basis,
either via -D mapred.child.java.opts="java options" or via
conf.set("mapred.child.java.opts", "java options").
Note: the conf.set call must be made before the job is submitted.
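
A rough sketch of the conf.set route, in plain Java. The class name
ChildJavaOpts, the build helper, and the 200 MB / 256 KB values are
illustrative assumptions, not part of Hadoop; the actual Hadoop call
appears only as a comment so the sketch compiles without Hadoop on the
classpath.

```java
public class ChildJavaOpts {
    // Hypothetical helper: formats a heap limit (in MB) and a thread stack
    // size (in KB) as the standard JVM flags -Xmx and -Xss.
    public static String build(int heapMb, int stackKb) {
        return "-Xmx" + heapMb + "m -Xss" + stackKb + "k";
    }

    public static void main(String[] args) {
        String opts = build(200, 256);
        // In a real job driver (assuming a Hadoop JobConf named conf), this
        // would be applied before the job is submitted:
        //   conf.set("mapred.child.java.opts", opts);
        System.out.println(opts);
    }
}
```

The same string can be passed on the command line with
-D mapred.child.java.opts="..." instead.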
On Fri, May 8, 2009 at 11:57 AM, Philip Zeyliger <phi...@cloudera.com> wrote:
You could add "-Xss<n>" to the "mapred.child.java.opts" configuration
setting. That's controlling the Java stack size, which I think is the
relevant bit for you.
That's part of it, but there's also native memory used when you start
a thread with most JREs.
See the lengthy article at
http://www.ibm.com/developerworks/java/library/j-nativememory-linux/index.html
for more details than you probably ever wanted to know :) I haven't
tried the sample code on my EC2 instances, but will try to do so next
week and post results.
In the past, with FC4 & (I think) FC6, we definitely needed to
constrain the OS stack size to avoid running out of native memory
when spawning lots of Java threads.
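
For threads that the task code creates itself, a stack size can also be
requested per thread through the four-argument Thread constructor. This
is only a sketch of that alternative, not what we used: the class name
SmallStackThreads, the runAll helper, and the 256 KB figure are
assumptions for illustration, and the JVM treats the stackSize argument
as a hint that some platforms round up or ignore.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class SmallStackThreads {
    // Starts `count` threads, each requesting `stackBytes` of stack, waits
    // for them, and returns how many ran their body to completion.
    public static int runAll(int count, long stackBytes) {
        AtomicInteger done = new AtomicInteger();
        Thread[] threads = new Thread[count];
        for (int i = 0; i < count; i++) {
            // Thread(ThreadGroup, Runnable, String, long stackSize):
            // a null group means "use the current thread's group".
            threads[i] = new Thread(null, () -> done.incrementAndGet(),
                                    "worker-" + i, stackBytes);
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop waiting.
                Thread.currentThread().interrupt();
                break;
            }
        }
        return done.get();
    }

    public static void main(String[] args) {
        System.out.println(runAll(8, 256 * 1024) + " threads finished");
    }
}
```

This does not affect threads created by the JVM or by libraries, so it
complements rather than replaces -Xss or ulimit -s.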
-- Ken
<property>
<name>mapred.child.java.opts</name>
<value>-Xmx200m</value>
<description>Java opts for the task tracker child processes.
The following symbol, if present, will be interpolated: @taskid@ is replaced
by current TaskID. Any other occurrences of '@' will go unchanged.
For example, to enable verbose gc logging to a file named for the taskid in
/tmp and to set the heap maximum to be a gigabyte, pass a 'value' of:
-Xmx1024m -verbose:gc -Xloggc:/tmp/@tas...@.gc
The configuration variable mapred.child.ulimit can be used to control the
maximum virtual memory of the child processes.
</description>
</property>
On Fri, May 8, 2009 at 11:16 AM, Ken Krugler <kkrugler_li...@transpac.com> wrote:
> Hi there,
>
> For a very specific type of reduce task, we currently need to use a large
> number of threads.
>
> To avoid running out of memory, I'd like to constrain the Linux stack size
> via a "ulimit -s xxx" shell script command before starting up the JVM. I
> could do this for the entire system at boot time, but it would be better to
> have it for just the Hadoop JVM(s).
>
> Any suggestions for how best to handle this?
>
> Thanks,
>
> -- Ken
--
Ken Krugler