
On the page http://hadoop.apache.org/common/docs/r0.20.0/cluster_setup.html

there are the following instructions:
"For example, To configure Namenode to use parallelGC, the following
statement should be added in hadoop-env.sh:
export HADOOP_NAMENODE_OPTS="-XX:+UseParallelGC ${HADOOP_NAMENODE_OPTS}""

Basically that's fine. But since hadoop-env.sh is sourced several times (by
other scripts) when the cluster is started with "start-all.sh", the
namenode process looks like this even without any additional configs
(please notice the multiple -Dcom.sun.management.jmxremote options):

hdfs     27039     1  0 12:56 pts/0    00:00:03 /usr/local/java/bin/java
-Dproc_namenode -Xmx1000m -Dcom.sun.management.jmxremote
-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote
-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote
-Dcom.sun.management.jmxremote -Dhadoop.log.dir=/logs/hadoop/logs... etc
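The accumulation can be reproduced outside Hadoop. The following is only a minimal sketch: the temporary file is a hypothetical stand-in for hadoop-env.sh, and the loop count of three is an arbitrary assumption (the real start scripts source the file a different number of times).

```shell
# Hypothetical stand-in for hadoop-env.sh, containing only the self-appending line.
env_file=$(mktemp)
cat > "$env_file" <<'EOF'
export HADOOP_NAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_NAMENODE_OPTS"
EOF

# Start from a clean environment, then source the file three times,
# roughly as nested start scripts would (the count is an assumption).
unset HADOOP_NAMENODE_OPTS
for i in 1 2 3; do
  . "$env_file"
done

# The option now appears once per sourcing.
echo "$HADOOP_NAMENODE_OPTS"
rm -f "$env_file"
```

Each pass prepends the option to whatever the variable already held, so the value grows with every sourcing.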

And if one adds a bunch of configs for each service, the process list
becomes quite lengthy...

So, is there any particular reason not to leave out ${HADOOP_NAMENODE_OPTS}
from the above example and from the default hadoop-env.sh which comes with
vanilla and Cloudera's Hadoop?
The line would look like this in the above example:
export HADOOP_NAMENODE_OPTS="-XX:+UseParallelGC"

and like this in the default hadoop-env.sh file:
export HADOOP_NAMENODE_OPTS="-Dcom.sun.management.jmxremote"

The same issue applies to these as well:
export HADOOP_SECONDARYNAMENODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_SECONDARYNAMENODE_OPTS"
export HADOOP_DATANODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_DATANODE_OPTS"
export HADOOP_BALANCER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_BALANCER_OPTS"
export HADOOP_JOBTRACKER_OPTS="-Dcom.sun.management.jmxremote $HADOOP_JOBTRACKER_OPTS"
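To make the trade-off concrete, here is a rough sketch of how the non-appending form behaves under repeated sourcing, next to a guarded variant. Both temporary files are hypothetical stand-ins for hadoop-env.sh, the pre-set "-Xmx2g" value is an assumption for illustration, and the guard is just one possible approach, not something taken from the Hadoop documentation.

```shell
# Variant 1: the non-appending form. Sourcing it repeatedly is harmless,
# but any value exported earlier is overwritten.
plain_file=$(mktemp)
cat > "$plain_file" <<'EOF'
export HADOOP_NAMENODE_OPTS="-XX:+UseParallelGC"
EOF

export HADOOP_NAMENODE_OPTS="-Xmx2g"   # assumed pre-existing setting
for i in 1 2 3; do . "$plain_file"; done
echo "$HADOOP_NAMENODE_OPTS"           # prints: -XX:+UseParallelGC

# Variant 2: a guard that prepends the option only if it is not already
# present, so earlier settings survive and nothing is duplicated.
guard_file=$(mktemp)
cat > "$guard_file" <<'EOF'
case " $HADOOP_DATANODE_OPTS " in
  *" -Dcom.sun.management.jmxremote "*) ;;  # already present, do nothing
  *) export HADOOP_DATANODE_OPTS="-Dcom.sun.management.jmxremote $HADOOP_DATANODE_OPTS" ;;
esac
EOF

export HADOOP_DATANODE_OPTS="-Xmx2g"   # assumed pre-existing setting
for i in 1 2 3; do . "$guard_file"; done
echo "$HADOOP_DATANODE_OPTS"           # prints: -Dcom.sun.management.jmxremote -Xmx2g

rm -f "$plain_file" "$guard_file"
```

The first variant keeps the process list short at the cost of discarding options set elsewhere; the second keeps them, which may be the reason the variable reference is there in the first place.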

Or am I missing some point here? :)

br, Ossi

