[ https://issues.apache.org/jira/browse/HADOOP-10245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13877905#comment-13877905 ]
shanyu zhao commented on HADOOP-10245:
--------------------------------------

[~ywskycn] Thank you for your comment!

If we remove -Xmx512m from HADOOP_CLIENT_OPTS in hadoop-env.cmd, there will be exactly one -Xmx on the command line: the $JAVA_HEAP_MAX in bin/hadoop.

HADOOP-9870 may have solved the problem for you, but I think the fix in HADOOP-9870 might be too complicated and hard to maintain. For example, what if a user puts "-Xmx" in HADOOP_OPTS instead of HADOOP_CLIENT_OPTS?

I think we should avoid using HADOOP_CLIENT_OPTS or HADOOP_OPTS to specify memory, because it is confusing to define HADOOP_HEAPSIZE and then not use it for memory settings. If you want to change the heap size, just change HADOOP_HEAPSIZE. I think this is simple and clear.

Thoughts?

> Hadoop command line always appends "-Xmx" option twice
> ------------------------------------------------------
>
>                 Key: HADOOP-10245
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10245
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: bin
>    Affects Versions: 2.2.0
>            Reporter: shanyu zhao
>            Assignee: shanyu zhao
>         Attachments: HADOOP-10245.patch
>
>
> The Hadoop command line scripts (hadoop.sh or hadoop.cmd) call java with
> the "-Xmx" option twice. As a result, any user-defined HADOOP_HEAPSIZE
> env variable has no effect, because it is overridden by the second
> "-Xmx" option.
> For example, here is the java command generated for "hadoop fs -ls /".
> Notice that there are two "-Xmx" options, "-Xmx1000m" and "-Xmx512m", in the
> command line:
> java -Xmx1000m -Dhadoop.log.dir=C:\tmp\logs -Dhadoop.log.file=hadoop.log
> -Dhadoop.root.logger=INFO,console,DRFA -Xmx512m -Dhadoop.security.logger=INFO,RFAS -classpath XXX
> org.apache.hadoop.fs.FsShell -ls /
> Here is the root cause:
> The call flow is: hadoop.sh calls hadoop-config.sh, which in turn calls
> hadoop-env.sh.
> In hadoop.sh, the command line is generated by the following pseudo code:
> java $JAVA_HEAP_MAX $HADOOP_CLIENT_OPTS -classpath ...
> In hadoop-config.sh, $JAVA_HEAP_MAX is initialized as "-Xmx1000m" if the user
> didn't set the $HADOOP_HEAPSIZE env variable.
> In hadoop-env.sh, $HADOOP_CLIENT_OPTS is set like this:
> export HADOOP_CLIENT_OPTS="-Xmx512m $HADOOP_CLIENT_OPTS"
> To fix this problem, we should remove the "-Xmx512m" from HADOOP_CLIENT_OPTS.
> If we really want to change the memory settings, we should use the
> $HADOOP_HEAPSIZE env variable.
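
To make the call flow in the description concrete, here is a minimal bash sketch of how the options are assembled. It is a simplified model, not the actual Hadoop scripts: build_java_opts, count_xmx and last_xmx are hypothetical helpers introduced only for illustration, and the 2048 heap size is an arbitrary example value. It assumes, as the issue describes, that the later "-Xmx" on the command line is the one that takes effect.

```shell
#!/usr/bin/env bash
# Simplified model of the option assembly described in the issue.
# All function names here are hypothetical; they are not part of Hadoop.

# Mimics hadoop-config.sh + hadoop.sh: pick the heap option, then append
# the client options after it.
build_java_opts() {
  local heapsize="$1"      # stands in for HADOOP_HEAPSIZE (in MB), may be empty
  local client_opts="$2"   # stands in for HADOOP_CLIENT_OPTS from hadoop-env.sh

  # Default heap, overridden when a heap size is supplied
  local java_heap_max="-Xmx1000m"
  if [ -n "$heapsize" ]; then
    java_heap_max="-Xmx${heapsize}m"
  fi

  # Heap option first, client options second (same order as the pseudo code)
  echo "$java_heap_max $client_opts"
}

# Count how many -Xmx options appear in an option string.
count_xmx() {
  local n=0 opt
  for opt in $1; do
    case "$opt" in
      -Xmx*) n=$((n + 1)) ;;
    esac
  done
  echo "$n"
}

# Print the last -Xmx option in an option string.
last_xmx() {
  local last="" opt
  for opt in $1; do
    case "$opt" in
      -Xmx*) last="$opt" ;;
    esac
  done
  echo "$last"
}

# Current behaviour: hadoop-env.sh prepends -Xmx512m to HADOOP_CLIENT_OPTS.
before=$(build_java_opts "2048" "-Xmx512m -Dhadoop.security.logger=INFO,RFAS")

# Proposed behaviour: -Xmx512m removed from HADOOP_CLIENT_OPTS.
after=$(build_java_opts "2048" "-Dhadoop.security.logger=INFO,RFAS")

echo "before: $before (last heap option: $(last_xmx "$before"))"
echo "after:  $after (last heap option: $(last_xmx "$after"))"
```

In this model the "before" string carries two "-Xmx" options and the trailing one is "-Xmx512m", so the requested 2048 MB is lost. The "after" string carries only the option derived from the heap size setting.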