[ https://issues.apache.org/jira/browse/HADOOP-6664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12859044#action_12859044 ]
Edward Capriolo commented on HADOOP-6664:
-----------------------------------------

If I understand correctly, the docs for "current" are based on the current stable release, 0.20.2. Current stable does not use fs.inmemory.size.mb.

http://hadoop.apache.org/common/docs/current/cluster_setup.html

Under "Real-World Cluster Configurations":
{noformat}
conf/core-site.xml    fs.inmemory.size.mb    200    Larger amount of memory allocated for the in-memory file-system used to merge map-outputs at the reduces.
{noformat}

As to "io.sort.factor and io.sort.mb": they both appear in mapred-default.xml.
{noformat}
[edw...@ec src]$ grep -R "io.sort.factor" */*.xml
mapred/mapred-default.xml:  <name>io.sort.factor</name>
{noformat}

They should be in core-default.xml (only), or in both core-default.xml and mapred-default.xml.

Think about the end user. An end user might read a blog that states, "io.sort.factor is a magic tunable; set this to XXXX for awesome performance". Which file should the end user put this variable in?
{noformat}
grep -R "io.sort.factor" */*.xml
mapred/mapred-default.xml:  <name>io.sort.factor</name>
{noformat}

The end user thinks, "Since I found this variable in mapred-default.xml, it makes sense that I should override it in mapred-site.xml." The user puts the variable in the wrong place, because the end user has no (easy) way of knowing that SequenceFile uses io.sort.factor or io.sort.mb.

Does that make sense?

> fs.inmemory.size.mb not listed in conf. Cluster setup page gives wrong advice.
> ------------------------------------------------------------------------------
>
>                 Key: HADOOP-6664
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6664
>             Project: Hadoop Common
>          Issue Type: Task
>          Components: conf, documentation
>    Affects Versions: 0.20.2
>            Reporter: Edward Capriolo
>
> http://hadoop.apache.org/common/docs/current/cluster_setup.html
> fs.inmemory.size.mb does not appear in any xml file
> {noformat}
> grep "fs.inmemory.size.mb" ./mapred/mapred-default.xml
> [edw...@ec src]$ grep "fs.inmemory.size.mb" ./hdfs/hdfs-default.xml
> [edw...@ec src]$ grep "fs.inmemory.size.mb" ./core/core-default.xml
> {noformat}
> http://hadoop.apache.org/common/docs/current/cluster_setup.html
> Documentation error:
> Real-World Cluster Configurations
> {noformat}
> conf/core-site.xml    io.sort.factor    100    More streams merged at once while sorting files.
> conf/core-site.xml    io.sort.mb        200    Higher memory-limit while sorting data.
> {noformat}
> core --- io.sort.factor -- should be mapred
> core --- io.sort.mb -- should be mapred

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
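
The lookup an end user would have to do by hand in this thread (finding which default file declares a property) can be sketched as a small shell helper. This is only an illustration, not part of Hadoop: the function name `find_prop` is hypothetical, it assumes each property's `<name>` element sits on a single line as in the grep output above, and it assumes a grep that supports `--include` (GNU or BSD grep).

```shell
# Hypothetical helper (not shipped with Hadoop): print the *-default.xml
# files under a source tree that declare the given property name.
# Usage: find_prop PROPERTY [ROOT_DIR]
find_prop() {
  # -R: recurse; -l: print matching file names only;
  # -F: treat the pattern as a fixed string so the dots are literal.
  grep -R -l -F --include='*-default.xml' "<name>$1</name>" "${2:-.}"
}
```

Run from the source root, `find_prop io.sort.factor .` lists the default files that declare the property. When no file declares it, as the issue reports for fs.inmemory.size.mb, the function prints nothing and returns a non-zero status. Note that it only shows where a property is declared, not which component actually reads it, which is the gap this comment points out.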