[ https://issues.apache.org/jira/browse/MAPREDUCE-3182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13968473#comment-13968473 ]
Chen He commented on MAPREDUCE-3182:
------------------------------------

There are two GenericLoadGenerator classes in the current Hadoop source code.

One is under the org.apache.hadoop.mapreduce package. It has two documentation problems. First, it does not actually parse the "-m" command line option but still shows this option in the "Usage" message. Second, if the user does not specify an input directory, it creates input data using RandomWriter with the default settings (10GB per map task and 10 map tasks per node). However, the "Usage" message does not mention this behavior.

The other is under the org.apache.hadoop.mapred package. It is an older version of GenericLoadGenerator and has the second documentation problem described above.

> loadgen ignores -m command line when writing random data
> --------------------------------------------------------
>
>                 Key: MAPREDUCE-3182
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3182
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2, test
>    Affects Versions: 0.23.0, 0.24.0, 2.3.0
>            Reporter: Jonathan Eagles
>            Assignee: Chen He
>
> If no input directories are specified, loadgen goes into a special mode where
> random data is generated and written. In that mode, setting the number of
> mappers (-m command line option) is overridden by a calculation. Instead, it
> should take into consideration the user specified number of mappers and fall
> back to the calculation. In addition, update the documentation as well to
> match the new behavior in the code.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
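
For illustration only, the behavior requested in the issue description (honor a user-specified -m value and fall back to the calculation otherwise) could be sketched roughly as below. This is not the actual loadgen code or the committed patch: the class NumMapsSketch, the method resolveNumMaps, and its parameters are hypothetical names, and the total-bytes-divided-by-bytes-per-map formula is only an assumed stand-in for whatever calculation loadgen performs.

```java
// Hypothetical sketch, not taken from the Hadoop source tree.
class NumMapsSketch {

    /**
     * Returns the number of map tasks to use in random-data mode.
     *
     * @param userNumMaps       value parsed from "-m"; zero or negative means "not specified"
     * @param totalBytesToWrite assumed total amount of random data to generate
     * @param bytesPerMap       assumed amount of data written by each map task
     */
    static int resolveNumMaps(int userNumMaps, long totalBytesToWrite, long bytesPerMap) {
        if (bytesPerMap <= 0) {
            throw new IllegalArgumentException("bytesPerMap must be positive");
        }
        // Prefer the user's explicit request when one was given.
        if (userNumMaps > 0) {
            return userNumMaps;
        }
        // Otherwise fall back to the calculated value, clamped to the int range.
        long calculated = totalBytesToWrite / bytesPerMap;
        if (calculated == 0 && totalBytesToWrite > 0) {
            calculated = 1; // still need one map to write the small remainder
        }
        return (int) Math.min(calculated, Integer.MAX_VALUE);
    }

    public static void main(String[] args) {
        // User asked for 4 maps: the request wins over the calculation.
        System.out.println(resolveNumMaps(4, 100L, 10L));
        // No "-m" given: the calculated value is used instead.
        System.out.println(resolveNumMaps(0, 100L, 10L));
    }
}
```

In a real patch the resolved value would presumably be passed to the job configuration, and the "Usage" text would be updated to describe both the -m handling and the random-data mode.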