documentation lists options in wrong order
------------------------------------------

                 Key: HADOOP-7220
                 URL: https://issues.apache.org/jira/browse/HADOOP-7220
             Project: Hadoop Common
          Issue Type: Bug
            Reporter: Dieter Plaetinck
            Priority: Minor


On http://hadoop.apache.org/common/docs/r0.20.2/streaming.html, various examples 
use -D flags.

I noticed that if you invoke hadoop the way those examples do, with the -D 
flags after the streaming options, it doesn't work.

========================
dplaetin@n-0:/usr/local/hadoop/bin$ ./hadoop jar \
    /usr/local/hadoop/contrib/streaming/hadoop-0.20.2-streaming.jar \
    -file /proj/Search/wall/experiment/ \
    -mapper './build-models.py --mapper' \
    -reducer './build-models.py --reducer' \
    -input sim-input -output sim-output \
    -D mapred.output.key.comparator.class=org.apache.hadoop.mapred.lib.KeyFieldBasedComparator \
    -D mapred.text.key.comparator.options=-k1,2n
11/04/12 10:39:28 ERROR streaming.StreamJob: Unrecognized option: -D

Usage: $HADOOP_HOME/bin/hadoop jar \
          $HADOOP_HOME/hadoop-streaming.jar [options]
Options:
  -input    <path>     DFS input file(s) for the Map step
  -output   <path>     DFS output directory for the Reduce step
  -mapper   <cmd|JavaClassName>      The streaming command to run
  -combiner <JavaClassName> Combiner has to be a Java class
  -reducer  <cmd|JavaClassName>      The streaming command to run
  -file     <file>     File/dir to be shipped in the Job jar file
  -inputformat TextInputFormat(default)|SequenceFileAsTextInputFormat|JavaClassName Optional.
  -outputformat TextOutputFormat(default)|JavaClassName  Optional.
  -partitioner JavaClassName  Optional.
  -numReduceTasks <num>  Optional.
  -inputreader <spec>  Optional.
  -cmdenv   <n>=<v>    Optional. Pass env.var to streaming commands
  -mapdebug <path>  Optional. To run this script when a map task fails 
  -reducedebug <path>  Optional. To run this script when a reduce task fails 
  -verbose

Generic options supported are
-conf <configuration file>     specify an application configuration file
-D <property=value>            use value for given property
-fs <local|namenode:port>      specify a namenode
-jt <local|jobtracker:port>    specify a job tracker
-files <comma separated list of files>    specify comma separated files to be copied to the map reduce cluster
-libjars <comma separated list of jars>    specify comma separated jar files to include in the classpath.
-archives <comma separated list of archives>    specify comma separated archives to be unarchived on the compute machines.

The general command line syntax is
bin/hadoop command [genericOptions] [commandOptions]


For more details about these options:
Use $HADOOP_HOME/bin/hadoop jar build/hadoop-streaming.jar -info

Streaming Job Failed!
========================


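I could only make it work by moving the -D flags to the front (right after 
the streaming jar). Presumably, because -D is a generic option, it has to come 
before the streaming options. That would match the "bin/hadoop command 
[genericOptions] [commandOptions]" syntax printed in the usage output above.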
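
For reference, here is a rough sketch of the reordered invocation, with the 
same paths and options as the failing command above and only the -D flags 
moved to the front. The DRY_RUN variable is a hypothetical switch added for 
this sketch and is not part of Hadoop: by default the script only prints the 
assembled command, and DRY_RUN=0 actually runs it.

```shell
#!/bin/sh
# Paths are the ones from the failing invocation; adjust for your install.
HADOOP=/usr/local/hadoop/bin/hadoop
STREAMING_JAR=/usr/local/hadoop/contrib/streaming/hadoop-0.20.2-streaming.jar

# Generic options (-D here) go first, directly after the jar.
# Streaming options (-file, -mapper, ...) follow them.
set -- \
    -D mapred.output.key.comparator.class=org.apache.hadoop.mapred.lib.KeyFieldBasedComparator \
    -D mapred.text.key.comparator.options=-k1,2n \
    -file /proj/Search/wall/experiment/ \
    -mapper './build-models.py --mapper' \
    -reducer './build-models.py --reducer' \
    -input sim-input \
    -output sim-output

# DRY_RUN is a hypothetical switch for this sketch, not a Hadoop feature.
if [ "${DRY_RUN:-1}" = 1 ]; then
    # Preview only: the printed line loses the quoting around the
    # -mapper and -reducer values, so do not copy-paste it verbatim.
    echo "$HADOOP jar $STREAMING_JAR $*"
else
    "$HADOOP" jar "$STREAMING_JAR" "$@"
fi
```

This is only the ordering that worked for me on 0.20.2; I have not checked 
whether other versions behave the same way.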

