[ https://issues.apache.org/jira/browse/MAHOUT-1034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13409560#comment-13409560 ]
jayghost edited comment on MAHOUT-1034 at 7/9/12 3:22 PM:
----------------------------------------------------------

I tried to use -D numLabels=500 as a generic option, but it shows another error.

{hadoop@master:~/program/mahout-distribution-0.7$ bin/mahout trainnb -D numLabels=5000 -i ~/Downloads/20news-bydate/20news-bydate-train-vectors/tfidf-vectors -o ~/Downloads/20news-bydate/model/ -el -li ~/Downloads/20news-bydate/labelindex -ow
MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
Warning: $HADOOP_HOME is deprecated.
Running on hadoop, using /home/hadoop/program/hadoop-1.0.1/bin/hadoop and HADOOP_CONF_DIR=/home/hadoop/program/hadoop-1.0.1/conf
MAHOUT-JOB: /home/hadoop/program/mahout-distribution-0.7/mahout-examples-0.7-job.jar
Warning: $HADOOP_HOME is deprecated.
12/07/09 23:18:27 WARN driver.MahoutDriver: No trainnb.props found on classpath, will use command-line arguments only
12/07/09 23:18:27 ERROR common.AbstractJob: Unexpected /home/hadoop/Downloads/20news-bydate/model/ while processing Job-Specific Options:
usage: <command> [Generic Options] [Job-Specific Options]
Generic Options:
 -archives <paths>              comma separated archives to be unarchived on the compute machines.
 -conf <configuration file>     specify an application configuration file
 -D <property=value>            use value for given property
 -files <paths>                 comma separated files to be copied to the map reduce cluster
 -fs <local|namenode:port>      specify a namenode
 -jt <local|jobtracker:port>    specify a job tracker
 -libjars <paths>               comma separated jar files to include in the classpath.
 -tokenCacheFile <tokensFile>   name of the file with the tokens
Unexpected /home/hadoop/Downloads/20news-bydate/model/ while processing Job-Specific Options:
Usage:
 [--input <input> --output <output> --labels <labels> --extractLabels --alphaI <alphaI> --trainComplementary --labelIndex <labelIndex> --overwrite --help --tempDir <tempDir> --startPhase <startPhase> --endPhase <endPhase>]
Job-Specific Options:
  --input (-i) input                  Path to job input directory.
  --output (-o) output                The directory pathname for output.
  --labels (-l) labels                comma-separated list of labels to include in training
  --extractLabels (-el)               Extract the labels from the input
  --alphaI (-a) alphaI                smoothing parameter
  --trainComplementary (-c)           train complementary?
  --labelIndex (-li) labelIndex       The path to store the label index in
  --overwrite (-ow)                   If present, overwrite the output directory before running job
  --help (-h)                         Print out help
  --tempDir tempDir                   Intermediate output directory
  --startPhase startPhase             First phase to run
  --endPhase endPhase                 Last phase to run
12/07/09 23:18:27 INFO driver.MahoutDriver: Program took 436 ms (Minutes: 0.007266666666666667)}

How can I add the numLabels option? Please help! Thanks!

was (Author: jayghost):
I tried to use -D numLabels=500 as a generic option, but it shows another error.

{hadoop@master:~/program/mahout-distribution-0.7$ bin/mahout trainnb -D numLabels=5000 -i ~/Downloads/20news-bydate/20news-bydate-train-vectors/tfidf-vectors -o ~/Downloads/20news-bydate/model -el -li ~/Downloads/20news-bydate/labelindex -ow
MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
Warning: $HADOOP_HOME is deprecated.
Running on hadoop, using /home/hadoop/program/hadoop-1.0.1/bin/hadoop and HADOOP_CONF_DIR=/home/hadoop/program/hadoop-1.0.1/conf
MAHOUT-JOB: /home/hadoop/program/mahout-distribution-0.7/mahout-examples-0.7-job.jar
Warning: $HADOOP_HOME is deprecated.
12/07/09 23:18:27 WARN driver.MahoutDriver: No trainnb.props found on classpath, will use command-line arguments only
12/07/09 23:18:27 ERROR common.AbstractJob: Unexpected /home/hadoop/Downloads/20news-bydate/model while processing Job-Specific Options:
usage: <command> [Generic Options] [Job-Specific Options]
Generic Options:
 -archives <paths>              comma separated archives to be unarchived on the compute machines.
 -conf <configuration file>     specify an application configuration file
 -D <property=value>            use value for given property
 -files <paths>                 comma separated files to be copied to the map reduce cluster
 -fs <local|namenode:port>      specify a namenode
 -jt <local|jobtracker:port>    specify a job tracker
 -libjars <paths>               comma separated jar files to include in the classpath.
 -tokenCacheFile <tokensFile>   name of the file with the tokens
Unexpected /home/hadoop/Downloads/20news-bydate/model/ while processing Job-Specific Options:
Usage:
 [--input <input> --output <output> --labels <labels> --extractLabels --alphaI <alphaI> --trainComplementary --labelIndex <labelIndex> --overwrite --help --tempDir <tempDir> --startPhase <startPhase> --endPhase <endPhase>]
Job-Specific Options:
  --input (-i) input                  Path to job input directory.
  --output (-o) output                The directory pathname for output.
  --labels (-l) labels                comma-separated list of labels to include in training
  --extractLabels (-el)               Extract the labels from the input
  --alphaI (-a) alphaI                smoothing parameter
  --trainComplementary (-c)           train complementary?
  --labelIndex (-li) labelIndex       The path to store the label index in
  --overwrite (-ow)                   If present, overwrite the output directory before running job
  --help (-h)                         Print out help
  --tempDir tempDir                   Intermediate output directory
  --startPhase startPhase             First phase to run
  --endPhase endPhase                 Last phase to run
12/07/09 23:18:27 INFO driver.MahoutDriver: Program took 436 ms (Minutes: 0.007266666666666667) }

How can I add the numLabels option? Please help! Thanks!

> ERROR in Navie Bayes Training(trainnb)
> --------------------------------------
>
>                 Key: MAHOUT-1034
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1034
>             Project: Mahout
>          Issue Type: Bug
>          Components: Classification
>    Affects Versions: 0.7
>         Environment: Ubuntu 11.04
>            Reporter: Leting Wu
>            Priority: Critical
>
> When running either examples/classify-20newsgrouops.sh or ash-email-examples.sh, trainnb always fails:
> {noformat}
> INFO mapred.JobClient: Task Id : attempt_201206281546_0003_m_000000_0, Status : FAILED
> java.lang.IllegalArgumentException
>       at com.google.common.base.Preconditions.checkArgument(Preconditions.java:72)
>       at org.apache.mahout.classifier.naivebayes.training.WeightsMapper.setup(WeightsMapper.java:42)
>       at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
>       at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
>       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
>       at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:396)
>       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1177)
>       at org.apache.hadoop.mapred.Child.main(Child.java:264)
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira