[ https://issues.apache.org/jira/browse/MAHOUT-1034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13409410#comment-13409410 ]
jayghost commented on MAHOUT-1034:
----------------------------------

Hi Leting Wu, did you solve the problem? I hit the same error as you.
I use Hadoop 1.0.1 and Mahout 0.7, with one Ubuntu 12.04 machine as the namenode and two Ubuntu 10.10 machines as datanodes.
I ran classify-20newsgroups.sh step by step. It failed at the "./bin/mahout trainnb" step with the same error as yours.
I have to run in a Hadoop cluster environment. I think the problem is in the 'trainnb' command.
Can anybody help me?

{
hadoop@master:~$ cd program/mahout-distribution-0.7/
hadoop@master:~/program/mahout-distribution-0.7$ bin/mahout trainnb -i ~/Downloads/20news-bydate/20news-bydate-test-vectors/tfidf-vectors -el -o ~/Downloads/20news-bydate/model -li ~/Downloads/20news-bydate/labelindex -ow
MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
Warning: $HADOOP_HOME is deprecated.
Running on hadoop, using /home/hadoop/program/hadoop-1.0.1/bin/hadoop and HADOOP_CONF_DIR=/home/hadoop/program/hadoop-1.0.1/conf
MAHOUT-JOB: /home/hadoop/program/mahout-distribution-0.7/mahout-examples-0.7-job.jar
Warning: $HADOOP_HOME is deprecated.
12/07/09 20:43:56 WARN driver.MahoutDriver: No trainnb.props found on classpath, will use command-line arguments only
12/07/09 20:43:57 INFO common.AbstractJob: Command line arguments: {--alphaI=[1.0], --endPhase=[2147483647], --extractLabels=null, --input=[/home/hadoop/Downloads/20news-bydate/20news-bydate-test-vectors/tfidf-vectors], --labelIndex=[/home/hadoop/Downloads/20news-bydate/labelindex], --output=[/home/hadoop/Downloads/20news-bydate/model], --overwrite=null, --startPhase=[0], --tempDir=[temp]}
12/07/09 20:44:03 INFO common.HadoopUtil: Deleting temp
****/home/hadoop/Downloads/20news-bydate/20news-bydate-test-vectors/tfidf-vectors
12/07/09 20:44:19 INFO input.FileInputFormat: Total input paths to process : 1
12/07/09 20:44:21 INFO mapred.JobClient: Running job: job_201207092040_0001
12/07/09 20:44:22 INFO mapred.JobClient: map 0% reduce 0%
12/07/09 20:46:55 INFO mapred.JobClient: map 100% reduce 0%
12/07/09 20:47:36 INFO mapred.JobClient: map 100% reduce 100%
12/07/09 20:47:41 INFO mapred.JobClient: Job complete: job_201207092040_0001
12/07/09 20:47:41 INFO mapred.JobClient: Counters: 29
12/07/09 20:47:41 INFO mapred.JobClient: Job Counters
12/07/09 20:47:41 INFO mapred.JobClient: Launched reduce tasks=1
12/07/09 20:47:41 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=63099
12/07/09 20:47:41 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
12/07/09 20:47:41 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
12/07/09 20:47:41 INFO mapred.JobClient: Launched map tasks=1
12/07/09 20:47:41 INFO mapred.JobClient: Data-local map tasks=1
12/07/09 20:47:41 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=38314
12/07/09 20:47:41 INFO mapred.JobClient: File Output Format Counters
12/07/09 20:47:41 INFO mapred.JobClient: Bytes Written=97
12/07/09 20:47:41 INFO mapred.JobClient: FileSystemCounters
12/07/09 20:47:41 INFO mapred.JobClient: FILE_BYTES_READ=22
12/07/09 20:47:41 INFO mapred.JobClient: HDFS_BYTES_READ=348
12/07/09 20:47:41 INFO mapred.JobClient: FILE_BYTES_WRITTEN=45839
12/07/09 20:47:41 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=97
12/07/09 20:47:41 INFO mapred.JobClient: File Input Format Counters
12/07/09 20:47:41 INFO mapred.JobClient: Bytes Read=90
12/07/09 20:47:41 INFO mapred.JobClient: Map-Reduce Framework
12/07/09 20:47:41 INFO mapred.JobClient: Map output materialized bytes=14
12/07/09 20:47:41 INFO mapred.JobClient: Map input records=0
12/07/09 20:47:41 INFO mapred.JobClient: Reduce shuffle bytes=14
12/07/09 20:47:41 INFO mapred.JobClient: Spilled Records=0
12/07/09 20:47:41 INFO mapred.JobClient: Map output bytes=0
12/07/09 20:47:41 INFO mapred.JobClient: CPU time spent (ms)=7210
12/07/09 20:47:41 INFO mapred.JobClient: Total committed heap usage (bytes)=207880192
12/07/09 20:47:41 INFO mapred.JobClient: Combine input records=0
12/07/09 20:47:41 INFO mapred.JobClient: SPLIT_RAW_BYTES=173
12/07/09 20:47:41 INFO mapred.JobClient: Reduce input records=0
12/07/09 20:47:41 INFO mapred.JobClient: Reduce input groups=0
12/07/09 20:47:41 INFO mapred.JobClient: Combine output records=0
12/07/09 20:47:41 INFO mapred.JobClient: Physical memory (bytes) snapshot=178298880
12/07/09 20:47:41 INFO mapred.JobClient: Reduce output records=0
12/07/09 20:47:41 INFO mapred.JobClient: Virtual memory (bytes) snapshot=752775168
12/07/09 20:47:41 INFO mapred.JobClient: Map output records=0
****temp/summedObservations
12/07/09 20:47:51 INFO input.FileInputFormat: Total input paths to process : 1
12/07/09 20:47:52 INFO mapred.JobClient: Running job: job_201207092040_0002
12/07/09 20:47:53 INFO mapred.JobClient: map 0% reduce 0%
12/07/09 20:49:14 INFO mapred.JobClient: Task Id : attempt_201207092040_0002_m_000000_0, Status : FAILED
java.lang.IllegalArgumentException
    at com.google.common.base.Preconditions.checkArgument(Preconditions.java:72)
    at org.apache.mahout.classifier.naivebayes.training.WeightsMapper.setup(WeightsMapper.java:42)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
12/07/09 20:50:19 INFO mapred.JobClient: Task Id : attempt_201207092040_0002_m_000000_1, Status : FAILED
java.lang.IllegalArgumentException
    at com.google.common.base.Preconditions.checkArgument(Preconditions.java:72)
    at org.apache.mahout.classifier.naivebayes.training.WeightsMapper.setup(WeightsMapper.java:42)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
12/07/09 20:50:37 INFO mapred.JobClient: Task Id : attempt_201207092040_0002_m_000000_2, Status : FAILED
java.lang.IllegalArgumentException
    at com.google.common.base.Preconditions.checkArgument(Preconditions.java:72)
    at org.apache.mahout.classifier.naivebayes.training.WeightsMapper.setup(WeightsMapper.java:42)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
attempt_201207092040_0002_m_000000_2: log4j:WARN No appenders could be found for logger (org.apache.hadoop.mapred.Task).
attempt_201207092040_0002_m_000000_2: log4j:WARN Please initialize the log4j system properly.
12/07/09 20:51:09 INFO mapred.JobClient: Job complete: job_201207092040_0002
12/07/09 20:51:09 INFO mapred.JobClient: Counters: 7
12/07/09 20:51:09 INFO mapred.JobClient: Job Counters
12/07/09 20:51:09 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=119359
12/07/09 20:51:09 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
12/07/09 20:51:09 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
12/07/09 20:51:09 INFO mapred.JobClient: Launched map tasks=4
12/07/09 20:51:09 INFO mapred.JobClient: Data-local map tasks=4
12/07/09 20:51:09 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=0
12/07/09 20:51:09 INFO mapred.JobClient: Failed map tasks=1
12/07/09 20:51:09 INFO driver.MahoutDriver: Program took 433140 ms (Minutes: 7.219)
}

Thanks!

> ERROR in Navie Bayes Training(trainnb)
> --------------------------------------
>
> Key: MAHOUT-1034
> URL: https://issues.apache.org/jira/browse/MAHOUT-1034
> Project: Mahout
> Issue Type: Bug
> Components: Classification
> Affects Versions: 0.7
> Environment: Ubuntu 11.04
> Reporter: Leting Wu
> Priority: Critical
>
> When running either examples/classify-20newsgroups.sh or asf-email-examples.sh, trainnb always fails:
> {noformat}
> INFO mapred.JobClient: Task Id : attempt_201206281546_0003_m_000000_0, Status : FAILED
> java.lang.IllegalArgumentException
>     at com.google.common.base.Preconditions.checkArgument(Preconditions.java:72)
>     at org.apache.mahout.classifier.naivebayes.training.WeightsMapper.setup(WeightsMapper.java:42)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1177)
>     at org.apache.hadoop.mapred.Child.main(Child.java:264)
> {noformat}