On Tue, May 10, 2011 at 8:24 AM, Sean Owen <[email protected]> wrote:

> I peeked in the examples job jar and it definitely does have this class,
> along with the other dependencies (after my patch). Double-check that you've
> done the clean build and "install" again? And maybe even print out MAHOUT_JOB
> in the script to double-check what it is using?

[jake@smf1-ady-15-sr1 bla]$ jar -tf mahout-examples-0.5-SNAPSHOT-job.jar | grep "/Analyzer.class"
org/apache/lucene/analysis/Analyzer.class

[swap exec for echo in last line of bin/mahout ]

[jake@smf1-ady-15-sr1 mahout-distribution-0.5-SNAPSHOT]$ ./bin/mahout
Running on hadoop, using HADOOP_HOME=/usr/lib/hadoop-0.20
No HADOOP_CONF_DIR set, using /usr/lib/hadoop-0.20/src/conf
/usr/lib/hadoop-0.20/bin/hadoop jar /home/jake/mahout-distribution-0.5-SNAPSHOT/mahout-examples-0.5-SNAPSHOT-job.jar org.apache.mahout.driver.MahoutDriver

:\

> On Tue, May 10, 2011 at 12:40 AM, Jake Mannix <[email protected]> wrote:
>
> > wah. Even trying to do seq2sparse doesn't work for me:
> >
> > [jake@smf1-ady-15-sr1 mahout-distribution-0.5-SNAPSHOT]$ ./bin/mahout seq2sparse -i hdfs://<namenode>/user/jake/text_temp -o hdfs://<namenode>/user/jake/text_vectors_temp
> > Running on hadoop, using HADOOP_HOME=/usr/lib/hadoop-0.20
> > No HADOOP_CONF_DIR set, using /usr/lib/hadoop-0.20/src/conf
> > 11/05/09 23:36:01 WARN driver.MahoutDriver: No seq2sparse.props found on classpath, will use command-line arguments only
> > 11/05/09 23:36:01 INFO vectorizer.SparseVectorsFromSequenceFiles: Maximum n-gram size is: 1
> > 11/05/09 23:36:01 INFO vectorizer.SparseVectorsFromSequenceFiles: Minimum LLR value: 1.0
> > 11/05/09 23:36:01 INFO vectorizer.SparseVectorsFromSequenceFiles: Number of reduce tasks: 1
> > 11/05/09 23:36:04 INFO input.FileInputFormat: Total input paths to process : 1
> > 11/05/09 23:36:10 INFO mapred.JobClient: Running job: job_201104300433_126621
> > 11/05/09 23:36:12 INFO mapred.JobClient: map 0% reduce 0%
> > 11/05/09 23:36:47 INFO mapred.JobClient: Task Id : attempt_201104300433_126621_m_000000_0, Status : FAILED
> > 11/05/09 23:37:07 INFO mapred.JobClient: Task Id : attempt_201104300433_126621_m_000000_1, Status : FAILED
> > Error: java.lang.ClassNotFoundException: org.apache.lucene.analysis.Analyzer
> >
> > ----
> >
> > Note I'm not specifying any fancy analyzer. Just trying to run with the
> > defaults. :\
> >
> > -jake
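
The `jar -tf ... | grep` check above can also be scripted. Here is a minimal sketch in Python, assuming the job jar keeps its dependency classes unpacked at the top level, as the grep output suggests. `jar_has_class` is a hypothetical helper name, not part of Mahout or Hadoop.

```python
import sys
import zipfile


def jar_has_class(jar_path, class_name):
    """Return True if the jar has an entry for the fully qualified class name.

    Only direct entries are checked; classes inside nested jars
    (for example under lib/) are not inspected.
    """
    # A jar is a zip archive; a class a.b.C is stored as a/b/C.class.
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()


if __name__ == "__main__":
    # Usage: python check_jar.py <path-to-jar> <fully.qualified.ClassName>
    jar_path, class_name = sys.argv[1], sys.argv[2]
    found = jar_has_class(jar_path, class_name)
    print("found" if found else "missing")
    sys.exit(0 if found else 1)
```

This only confirms what is in the jar on the client side; it says nothing about the classpath the failing map task actually sees.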
