Are the mvn exec commands for running the 20-newsgroups example enough? I haven't used Ant for a while (read: 8 months), and Mahout has shifted to Maven anyway.
So here goes. In the examples directory:

$ tar zxf 20news-18828.tar.gz
$ mkdir 20news-input
$ mvn -e exec:java -Dexec.mainClass=org.apache.mahout.classifier.bayes.PrepareTwentyNewsgroups -Dexec.args="-p 20news-18828 -o 20news-input -a org.apache.lucene.analysis.standard.StandardAnalyzer -c UTF-8"

To train:

$ mvn -e exec:java -Dexec.mainClass=org.apache.mahout.classifier.bayes.TrainClassifier -Dexec.args="-i 20news-input -o 20news-model -type cbayes -ng 1 -source hdfs"

To test:

$ mvn -e exec:java -Dexec.mainClass=org.apache.mahout.classifier.bayes.TestClassifier -Dexec.args="-m 20news-model -d 20news-input -type cbayes -ng 1 -source hdfs -method sequential"

On Sun, Feb 7, 2010 at 2:26 PM, Loek Cleophas <[email protected]> wrote:

> Hi
>
> A few weeks ago, after some toiling, I managed to get the input data for
> the 20 newsgroups example into the format used by the Bayes classifiers in
> Mahout. I did this on the trunk, and remember that it took some tricks in
> particular to get the PrepareTwentyNewsgroups code to run on the expanded
> data and extract/collapse it into the format used by Mahout's Bayes
> classifiers.
>
> For some reason now beyond me, I removed that copy of the trunk with the
> example data. Now, I'm trying to redo the same (albeit this time on release
> 0.2), but am having trouble. I copied the maven/build.xml into
> examples/build.xml according to a September post on the user group (
> http://old.nabble.com/20-newsgroups-example-td25235941.html). That post
> also suggested modifying the file, i.e. taking out the reference
> <classpath refid="maven.test.classpath"/> (which indeed is not recognized
> when I run the extract-20news-18828 ant target), and adding the following
> lines:
>
> <classpath>
>   <path id="lib.path.ref">
>     <fileset dir="target" includes="*.jar"/>
>   </path>
>   <path id="lib.path.ref">
>     <fileset dir="lib" includes="*.jar"/>
>   </path>
> </classpath>
>
> The "target" one makes some sense, but the lib one does not - I don't see
> any lib folder in my mahout-0.2 checkout (even after having done the mvn
> install of core and mvn compile of examples). Can anyone (Robin?) tell me
> what lines to add instead to get the Ant task to work? I know I managed to
> get it working before on my own, but can't remember for the life of me how I
> did it :-\
>
> Regards,
> Loek
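
If it is handy, the prepare/train/test commands from the reply could be chained in one small script. This is only a sketch: run, run_all, PKG and DRY_RUN are names made up here for illustration and are not part of Mahout. The mvn arguments are copied unchanged from the commands in this thread. The one deviation is mkdir -p, used so that a rerun does not fail on an existing directory. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch only: chains the prepare/train/test steps from this thread.
# Run from the Mahout examples directory with 20news-18828.tar.gz present.
# DRY_RUN, PKG, run and run_all are hypothetical names used for illustration.

# Print each command by default; set DRY_RUN=0 to actually execute them.
DRY_RUN="${DRY_RUN:-1}"
PKG=org.apache.mahout.classifier.bayes

# Either echo the command (quoting is not shown in the echo) or run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

run_all() {
  # Unpack the corpus and create the directory for the prepared data.
  run tar zxf 20news-18828.tar.gz || return 1
  run mkdir -p 20news-input || return 1

  # Convert the raw posts into the format the Bayes classifiers read.
  run mvn -e exec:java "-Dexec.mainClass=$PKG.PrepareTwentyNewsgroups" \
    "-Dexec.args=-p 20news-18828 -o 20news-input -a org.apache.lucene.analysis.standard.StandardAnalyzer -c UTF-8" || return 1

  # Train the complementary Bayes model.
  run mvn -e exec:java "-Dexec.mainClass=$PKG.TrainClassifier" \
    "-Dexec.args=-i 20news-input -o 20news-model -type cbayes -ng 1 -source hdfs" || return 1

  # Test the trained model against the prepared input.
  run mvn -e exec:java "-Dexec.mainClass=$PKG.TestClassifier" \
    "-Dexec.args=-m 20news-model -d 20news-input -type cbayes -ng 1 -source hdfs -method sequential" || return 1
}

run_all
```

Saved under any name in the examples directory, running it as is lists the five commands. Running it with DRY_RUN=0 set in the environment executes them and stops at the first step that fails.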
